I know there are numerous texts on the PE file format, but I can't keep my head from spinning a bit. =) What confuses me the most is this: when the program is loaded into memory, how are the memory addresses calculated? For example, the operand of mov eax,offset SomeString couldn't be properly calculated ahead of time, because there's no way to know where SomeString will be in memory when the program is loaded. I would've thought to do a simple equation of RVA - Base + Base Offset, but which base do I use? The code section's, or the data section's? And is there any way to know for sure? Sorry if this is too easy a question, but I'm just totally lost! Thanks!
Posted on 2000-12-04 20:56:00 by Racso
Racso, Normally string data is in the .DATA section, but you can code string data into the .CODE section as well with a technique as simple as:

    jmp lbl
    MyString db "This is my string",0
    lbl:

Then reference the string in the normal manner. If you open the compiled EXE file in a hex editor, you can easily find both the .DATA section and any string that you may have put in the code section. Assemblers/compilers do the conversion of the name to an ADDRESS, so that when your code refers to "MyString" it has the ADDRESS of WHERE it is in the file. Regards, hutch@pbq.com.au
Posted on 2000-12-05 05:20:00 by hutch--
I know how the assembler obtains addresses for data. And yes, I can find the data quite easily, but I'm not going to be the one doing the finding. =) And since I can't load the program to 400000h (believe me, I tried), I would like to know how to find the actual RVAs of the data. (With the RVAs, it wouldn't matter if my data was in the CODE or DATA sections then.) Also, for some reason, the base relocation tables don't seem to exist in the majority of compiled Windows programs. (Or at least I think that they don't exist, because their RVA and Virtual Size are both 0).
Posted on 2000-12-05 05:51:00 by Racso
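One way to check whether an image really lacks base relocations is to read the headers directly. The following is only a minimal sketch, assuming a well-formed PE32 or PE32+ image already read into memory; the function name `reloc_info` is hypothetical. It reports the preferred image base, whether the IMAGE_FILE_RELOCS_STRIPPED flag is set in the COFF header, and the RVA and size of data directory entry 5 (the base relocation table).

```python
import struct

IMAGE_FILE_RELOCS_STRIPPED = 0x0001      # COFF Characteristics flag
IMAGE_DIRECTORY_ENTRY_BASERELOC = 5      # index into the data directory array


def reloc_info(image: bytes):
    """Return (image_base, relocs_stripped, reloc_rva, reloc_size).

    Hypothetical helper for illustration; expects the raw bytes of a PE file.
    """
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")

    coff = e_lfanew + 4                  # 20-byte COFF file header
    (characteristics,) = struct.unpack_from("<H", image, coff + 18)

    opt = coff + 20                      # optional header follows the COFF header
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:                   # PE32
        (image_base,) = struct.unpack_from("<I", image, opt + 28)
        count_off, dirs_off = opt + 92, opt + 96
    elif magic == 0x20B:                 # PE32+
        (image_base,) = struct.unpack_from("<Q", image, opt + 24)
        count_off, dirs_off = opt + 108, opt + 112
    else:
        raise ValueError("unknown optional header magic")

    (count,) = struct.unpack_from("<I", image, count_off)
    rva = size = 0
    if count > IMAGE_DIRECTORY_ENTRY_BASERELOC:
        # Each data directory entry is an (RVA, Size) pair of 32-bit values.
        entry = dirs_off + 8 * IMAGE_DIRECTORY_ENTRY_BASERELOC
        rva, size = struct.unpack_from("<II", image, entry)
    return image_base, bool(characteristics & IMAGE_FILE_RELOCS_STRIPPED), rva, size
```

A result with the stripped flag set and both the RVA and size equal to 0 would match the observation above: the linker emitted no relocation data for that image.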
All RVAs in a PE file are relative to the module base address, i.e. the module load address. Each module can have only one load address, so it doesn't matter which section holds the data or code that needs relocation. Most EXEs don't contain relocation information because the first module loaded into a newly created process address space is the main EXE. Thus it has the first shot at any address within the private address space. That means that IF the preferred load address is within the valid range, e.g. between 4 MB and 2 GB, the EXE will never need to be relocated. The situation with a DLL is different. When a DLL is loaded, its preferred address may already be occupied by another module. Thus a DLL requires relocation information.
Posted on 2000-12-05 22:39:00 by Iczelion
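The arithmetic described above can be sketched in a few lines. This is an illustration only, assuming 32-bit absolute addresses; the names `rva_to_va` and `rebase_pointer` are hypothetical. An RVA becomes a virtual address by adding the actual load base, and an absolute address that the linker computed against the preferred base is corrected by adding the difference between the actual and preferred bases.

```python
import struct


def rva_to_va(load_base: int, rva: int) -> int:
    """Virtual address = module load base + RVA, regardless of section."""
    return load_base + rva


def rebase_pointer(code: bytearray, offset: int,
                   preferred_base: int, actual_base: int) -> int:
    """Patch the 32-bit absolute address stored at `offset` in place.

    The stored value was computed for `preferred_base`; adding the load
    delta makes it valid for `actual_base`. Returns the patched value.
    """
    delta = actual_base - preferred_base
    (old,) = struct.unpack_from("<I", code, offset)
    new = (old + delta) & 0xFFFFFFFF     # wrap to 32 bits
    struct.pack_into("<I", code, offset, new)
    return new


# mov eax, offset SomeString encodes as B8 followed by a 32-bit immediate.
# Assumed values: preferred base 0x400000, SomeString at RVA 0x3000.
code = bytearray(b"\xB8\x00\x30\x40\x00")
patched = rebase_pointer(code, 1, 0x400000, 0x10000000)
```

If the module loads at its preferred base, the delta is zero and nothing changes, which is why an EXE that always gets its preferred address can ship without relocation data.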
Yeah, I read that in Microsoft's documentation about an hour ago. Oh well, I figured out how to do 'relocation' without having a relocation table, but thanks for everyone's help.
Posted on 2000-12-06 00:33:00 by Racso