I've been going through the PE file format documentation, and it struck me that a PE executable should not require a relocation table. If the loader can fix up relocations, can't one write a utility that does them in the file, and then discards the relocation section? From Vancouver, Larry
The linker assumes that the image will be loaded at a certain base address (like 0x00400000 in Win98). If the loader loads the file at this address, no relocation are needed and the relocation section is ignored. But the loader can decide to load the image at a different base address. In this case the relocation table is used to fix the offsets in the file.
EXE PE do not need any Relocation in any case because the default adress is not real but virtual. SpAsm produce PE have no relocation and they work as well, even with multi- instances of the PE runing. DLL PE need a relocation for data because, for several EXE make call to the same dll and that this dll stores some data for each PE, an image of these data are wished for each client PE. betov.
I don't understand what you mean by virtual address instead of real address. Here is an extract from the PE article in the MSDN :
The .reloc section holds a table of base relocations. A base relocation is an adjustment to an instruction or initialized variable value that's needed if the loader couldn't load the file where the linker assumed it would. If the loader is able to load the image at the linker's preferred base address, the loader completely ignores the relocation information in this section. If you want to take a chance and hope that the loader can always load the image at the assumed base address, you can tell the linker to strip this information with the /FIXED option. While this may save space in the executable file, it may cause the executable not to work on other Win32-based implementations. For example, say you built an EXE for Windows NT and based the EXE at 0x10000. If you told the linker to strip the relocations, the EXE wouldn't run under Windows 95, where the address 0x10000 is already in use.It seems that the relocation table must always be present. I have never used SpAsm. Could you give more explanations please ?
Larry, For PE reference, get these two articles. http://www.pbq.com.au/home/hutch/pefile.htm An EXE file is usually loaded at 400000h so it does not need relocation but a DLL can be loaded at different addresses depending on wha address range is available, thats why it requires the relocation data present. Most EXE packers remove the relocation data from EXE file but leave it there in a DLL. Regards, email@example.com
Karim, The Adresses of code you can see inside a runing PE are not the physical adresses. They are virtual (done by 'Paging'.) Let's say you have a PE without relocation section. and this PE inputs a value from user and writes it in a variable named 'Var1'. You can run this PE twice and input in first image let's say 10 and in the second image, 20,... everything's OK because the physical adresses of Var1 in image1 and of Var1 in Image2 are not at the same physical adresses. betov.
Karim, You need to understand couple of basic terms: - Base of image. - RVA - Mapping memory into current process addess space. Each process's virtual address space is split into partitions. The address space is partitioned based on the underlying implementation of the operating system. Partitions vary slightly among the different Windows kernels. ------------ It sound at least unexplicit 'An EXE file is usually loaded at 400000h so it does not need relocation'. Image base is a value in PE file, it put there by link.exe. If it's not specified by /BASE option it'll be 40 0000h for exe file. It can be change with the /BASE opion OS loader loads exe image not 'usually to 400000h' it loads it to virt. address specified by the Image Base value which is usually (by default) 400000h. And if only the address is not available it needs relocate it. The simple question is - how address may come unavaileble? There are two possible and very simple reasons. 1. The Image Base pointed to address range that not to be used for user code. Here is place I need to put some table shown such ranges and their purpose for NT and 9x. --------- to Hero: Is it possible to draw tables in your message board? --------- 2. The second reason is right for DLLs, they also has the ImageBase value but they a mapped into runtime into process virtual space so that if some of them has the same VA and both called by the process the only one of dll with the same VA may be mapped into process memory on predefined addresses, the rest with the same address need to be relocated.
OK ! So for an executable, if the image base is in a valid address range for the OS (like 0-2GB under NT), the loader can always use the preferred base address and doesn't need a relocation table. Thanks to all of you for your explanations This message was edited by karim, on 5/12/2001 1:32:59 PM
-------Quote-------------------------- OK ! So for an executable, if the image base is in a valid address range for the OS (like 0-2GB under NT), the loader can always use the preferred base address and doesn't need a relocation table --------------------------------------------- Not exactly. Talking of NT as 32bit NT 5 (without /3GB key) I can put here address partions table (for 9x it'll be slightly different) Partion AddrRange NULL-Pointer 0x00000000h Assignment 0x0000FFFFh -------------------------------------------------- UserMode 0x0001000h 0x7FFEFFFFh -------------------------------------------------- 64kb 0x7FFF0000h OFF - limit 0x7FFFFFFFh -------------------------------------------------- KernelMode 0x80000000h 0xFFFFFFFFh -------------------------------------------------- So for NT5 32 without /3GB for exe you can use only memory range 0x0001000h 0x7FFEFFFFh to be sure that it would be loaded at the ImageBase value address. The Svin
ok it's clear now thanks a lot !
Thanks guys...but... I've written PE's using my own Rube Goldberg linker and they have no reloc table and yet they _are_ relocatable, i.e. they will load anywhere. (I give them an arbritrary image base of 1000000h, and my machine doesn't have that much memory, but they work.) I still think the reloc table is just an artifact of less-than-perfect linking. Off the topic, try mov ax,ds: in a Windows asm program. Sure 'nuff, you get "MZ" in ax! Larry
-----------Quote--------------------------- (I give them an arbritrary image base of 1000000h, and my machine doesn't have that much memory, but they work.) ------------------------------------------------ What a difference how many phisical memory do you have? It's not a phisical address it's virtual one. Topic: Use a debugger a trace to some kernel mode API function - you'll see some adresses of instructions that more than 10 times bigger than your phisical memory amount. And in user mode Win32 mov ax,ds will give you not MZ but access violation. And I have no idea what you call 'relocatable'? And what your experiment with 10000000h image base proved. Man, you need to feel difference between virtual and real addressing. Relocation is not about moving form one phisical address to another but from one virtual adress to another. Talking of DLL in most cases you not even load it to your process, you just mapping it in your process memory, if it's API DLL most likely it's already loaded. The Svin.
Sorry, Svin, I was in a rush and I should have been clearer. A PE may have no relocation table, that is, just a pair or dword zeros in the list of data directories: Export Table Import Table ... Base Relocation Table ;also called Fixup Table ... and yet the program will always run whether or not its preferred base address is available. I guess that means it's "relocatable". I'm on a Win95 station (NT may be different) and I have no problem reading ds:, or any other part of of the PE headers or the code section. This works at run time (user mode?) as well as under a debugger. It seems that in Win95 all sections are readable, irrespective of the bit "readable" in the section table. I just saw a posting from Bogdan (on self-modifying code) that says code sections are writeable also, but I don't yet see how; I've been getting "invalid page fault" when I try it. I know a bit about virtual addresses, descriptors, selectors, etc. from the CPU manuals, and I thought that if the image base is huge but the program runs, then the OS must be able to load it anywhere; I could be wrong; I'll look into it.
I have no problem reading ds:, or any other part of of the PE headers or the code section. This works at run time (user mode?) as well as under a debugger.It's getting worse :) How could be ds: part of PE file? Are you writing 16 bit apps or 32 bit? If 32 then start of your PE file image(MZ) will be at ds: usualy by default MZ (start of PE) will be at ds:[400000h] if you changed your image base to 10000 0000h then MZ will be at ds:[1 0000 0000h] In both OS 0 - 0FFFFh virtual adress range for indicating 0 pointers and rise exeption. It's done especially for programmers BTW. In code like invoke GlobalLock,hObj... mov edx, in case when the API fails return value will be 0 and the try to get data from it shall rise exeption. It makes it esier to find code to replace it with invoke GlobalLock,hObj... or eax,eax je error mov edx, .... error:
:) getting worse, but interesting, at least. My asm source (for my own linker) is so unreadable that sometimes I myself can't follow it. So I will assemble a demo applet that you can look at under a debugger. It should be up later today (Monday) on www.hammick.com; click the link about Win32 programming. Bogdan has certainly got my attention with his remark on selectors 30 and 31. It sounds like a massive hole in security, but that's another story. Larry
It might just be me who's thick, but... how can selector 30 and 31 impose a problem if paging is utilized correctly (which, of course, it isn't...) -- what makes selector 30 and 31 so special? Guess I'll have to go home and load softice and have a peek :).
Larry 1. It wouldn't run in NT 2. To clarify subj of what I meant - Your PE has value of ImageBase - This is enough to load EXE at the virtual address pointed by ImageBase, if (and only if) it is in valid memory range of address range for user code. 1 0000 0000h IS valid. - ds wouldn't work not 'cause PE not readble but 'cause 0000 0000 0000 0000h is in closed range of virt. addr. PE is readable for your process so ds: is OK. PE is loading at the start of imagebase so MZ would be at the address. 3. 01001209 fn_01001209: 01001209 240F and al,0Fh 0100120B 0490 add al,90h ;A neat trick I learned from 0100120D 27 daa ;Intel documentation. 0100120E 1440 adc al,40h 01001210 27 daa ----------- change it for better code: and al, 00001111b ; mask out high nibble cmp al,10 sbb al,69h das
Larry, under win2000 zoomin.exe is 'not a win executable'
Larry, and generic.exe is olso not a win executable under win2000
Thanks, folks. I guess MS has improved it's protection so the PE headers are no longer readable. Surprised about Zoomin. I'll pull it down and try to fix it. Thanks again, Larry