Something has been on my mind, and I hope
someone can give me more info..
Maybe Svin can give me more insight, since he's posting all those good opcode tutorials..
This is what I am trying to understand better:

While decoding the opcodes (disassembly),
for example in Iczelion's message box example:

//******************** Program Entry Point ********
:00401000 6A00 push 00000000
:00401002 6800304000 push 00403000

* Possible StringData Ref from Data Obj ->"Win32 Assembly is Great!"
:00401007 6819304000 push 00403019
:0040100C 6A00 push 00000000

* Reference To: USER32.MessageBoxA, Ord:01BBh
:0040100E E80D000000 Call 00401020
:00401013 6A00 push 00000000

* Reference To: KERNEL32.ExitProcess, Ord:0075h
:00401015 E800000000 Call 0040101A

* Referenced by a CALL at Address:

* Reference To: KERNEL32.ExitProcess, Ord:0075h
:0040101A FF2500204000 Jmp dword ptr [00402000]

* Reference To: USER32.MessageBoxA, Ord:01BBh
:00401020 FF2508204000 Jmp dword ptr [00402008]

We see the reference to MessageBoxA calling 00401020,
and at 00401020 we get: Jmp dword ptr [00402008]
What I was wondering is how the disasm engine resolves the import from 402008? (What value does it need to look up in the import table?)
This thing won't get out of my head, and I hope someone will clear it up for me.
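
For reference, the two addresses in the listing can be recomputed from the raw opcode bytes. This is only a minimal sketch for 32-bit x86: the helper names are made up for illustration, and it handles just the two encodings that appear above (E8 rel32 and FF 25 disp32).

```python
import struct

def decode_call_rel32(address, code):
    """E8 rel32: target = address of the next instruction + signed displacement."""
    assert code[0] == 0xE8 and len(code) == 5
    (rel,) = struct.unpack("<i", code[1:5])
    return (address + len(code) + rel) & 0xFFFFFFFF

def decode_jmp_indirect(code):
    """FF 25 disp32: return the address of the dword that holds the jump target."""
    assert code[0:2] == b"\xFF\x25" and len(code) == 6
    (slot,) = struct.unpack("<I", code[2:6])
    return slot

# Bytes copied from the listing above.
call_target = decode_call_rel32(0x0040100E, bytes.fromhex("E80D000000"))
iat_slot = decode_jmp_indirect(bytes.fromhex("FF2508204000"))

print(hex(call_target))  # 0x401020
print(hex(iat_slot))     # 0x402008
```

So the call lands on the jmp, and the jmp reads its target from the dword at 00402008. The open question is how that dword gets matched to a name.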
Posted on 2003-02-05 10:21:27 by wizzra
It seems that the Jmp dword ptr [00402008]
points into the FirstThunk array at RVA 2008 (VA 00402008):

================[ IMAGE_IMPORT_DESCRIPTOR ]=============
OriginalFirstThunk = 2054
TimeDateStamp = 0
ForwarderChain = 0
Name = USER32.dll
FirstThunk = 2008

But then, how do we know which DLL we need to resolve?
From the above jmp we go to 2008, but the app can import a lot of functions from user32.dll, so how does the engine know which import to show in the reference? By using the hint?

Hint Function
443 MessageBoxA
Posted on 2003-02-05 10:52:41 by wizzra
So many people have replied to this thread that I am filled with information.. =/
Posted on 2003-02-06 10:42:52 by wizzra
Anything related to disassembly is a topic that many members would rather avoid. Maybe that is why you don't have replies.

To put it briefly, the answer is in the PE file format. Once you study the PE file format, you will know how disassemblers do this. As a preview, just hexdump your .exe and see all those API names.
Posted on 2003-02-06 12:55:53 by Starless
Avoiding disassembly?
Well, I am not making any request for files or anything,
I just want to understand how things work.. this kind of information should be free just like anything else.
I now know how the above example works..
A solution will come shortly.
Posted on 2003-02-06 13:19:24 by wizzra
I think Iczelion's tutorials on the PE format will explain a lot. You might also read some PE documentation. In this specific example, this is what happens:

The PE header contains a data directory (hiew has some nice options to view them). Each member of that directory identifies a specific type of data (exports, imports, resources, relocations, etc.). In this case, the import directory is at RVA 2010h and has a size of 3Ch bytes. RVA 2010h is at file offset 610h. At this position is an array of IMAGE_IMPORT_DESCRIPTOR structs. Each element of this array describes the imports for one DLL. One of the members of this struct (name1, +12 dec from the start) is an RVA to the name of the DLL. The first element in Iczelion's program has 206Ah (offset 66Ah) as name1, and at that offset is the string "KERNEL32.dll". The second element has RVA 2086h (offset 686h) as name1, pointing to "USER32.dll".
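
A minimal sketch of the RVA-to-offset step and of unpacking one descriptor. The section mapping (RVA 2000h at file offset 600h) is an assumption inferred from the RVA/offset pairs above; a real tool would read it from the section table. The descriptor bytes are built by hand from the USER32 values posted earlier in the thread, and the function names are hypothetical.

```python
import struct

# Assumed mapping for the section that holds the imports (inferred, not read
# from a real section table).
SECTION_RVA = 0x2000
SECTION_RAW = 0x600

def rva_to_offset(rva):
    """Translate an RVA inside the assumed section to a file offset."""
    return rva - SECTION_RVA + SECTION_RAW

def parse_descriptor(data, offset):
    """Unpack one 20-byte IMAGE_IMPORT_DESCRIPTOR (five little-endian DWORDs)."""
    oft, stamp, chain, name, ft = struct.unpack_from("<5I", data, offset)
    return {
        "OriginalFirstThunk": oft,
        "TimeDateStamp": stamp,
        "ForwarderChain": chain,
        "Name": name,            # at +12 from the start
        "FirstThunk": ft,        # at +16 from the start
    }

# Hand-built descriptor for USER32.dll using the values quoted in the thread.
raw = struct.pack("<5I", 0x2054, 0, 0, 0x2086, 0x2008)
desc = parse_descriptor(raw, 0)

print(hex(rva_to_offset(0x2010)))        # 0x610
print(hex(rva_to_offset(desc["Name"])))  # 0x686
```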
The first member of this structure (originalFirstThunk) contains a pointer to an IMAGE_THUNK_DATA array (at RVA 204Ch (offset 64Ch) here for the first one (kernel32)), terminated by a NULL dword. Each IMAGE_THUNK_DATA struct is a DWORD union that contains an RVA pointing to an IMAGE_IMPORT_BY_NAME structure. In Iczelion's program, there's only one IMAGE_THUNK_DATA element for kernel32, namely 205Ch (=RVA, offset 65Ch). If you look at that offset, you will find an IMAGE_IMPORT_BY_NAME structure, which consists of a hint (16-bit, 75h in this case), followed by a zero-terminated string identifying the imported function (ExitProcess).
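
Walking that array can be sketched as follows. The buffer is a synthetic stand-in for the file, filled only with the kernel32 values mentioned above; the section mapping is again an assumption, and the function names are invented for the example. The high-bit check covers imports by ordinal, which is the other arm of the union.

```python
import struct

SECTION_RVA, SECTION_RAW = 0x2000, 0x600   # assumed section mapping

def rva_to_offset(rva):
    return rva - SECTION_RVA + SECTION_RAW

def read_import_by_name(data, rva):
    """IMAGE_IMPORT_BY_NAME: 16-bit hint followed by a zero-terminated name."""
    off = rva_to_offset(rva)
    (hint,) = struct.unpack_from("<H", data, off)
    end = data.index(b"\x00", off + 2)
    return hint, bytes(data[off + 2:end]).decode("ascii")

def walk_thunks(data, thunk_rva):
    """Follow an IMAGE_THUNK_DATA array until the terminating NULL dword."""
    entries = []
    off = rva_to_offset(thunk_rva)
    while True:
        (thunk,) = struct.unpack_from("<I", data, off)
        if thunk == 0:
            break
        if thunk & 0x80000000:                 # import by ordinal, no name
            entries.append((thunk & 0xFFFF, None))
        else:                                  # RVA of IMAGE_IMPORT_BY_NAME
            entries.append(read_import_by_name(data, thunk))
        off += 4
    return entries

# Synthetic file image holding only the kernel32 pieces described above.
image = bytearray(0x700)
struct.pack_into("<II", image, 0x64C, 0x205C, 0)   # thunk array + terminator
image[0x65C:0x65E] = struct.pack("<H", 0x75)       # hint
image[0x65E:0x65E + 12] = b"ExitProcess\x00"       # name

print(walk_thunks(image, 0x204C))  # [(117, 'ExitProcess')]
```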

Now the final step is: how to get from the called address (dword ptr [402000]) to the right import? Back to the IMAGE_IMPORT_DESCRIPTOR (at RVA 2010h/offset 610h). This structure has another member called FirstThunk (+16dec from the start), very similar to OriginalFirstThunk. This member also contains an RVA to an array of IMAGE_THUNK_DATAs, but when the image is loaded, this array is replaced by the actual pointers to the loaded DLL's functions. The ExitProcess function was the first one in the array pointed to by OriginalFirstThunk, so it will also be the first one in the array pointed to by firstThunk. FirstThunk contains the RVA 2000h (=offset 600h), which is the array. The first index is of course at the same address so that explains why a call to [402000h] will call ExitProcess.
You can do a similar thing for user32.dll, where firstThunk is RVA 2008h (offset 608h). MessageBoxA is the only import for user32.dll, so it will be at the first index as well (RVA 2008h, that is). That's why the call to [402008h] is a call to MessageBoxA.
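
Putting the steps together, a disassembler could build a table from each FirstThunk slot address to a (DLL, function) pair before it starts decoding code. The sketch below does that over a hand-built buffer. Most RVAs come from this thread; the image base 400000h, the section mapping, and RVA 2078h for MessageBoxA's IMAGE_IMPORT_BY_NAME are assumptions made to get a complete example, and all function names are hypothetical.

```python
import struct

IMAGE_BASE = 0x400000                      # assumed image base
SECTION_RVA, SECTION_RAW = 0x2000, 0x600   # assumed section mapping

def off(rva):
    return rva - SECTION_RVA + SECTION_RAW

def cstring(data, rva):
    start = off(rva)
    return bytes(data[start:data.index(b"\x00", start)]).decode("ascii")

def build_iat_map(data, import_dir_rva):
    """Map the VA of every FirstThunk slot to (dll, function)."""
    iat = {}
    desc_off = off(import_dir_rva)
    while True:
        oft, _, _, name_rva, ft = struct.unpack_from("<5I", data, desc_off)
        if name_rva == 0 and ft == 0:      # all-zero descriptor ends the array
            break
        dll = cstring(data, name_rva)
        lookup = oft or ft                 # fall back if OriginalFirstThunk is 0
        index = 0
        while True:
            (thunk,) = struct.unpack_from("<I", data, off(lookup) + 4 * index)
            if thunk == 0:
                break
            if thunk & 0x80000000:
                func = "ordinal %d" % (thunk & 0xFFFF)
            else:
                func = cstring(data, thunk + 2)   # skip the 16-bit hint
            # Same index in FirstThunk as in OriginalFirstThunk.
            iat[IMAGE_BASE + ft + 4 * index] = (dll, func)
            index += 1
        desc_off += 20
    return iat

image = bytearray(0x700)

def put(rva, blob):
    image[off(rva):off(rva) + len(blob)] = blob

put(0x2000, struct.pack("<2I", 0x205C, 0))                    # kernel32 FirstThunk
put(0x2008, struct.pack("<2I", 0x2078, 0))                    # user32 FirstThunk
put(0x2010, struct.pack("<5I", 0x204C, 0, 0, 0x206A, 0x2000)) # kernel32 descriptor
put(0x2024, struct.pack("<5I", 0x2054, 0, 0, 0x2086, 0x2008)) # user32 descriptor
put(0x204C, struct.pack("<2I", 0x205C, 0))                    # kernel32 OriginalFirstThunk
put(0x2054, struct.pack("<2I", 0x2078, 0))                    # user32 OriginalFirstThunk
put(0x205C, struct.pack("<H", 0x75) + b"ExitProcess\x00")
put(0x206A, b"KERNEL32.dll\x00")
put(0x2078, struct.pack("<H", 0x1BB) + b"MessageBoxA\x00")    # RVA is assumed
put(0x2086, b"USER32.dll\x00")

iat = build_iat_map(image, 0x2010)
print(iat[0x402000])  # ('KERNEL32.dll', 'ExitProcess')
print(iat[0x402008])  # ('USER32.dll', 'MessageBoxA')
```

With such a table in hand, annotating Jmp dword ptr [00402008] is a single dictionary lookup on the operand.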

Posted on 2003-02-06 13:40:29 by Thomas
I'm not sure I understand the question, sorry.
But if it was about how a disassembler knows that some
particular call is a call to a particular API procedure: it's known
from the import section of the disassembled program.
The import section contains the names of the exporting files (DLLs) it needs
along with (usually but not always) the names of the procs it needs from those files, and places for
the addresses of those procs inside the import section (the addresses
are filled in when the app starts).
When the app calls through the dwords in those places, the disassembler\debugger knows that the app is calling an
imported function.
There are two common ways (there are others as well) to
call the values in the places for addresses of imported functions.
Say our import section specifies:
1. I want the system loader to load User32.dll
2. I want the "MessageBoxA" function from this User32.dll
3. The place to put the address of the function is 403200h

The app may call it directly:
push a
push b
push c
push d
call dword ptr [403200h]
The disassembler knows that 403200h is the address of
the dword which should contain the address of the MessageBoxA
function (it'll be placed there by the system loader),
and often the disassembler will show you that it is a call
to MessageBoxA.

Another way to call it:

call someaddress

which also ends up calling MessageBoxA (the call goes to a
label which jumps through the dword which contains the address of
the MessageBoxA function)
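
Both ways can be recognised from the opcode bytes. The following is only a rough sketch for 32-bit code: the IAT table is a hypothetical one-entry dictionary using the 403200h slot from the example above, the code bytes are made up, and only the encodings FF 15 (call dword ptr [addr]), E8 (call rel32) and FF 25 (jmp dword ptr [addr]) are handled.

```python
import struct

# Hypothetical table built from the import section beforehand.
IAT = {0x403200: "USER32.MessageBoxA"}

def resolve_call(code, base, addr):
    """Return the imported name reached by the call at `addr`, or None.

    code: bytes of the code section, loaded at address `base`.
    """
    i = addr - base
    if code[i:i + 2] == b"\xFF\x15":            # way 1: call dword ptr [slot]
        (slot,) = struct.unpack_from("<I", code, i + 2)
        return IAT.get(slot)
    if code[i] == 0xE8:                          # way 2: call to a jmp stub
        (rel,) = struct.unpack_from("<i", code, i + 1)
        t = addr + 5 + rel - base
        if 0 <= t < len(code) and code[t:t + 2] == b"\xFF\x25":
            (slot,) = struct.unpack_from("<I", code, t + 2)
            return IAT.get(slot)
    return None

BASE = 0x401000
code = bytes.fromhex(
    "FF1500324000"   # 401000: call dword ptr [403200]
    "E801000000"     # 401006: call 40100C
    "C3"             # 40100B: ret
    "FF2500324000"   # 40100C: jmp dword ptr [403200]
)

print(resolve_call(code, BASE, 0x401000))  # USER32.MessageBoxA
print(resolve_call(code, BASE, 0x401006))  # USER32.MessageBoxA
```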

There are Iczelion tuts about PE which explain it in detail.
Posted on 2003-02-06 14:01:34 by The Svin