My original background is Motorola, and has been since the '70s. I wrote assembler code for a variety of ancient platforms and some modern ones as well. It took 10 minutes to learn the 68xxx Motorola processor instruction set (funny story there) and about 4 minutes to read up on the Intel set, which varies some. The Intel instruction set is slightly different and offers more than just 32-bit registers, which I'm somewhat not used to, but they're easily understandable.

One thing I'm not used to (so to speak) is macros. I've never used them (yes, hard coding actually keeps you sharp), so when I did write some MASM-based code, I was able to use includes. What I'm writing now is dynamically relocatable duplicates of Motorola code I wrote (which is all dynamically relocatable, including the data), and I have gotten the Intel counterparts done, except for the Windows interface. I've scoured the 'net looking for references as to how to reference API calls directly, rather than let the compiler do it, so I can place instructions inside a block of code and, when I move it, it will work anywhere (I've seen MASM toss Calls to a Jmp outside of the normal code segment). I'm fairly certain Iczelion has info on this (along with others). Though Iczelion may be judging a Gainax contest, so he may be blinded when he reads this (or have a big smile on his face).

Coming into the Intel chipset (somewhat for the second time in the past 10 years), I understand the segmentation that Intel uses, though I don't understand the Windows inclusion into the VM. (Which also leads me to wonder where Windows IS in the grand scheme of the machine.)

I've already been writing working code: a routine to split up ip/dns entries, reformat them, and remove redundancies (I'll optimize it more, as I want it as fast as possible, because it'll be system hooked and I don't want any slowdowns). I've looked at a variety of RAD environments (WinAsm seems the most promising of the group) and I'm still looking for a design tool for windows, controls, etc.

What I'm looking for is the references that MASM uses for the indirect jumps for API calls, so I can do the evil myself. :grin:
Posted on 2003-07-30 11:04:26 by FunkyMeister
Moved to Heap.

(I can be cruel, yes :grin: )
Posted on 2003-07-30 11:30:24 by bazik
Hutch has included a utility in the MASM32 package, called L2EXTIA, that builds INCLUDE files from LIB files, which eliminates the API jump table. It works with the SDK lib files, and I've used it with Visual Studio 6 LIB files with no problems so far.

:)
Posted on 2003-07-30 12:02:23 by S/390

I've tried that, but I'm uncertain how that helps, since there are no base references for the tables. I.e., how do I call that in memory directly? I have 'Inc'ed all the libs already, but wasn't sure how to get the actual address values that MASM uses for the Calls.
Posted on 2003-07-30 12:52:43 by FunkyMeister

I could say that was a Heapload of help, but... I typed it instead. :grin:
Posted on 2003-07-30 12:54:31 by FunkyMeister

I've scoured the 'net looking for references as to how to reference API calls directly, rather than let the compiler do it, so I can place instructions inside a block of code and when I move it, it will work anywhere (as I've seen MASM toss Calls to a Jmp outside of the normal code segment). I'm fairly certain Iczelion has info on this (along with others).


I use VSNET as my IDE. VSNET defaults to /INCREMENTAL linking. The /INCREMENTAL linker flag will add the jump table. In fact, it will still add the jumps to the exe even if you use a direct call. Those jumps will just never be reached.


I have been slowly remaking the include files that come with MASM32 so that all API calls are indirect calls (no jump table). From memory, I believe this is what I do (kernel32.inc):


externdef _imp__GetModuleHandleA@4:NEAR
GetModuleHandle TEXTEQU _imp__GetModuleHandleA@4

Now you should be able to get indirect calls: a call through a pointer to the proc entry.
I'll check when I get home whether I remember right (in about 11 hrs).
Posted on 2003-07-30 21:38:01 by ThoughtCriminal
Funky,

The IMPORT entries, with or without the jump table, are resolved by the "Windows Loader" (for lack of a better term). When Win loads your program, it replaces these items with the address of the DLL routines that are already loaded and known by the OS.

So my guess would be, even if you do "relocate" your code, the IMPORT address entries wouldn't change...

:) :)
Posted on 2003-07-31 00:31:31 by S/390



So the external def above would then resolve so I could:

Jmp (GetModuleHandle)

And that would contain the "magic cookie" (for a lack of a better term) that the PE loader looks for and changes to the proper GetModuleHandleA address. If so, that's extremely useful. :) :alright:
Posted on 2003-07-31 07:41:09 by FunkyMeister

I think Thought is on the right track. The method is somewhat new to me; I do understand the external define, though I'm not quite sure what value is coming back till I plunk that into some code to see if I BSOD. :)
Posted on 2003-07-31 07:43:24 by FunkyMeister

I agree S/390, the actual addresses in the jump table are resolved by the PE loader at run-time. Neither LINK nor ML can know at build time where any procedure will be located. Given that different versions of the DLL may have different addresses for each procedure, and that the DLL may be loaded at a different location at runtime, it must be dynamically linked. You would have to scan backwards for the entry point for Kernel32 and use LoadLibrary and GetProcAddress to build your own API calls at runtime, but you would be limited to push/push/call. There are examples on Iczelion's page of how to do this.
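
A minimal sketch of the LoadLibrary/GetProcAddress route described in that quote, under some loud assumptions: the label and string names (szUser32, pMessageBoxA and so on) are made up for illustration, the bare includelib name assumes the LIB path is already set up, and LoadLibraryA and GetProcAddress themselves are still reached through the normal import table (the "scan backwards for Kernel32" step is not shown). It only shows the shape of the push/push/call style:

```asm
.386
.model flat, stdcall
option casemap:none

includelib kernel32.lib                 ; assumption: LIB path is already configured

; Import slots declared as DWORD data so calls go through the pointer.
externdef _imp__LoadLibraryA@4:DWORD
externdef _imp__GetProcAddress@8:DWORD
externdef _imp__ExitProcess@4:DWORD

.data
szUser32     db "user32.dll",0          ; hypothetical names, for illustration only
szMsgBoxA    db "MessageBoxA",0
szText       db "Resolved at run time",0
szTitle      db "Demo",0
pMessageBoxA dd 0                       ; filled in by GetProcAddress

.code
start:
    push offset szUser32
    call dword ptr [_imp__LoadLibraryA@4]    ; eax = module handle, 0 on failure
    test eax, eax
    jz   done

    push offset szMsgBoxA                    ; lpProcName
    push eax                                 ; hModule
    call dword ptr [_imp__GetProcAddress@8]  ; eax = address of MessageBoxA, 0 on failure
    test eax, eax
    jz   done
    mov  pMessageBoxA, eax

    push 0                                   ; uType = MB_OK
    push offset szTitle
    push offset szText
    push 0                                   ; hWnd = NULL
    call dword ptr [pMessageBoxA]            ; indirect call through our own pointer
done:
    push 0
    call dword ptr [_imp__ExitProcess@4]
end start
```

The pointer in pMessageBoxA plays the same role as an import table slot, except that the program fills it in itself instead of the loader.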


Actually, I don't really need to know where the address is either. I just want that jmp (...) to be included IN my code block, as opposed to being tagged on at the end and reached from outside of it through an indirect jump. I want to replicate what the assembler does, but include it in my code and merely call the label where it is to get the function. I.e.:

GetMod: Jmp(GetModuleHandle) ; Offset "cookie" that the PE converts at load-time?
Posted on 2003-07-31 07:49:04 by FunkyMeister
Heya, didn't see this thread until now. You want to relocate parts of your code at runtime? Mmmkay. I assume you already know that CALL and JMP/Jcc are eip-relative, and will only work within the code blocks you relocate, unless you relocate everything by the same amount - which is why you're also having problems with the API calls.

The IAT (Import Address Table) has already been hinted at, along with the "dummy thunks" (ie, CALL j_MessageBoxA stuff). What you need is a way of doing "call dword ptr [...]" instead of "call MessageBox". The reason you need this is that indirect calls/jumps on x86 have the non-relative address in the instruction encoding, instead of the relative one that a normal call/jcc has.

Normal import libraries define a few symbols per import. You get stuff like MessageBoxA@16 and imp__MessageBoxA@16 - and as you might or might not already know, many APIs have both narrow/Ansi and unicode/Wide forms - it's the assembler (include files) that handle the "friendly name mapping" of MessageBox -> MessageBoxA or MessageBoxW - and, in the case of masm, the @NumberOfBytes is handled by masm, according to the PROTO.

So... you need to have MessageBox textequ'ed to imp__MessageBoxA@16 instead of MessageBoxA (which is protoed), and you need to set up some externdefs for the imp_* DWORDs. I believe the aforementioned tool of hutch's might do that - so this post was mainly to shed a little light (I hope) on how/what/why.
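
As a rough sketch of what that can look like without any include files: the import slot is declared as a DWORD with externdef, and the call goes through it. The string labels and the bare includelib names are assumptions (adjust paths to wherever your LIB files live). As far as I know, under .model flat, stdcall MASM prepends one underscore to the externdef name, which is why the source spells it _imp__MessageBoxA@16 with a single leading underscore.

```asm
.386
.model flat, stdcall
option casemap:none

includelib kernel32.lib                 ; assumption: LIB path is already configured
includelib user32.lib

; The import table slots are plain DWORDs that the loader fills in.
externdef _imp__MessageBoxA@16:DWORD
externdef _imp__ExitProcess@4:DWORD

; Friendly names, mapped straight to the slots (no jump thunk involved).
MessageBox  TEXTEQU <_imp__MessageBoxA@16>
ExitProcess TEXTEQU <_imp__ExitProcess@4>

.data
szTitle db "Demo",0                     ; hypothetical strings, for illustration
szText  db "Called through the import table",0

.code
start:
    push 0                              ; uType = MB_OK
    push offset szTitle
    push offset szText
    push 0                              ; hWnd = NULL
    call dword ptr [MessageBox]         ; FF 15 <address of the slot>
    push 0
    call dword ptr [ExitProcess]
end start
```

Because the instruction carries the absolute address of the slot, the block containing it can be copied elsewhere in the same process and the call still lands in the right place.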

Feel free to ask if there's anything that didn't come out clearly enough, this heat and humidity and lack of sleep is killing me.
Posted on 2003-07-31 10:49:18 by f0dder
Finally home :)

There are 2 types of calls used in Win32 asm: direct and indirect, often referred to on this board as E8 (direct) or FF (indirect), where E8 and FF are the first hex byte of the opcode.

E8 works relative to EIP (the address of the instruction after the call):

E8 +/- byte distance to the proc being called

FF reg/mem The reg or memory contains a pointer to the proc entry.

E8 is hard, or maybe impossible, to get to work with DLLs (maybe libs) at compile time. I'm no expert on using this form. The jump table uses this form.

E8 +jump table offset where there is a jmp to the address of the proc.
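
To make the two forms concrete, here is a small sketch with the encodings written out in the comments. LocalProc is a made-up name, and the bare includelib name assumes the LIB path is already set up; the byte values follow from the x86 encoding of CALL (E8 rel32, and FF /2 for the register/memory forms).

```asm
.386
.model flat, stdcall
option casemap:none

includelib kernel32.lib                 ; assumption: LIB path is already configured
externdef _imp__ExitProcess@4:DWORD

.code
LocalProc proc                          ; hypothetical helper, does nothing
    ret                                 ; C3
LocalProc endp

start:
    call LocalProc                      ; E8 rel32   rel32 = target - address of next instruction
    mov  eax, offset LocalProc
    call eax                            ; FF D0      target taken from a register
    push 0
    call dword ptr [_imp__ExitProcess@4] ; FF 15 abs32   target read from the dword at abs32
end start
```

The E8 form breaks if only the caller is moved, because the distance to the target changes; the two FF forms do not care where the calling instruction sits.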

From the NET help:

The /INCREMENTAL option controls how the linker handles incremental linking.

By default, the linker runs in incremental mode. To override a default incremental link, specify /INCREMENTAL:NO.

An incrementally linked program is functionally equivalent to a program that is nonincrementally linked. However, because it is prepared for subsequent incremental links, an incrementally linked executable (.exe) file or dynamic-link library (DLL):

Is larger than a nonincrementally linked program because of padding of code and data. (Padding allows the linker to increase the size of functions and data without recreating the .exe file.)
May contain jump thunks to handle relocation of functions to new addresses.
Note To ensure that your final release build does not contain padding or thunks, link your program nonincrementally.


It's been a while since I checked, but I believe non-incremental linking will give you FF-style calls by default.

Since you seem to prefer hard coding to macros (like me), this might be a good time to talk about the nutty thing I've been making MASM do.

Here is an example of prepping imports for use with indirect calls and invoke.

In your inc file (if you prefer):


externdef _imp__Direct3DCreate9@4:NEAR
Direct3DCreate9 EQU FCALL@4 PTR _imp__Direct3DCreate9@4

What's FCALL@4, you say? Well, you can use your own names, but the syntax is the same:


LCALL@0 TYPEDEF proto
FCALL@0 TYPEDEF PTR LCALL@0
LCALL@4 TYPEDEF proto :dword
FCALL@4 TYPEDEF PTR LCALL@4
LCALL@8 TYPEDEF proto :dword,:dword
FCALL@8 TYPEDEF PTR LCALL@8
etc.

It works. I can't fully explain why, but it works.


invoke Direct3DCreate9,NULL

Now for the nutty stuff:

My kernel32.inc now only uses externdefs:


externdef _imp__GetModuleHandleA@4:NEAR

No equates, because... I create the imports in a data section:


_DATA SEGMENT
__imp__ExitProcess@4:
dd offset _imp__ExitProcess@4
__imp__GetModuleHandleA@4:
dd offset _imp__GetModuleHandleA@4
__imp__VirtualAlloc@16:
dd offset _imp__VirtualAlloc@16
__imp__VirtualFree@12:
dd offset _imp__VirtualFree@12
_DATA ENDS

And make a structure definition:


_KERN STRUC
ExitProcess FCALL@4 __imp__ExitProcess@4
GetModuleHandle FCALL@4 __imp__GetModuleHandleA@4
VirtualAlloc FCALL@16 __imp__VirtualAlloc@16
VirtualFree FCALL@12 __imp__VirtualFree@12
_KERN ENDS

Then in code:


ASSUME ecx:PTR _KERN

lea ecx,_imp__ExitProcess@4 ;This is the address of the top element.

invoke [ecx].GetModuleHandle, NULL

Why do it this way? I'm not really sure yet, but I am assuming call reg+index is faster than call reg mem.
(You did say you want fast)

Reg+index is shorter in code bytes than reg mem by about half, as long as the index fits in one byte.

kernel32.lib is in alphabetical order, so you cannot put a much-used API at the top of the _DATA section. Reg+0 is 2 bytes, reg+4 is 3 bytes (as is any offset up to 127)... reg+260 needs a 4-byte offset, so it is 6 bytes, the same as reg mem. So if you are making 64k demos, this is better for the first 32 entries. But is it faster? Your import struct must be in the same order as the imports are in the lib.
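
For reference, the byte counts can be read off the encodings. The sketch below uses a made-up table of 66 pointers to a do-nothing Stub, so every call is safe to execute; Table and Stub are hypothetical names, not part of the scheme above. Note that an offset above 127 no longer fits a one-byte displacement, so [ecx+260] comes out at 6 bytes, the same as the absolute form.

```asm
.386
.model flat, stdcall
option casemap:none

includelib kernel32.lib                 ; assumption: LIB path is already configured
externdef _imp__ExitProcess@4:DWORD

.code
Stub proc                               ; hypothetical target, returns immediately
    ret
Stub endp

.data
Table dd 66 dup (offset Stub)           ; 66 slots, byte offsets 0..260

.code
start:
    mov  ecx, offset Table
    call dword ptr [ecx]                ; FF 11                2 bytes (no displacement)
    call dword ptr [ecx+4]              ; FF 51 04             3 bytes (8-bit displacement)
    call dword ptr [ecx+124]            ; FF 51 7C             3 bytes (last slot reachable this way)
    call dword ptr [ecx+128]            ; FF 91 80 00 00 00    6 bytes (32-bit displacement)
    call dword ptr [ecx+260]            ; FF 91 04 01 00 00    6 bytes
    call dword ptr [Table]              ; FF 15 <abs32>        6 bytes (absolute form)
    push 0
    call dword ptr [_imp__ExitProcess@4]
end start
```

One caveat for real API calls: ecx is a scratch register in the Win32 stdcall convention, so the callee may change it. Stub leaves it alone here, but with actual imports the base would have to be reloaded after each call or kept in ebx, esi or edi instead.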

Lately I have been figuring out how to cram anything that can be crammed into strucs, into strucs.

I have successfully crammed imports.
Posted on 2003-07-31 12:35:32 by ThoughtCriminal

It's been a while since I checked, but I believe non-incremental linking will give you FF-style calls by default.

No. Your input assembly code determines which will be used.

Hardcoding sucks btw - we're not in the 70s anymore, and why not use features that make life easier with nada size/speed overhead?

Also, don't assume - test. And do realize that your kind of optimizations are utterly useless on calls to API, apart from perhaps making you "feel better". I'd spend my time on more important parts.
Posted on 2003-07-31 12:42:48 by f0dder

Heya, didn't see this thread until now. You want to relocate parts of your code at runtime? Mmmkay. I assume you already know that CALL and JMP/Jcc are eip-relative, and will only work within the code blocks you relocate, unless you relocate everything by the same amount - which is why you're also having problems with the API calls.


Yes, I understand the Call/Jxx are EIP relative (that was actually the easy part), the hard part was referencing a Windows call within the same code block.

Originally posted by f0dder
The IAT (Import Address Table) has already been hinted at, along with the "dummy thunks" (ie, CALL j_MessageBoxA stuff). What you need is a way of doing "call dword ptr [...]" instead of "call MessageBox". The reason you need this is that indirect calls/jumps on x86 have the non-relative address in the instruction encoding, instead of the relative one that a normal call/jcc has.


That call dword ptr, would the PE loader fix that up?

Originally posted by f0dder
Normal import libraries define a few symbols per import. You get stuff like MessageBoxA@16 and imp__MessageBoxA@16 - and as you might or might not already know, many APIs have both narrow/Ansi and unicode/Wide forms - it's the assembler (include files) that handle the "friendly name mapping" of MessageBox -> MessageBoxA or MessageBoxW - and, in the case of masm, the @NumberOfBytes is handled by masm, according to the PROTO.


I know about Ansi and Wide (Unicode) versions of functions, though I wasn't completely aware of the @ usage. A bit of learning coming from Motorola code, but this is just the differences in addressing methods; other than that the instructions are very similar.

Originally posted by f0dder
So... you need to have MessageBox textequ'ed to imp__MessageBoxA@16 instead of MessageBoxA (which is protoed), and you need to set up some externdefs for the imp_* DWORDs. I believe the aforementioned tool of hutch's might do that - so this post was mainly to shed a little light (I hope) on how/what/why.


I've got .inc's listing all of them, I'll tinker with the WinAsm and see if I can't GPF the thing a few times. :grin: Not like I haven't been. No, it's not WinAsm that I'm GPFing, it's the code I'm running. Need to look at OllyDbg.

Originally posted by f0dder
Feel free to ask if there's anything that didn't come out clearly enough, this heat and humidity and lack of sleep is killing me.


externdef _imp__GetModuleHandleA@4:NEAR
GetModuleHandle TEXTEQU _imp__GetModuleHandleA@4

That tells the assembler there's an external def (I understand the externdef and the TEXTEQU), therefore if I do:
GethMod: Jmp(GetModuleHandle)

And I'd merely Call GethMod to do the call. (I hope I'm understanding this properly.) Please, just humor me and tell me it's not that bloody simple. Hmm, actually, re-reading what you said, I'm not sure I am understanding it. Those references (GetModuleHandleA for example) aren't "direct" addresses, but offsets in a call table that the PE loader (as said above) fixes up as it loads. Will the PE loader still do that on a Call, or just on Jmps?

And lastly, Sleep? What's that? Isn't that an API call, for others to do work while you sit still? Hmm...
Posted on 2003-07-31 23:47:31 by FunkyMeister

Hmm, I think I know why coffee is so expensive over there. :grin: It's so programmers don't get fully wired up and not able to sleep for months. Sleep, now that's a luxury programmers often skip. :)

I'll read this in the AM (in a few hours, really just getting a nap in between coding sessions, gonna see if I can BSOD this sucker before I hit the bed, so I can just turn it off fast). :grin:

I will read that again when I've regenerated some, so my brain can unstrain.
Posted on 2003-07-31 23:55:21 by FunkyMeister
Originally posted by f0dder
No. Your input assembly code determines which will be used.


I wondered about that, though figured it would be, since the assembler decodes what you wrote into a usable instruction.

Originally posted by f0dder
Hardcoding sucks btw - we're not in the 70s anymore, and why not use features that makes life easier with nada size/speed overhead?


Yes, it does, but I'm used to hard coding from Motorola (those evil, no wait, nice people, yes, be nice up front, then stab in back while not around ). As for using the features, yes, I plan to on other projects, but the one I need this for needs relocatability, and the assembler keeps sticking the API calls outside of the chunk I want to move as a block. All I'm trying to do is keep it in the same chunk (if there's a way to tell the assembler to stop being a bonehead and keep it together, that'll make my life easier).

Originally posted by f0dder
Also, don't assume - test. And do realize that your kind of optimizations are utterly useless on calls to API, apart from perhaps making you "feel better". I'd spend my time on more important parts.


If testing involves GPF's, yes, I will. :) I know the speed thing was for Coffee Deprived above (smile, I'm tired), though I'm trying to cut down on processor abuse to get a decent speed out of the code (the faster the better as it'll get used and abused a lot), so my overall code will actually be faster.

I'm off to BSOD. :)
Posted on 2003-08-01 00:01:46 by FunkyMeister
Hardcoding sucks btw - we're not in the 70s anymore, and why not use features that make life easier with nada size/speed overhead?


I have yet to see a macro tutorial, until then hard coding for me.

Also, don't assume - test. And do realize that your kind of optimizations are utterly useless on calls to API, apart from perhaps making you "feel better". I'd spend my time on more important parts.


That confirms what I thought. Unless you are coding for size, it won't make much of a speed difference.



f0dder- I just do this for fun. Any program I get my hands on, I proceed to abuse and misappropriate functionality or force it into unintended uses.
Programming, 3D software, or our game editor. This keeps me sharp as the tech lead under the lead programmer. Making games is a very technical job. Doing "think outside the box" stuff like this keeps me ready to think outside the box at my job. Learning how to break the program, fix the program, or force something to work are all skills that are needed. Games tend to use bleeding-edge tech, so there is not always a clear body of work and a proper way to do things.

After all this fun with MASM strucs, I've been thinking of turning some of my attention to C++ and seeing what kind of standard-abusing monstrosities I can implement. :grin:
Posted on 2003-08-01 01:39:06 by ThoughtCriminal
ThoughtCriminal, it's a-okay doing it for fun - just silly if you end up doing useless micro-optimizations in production code ^_^

FunkyMeister, let's see if I can explain this properly :).

The __imp_* are just dwords, as you already know. You can use them with "call dword ptr [__imp_MessageBoxA@16]", and your code will be fully relocatable, since an indirect call/jmp has the absolute address of the pointer hardcoded as part of the instruction instead of a relative one (of course this means the import table will have to remain in place, and you can't move the code to another process's address space - but you can relocate it freely in your own memory).
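
A sketch of that idea, assuming a block that only uses absolute references (data offsets and the import slot) and no relative calls out of the block. BlockStart, BlockEnd and the strings are made-up names, the bare includelib names assume the LIB path is set up, and the numeric constants are the usual Win32 values for MEM_COMMIT or MEM_RESERVE and PAGE_EXECUTE_READWRITE.

```asm
.386
.model flat, stdcall
option casemap:none

includelib kernel32.lib                 ; assumption: LIB path is already configured
includelib user32.lib

externdef _imp__VirtualAlloc@16:DWORD
externdef _imp__MessageBoxA@16:DWORD
externdef _imp__ExitProcess@4:DWORD

.data
szTitle db "Demo",0                     ; hypothetical strings, for illustration
szText  db "Running from a relocated copy",0

.code
; The block to be moved. Every reference in it is absolute.
BlockStart:
    push 0                              ; uType = MB_OK
    push offset szTitle
    push offset szText
    push 0                              ; hWnd = NULL
    call dword ptr [_imp__MessageBoxA@16] ; FF 15 abs32, still valid after the copy
    ret
BlockEnd:

start:
    push 40h                            ; flProtect = PAGE_EXECUTE_READWRITE
    push 3000h                          ; flAllocationType = MEM_COMMIT or MEM_RESERVE
    push BlockEnd - BlockStart          ; dwSize
    push 0                              ; lpAddress = NULL, let the system choose
    call dword ptr [_imp__VirtualAlloc@16]
    test eax, eax
    jz   done

    mov  edi, eax                       ; destination: the new memory
    mov  esi, offset BlockStart         ; source: the original block
    mov  ecx, BlockEnd - BlockStart
    cld
    rep  movsb                          ; copy the block byte for byte

    call eax                            ; run the copy, not the original
done:
    push 0
    call dword ptr [_imp__ExitProcess@4]
end start
```

Had the block used a plain "call MessageBoxA" through the jump thunk, the E8 displacement inside the copy would point at the wrong place, which is the problem described earlier in the thread.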

The thing that makes the __imp_* DWORDs special, is the sections and order they're placed in... the linker puts them in a section of the PE Executable known as the "import table" - which means windows will be fixing up the DWORDs when your exe loads.

With your

externdef _imp__GetModuleHandleA@4:NEAR
GetModuleHandle TEXTEQU _imp__GetModuleHandleA@4

you should be able to do "call GetModuleHandle", or at least "call [GetModuleHandle]", or perhaps "call dword ptr [GetModuleHandle]" - nasm and its wonderful ambiguous syntax. I prefer the "call [GetModuleHandle]" form since it shows what you're doing - you're calling the memory address stored in the GetModuleHandle DWORD variable, where "call GetModuleHandle" looks too much like "transfer execution to the address of GetModuleHandle". It's not too important wrt. API calls since it's obvious what you want - I still like being explicit in my assembly routines though; I only write asm when I _need_ to be that specific anyway ;)
Posted on 2003-08-01 07:11:25 by f0dder

Okay, so I understand the import, but don't understand how I obtained the address directly from the __imp_MessageBoxA@16, the externdef was able to determine the proper offset?

Originally posted by f0dder
The thing that makes the __imp_* DWORDs special, is the sections and order they're placed in... the linker puts them in a section of the PE Executable known as the "import table" - which means windows will be fixing up the DWORDs when your exe loads.


That does make sense, I figured that's why they were all winding up where they did.

Originally posted by f0dder
you should be able to do "call GetModuleHandle", or at least "call [GetModuleHandle]", or perhaps "call dword ptr [GetModuleHandle]" - nasm and its wonderful ambiguous syntax. I prefer the "call [GetModuleHandle]" form since it shows what you're doing - you're calling the memory address stored in the GetModuleHandle DWORD variable, where "call GetModuleHandle" looks too much like "transfer execution to the address of GetModuleHandle". It's not too important wrt. API calls since it's obvious what you want - I still like being explicit in my assembly routines though; I only write asm when I _need_ to be that specific anyway ;)


Hmmm, now here's the big bus in the works (wrench is too small). I'm doing an "as needed" approach to pulling code into memory. (IE: à la Macintosh) When instructions that are calling functions in the main routine are needed, they're added to the local jump table and pulled in from a segment of the file, sort of like the swap space does with ram. Problem is, will doing it this way ensure that the code that just got pulled in (call it a chunk for lack of a better term, been coding on less sleep than the other day) has the correct address for the API calls? Or did the bus just crash? (Bus, oops, I am tired, guess it works both ways for this.) :D

And the biggest annoyance, no BSOD last night. Bloody thing worked. Though I ran into some weirdness with subclassing MSComCtlLib.ListView, the window messages coming in were, to say the least, not normal. I'll have to check to see how the dragmode is setup on it, chances are I've messed that and caused myself more grief than I want.

Off for that horizontal thing called a nap. (Like I sleep...)
Posted on 2003-08-01 23:38:28 by FunkyMeister

I'm doing an "as needed" approach to pulling code into memory. (IE: A 'la Macintosh)

Windows is doing this already - pages are pulled in from the executable as needed.


Problem is, will doing it this way, ensure the code that just got pulled in (call it a chunk for a lack of a better term, been coding on less sleep than the other day), will have the correct address for the API calls doing this method?

Humm... if they're referencing the import table, it should work... but I guess it depends on details of your scheme. Sounds like you're doing weird stuff ^_^
Posted on 2003-08-03 10:52:15 by f0dder