Hi,
I am coding a disassembly engine in MASM. Because it tracks the stack, I have to recognize _cdecl calls: unlike _stdcall, _fastcall, or _pascal functions, they don't clean the stack on exit.

Platform: all Windows versions, all executable formats.

Does anyone have a good idea how to recognize this?

Thanks to all.

:confused:
Posted on 2003-05-30 14:01:03 by MazeGen
tiny suggestion

The standard C calling convention will compile to something like

call <function> ; call the function
mov <variable>, eax ; save the return value
add esp, <a number> ; restore the stack pointer, so it's C CALL

STDCALL to something like

call <function> ; call the function
mov <variable>, eax ; save the return value


Do multiple (passive) passes and look for this pattern; then you know which routine uses which convention.
It will probably choke on SMC (self-modifying code), though.
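
A minimal sketch of that check, in C rather than MASM to keep it short. It assumes 32-bit code and only looks at the single instruction at the address you hand it, so skipping an intervening MOV would need a real instruction-length decoder. The name caller_cleanup_bytes is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: p points at the instruction to inspect (for example
   the one right after a CALL). Returns the number of bytes an ADD ESP,imm
   removes from the stack, or 0 if the instruction is something else. */
int32_t caller_cleanup_bytes(const uint8_t *p, size_t len)
{
    /* 83 C4 ib : add esp, imm8 (sign-extended) */
    if (len >= 3 && p[0] == 0x83 && p[1] == 0xC4)
        return (int8_t)p[2];

    /* 81 C4 id : add esp, imm32 (little-endian) */
    if (len >= 6 && p[0] == 0x81 && p[1] == 0xC4)
        return (int32_t)((uint32_t)p[2] | (uint32_t)p[3] << 8 |
                         (uint32_t)p[4] << 16 | (uint32_t)p[5] << 24);

    return 0;
}
```

A nonzero result only says the caller adjusted ESP at that point; whether the adjustment really releases call arguments still has to be decided by the stack tracking.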
Posted on 2003-05-30 16:14:28 by Hiroshimator
wsprintf is the only Windows API that uses the C calling convention, so that should make it easier to identify them ;)
Posted on 2003-06-01 04:54:54 by donkey
That's not true.

The COM APIs have other routines that use the C calling convention (and I'm sure there are others, not to mention handcrafted routines); *every* API with a variable number of parameters uses it.
Posted on 2003-06-01 11:19:47 by Hiroshimator
Thanks for your suggestions.

SMC will be recognized in a previous pass, so it's no problem.

A problem arises when the code uses an anti-disassembly trick. One example:

push something1
push parameter
call <function> ; call the function
mov <variable>, eax ; save the return value
pop something2 ; _cdecl: POP parameter, something2 is no longer needed
; non _cdecl: POP something1

An anti-disassembly trick can imitate _cdecl with POP reg (or ADD esp, <a number>), but it removes only "something1" from the stack, not the parameters.

I found one solution today, but it doesn't recognize this anti-disassembly trick, so it's worthless.
I'm still researching...
:confused:
Posted on 2003-06-01 15:46:39 by MazeGen
So, here is my summary:

If the code doesn't use the anti-disassembly trick mentioned above, I've found an algorithm to recognize _cdecl:

1) In this case it is clear that the stack items are no longer needed:
call <function> ; call the function
mov <variable>, eax ; save the return value
add esp, <a number> ; restore the stack pointer, so it's C CALL

2) But if the stack is restored using POP reg, it is not so clear:

push something1 ; we don't know what is PUSHed - could be anything
... ...
call far <function> ; call the function
mov <variable>, eax ; save the return value
pop ecx ; restore the stack pointer
... ... ; the disassembler has to watch ECX from here on
mov ecx,something3 ; if ECX is overwritten before it is read, the popped value was discarded -> C CALL
; otherwise it was POP something1 -> no C CALL!
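
A rough sketch of the "watch ECX" step in case 2, written in C for brevity. It assumes the disassembler has already classified, for each instruction after the POP, how that instruction touches the popped register; the enum and the function name are invented for this example. An instruction that both reads and writes the register (such as add ecx,1) has to be reported as a read.

```c
#include <stddef.h>

/* How one instruction after the POP touches the popped register. */
enum reg_use { USE_NONE, USE_READ, USE_WRITE };

/* Returns 1 if the register is overwritten before it is read (the popped
   value was discarded, so the POP only adjusted the stack), 0 if it is read
   first (the POP restored a saved value), and -1 if the trace ends before
   either happens. */
int popped_value_is_dead(const enum reg_use *uses, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (uses[i] == USE_READ)
            return 0;
        if (uses[i] == USE_WRITE)
            return 1;
    }
    return -1;
}
```

This only follows one straight-line path; jumps and calls between the POP and the first use would need the same check on every path.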

When the disassembler doesn't know whether the code uses an anti-disassembly trick, the only way (I think) is to go inside the called procedure, trace every instruction, and watch the stack until the RETF instruction :eek:
Posted on 2003-06-08 14:51:15 by MazeGen
Hi,

The problem with using pop to clear the stack is that you have no idea whether the code is preserving registers or not. It is common for code to push and pop registers around a call to preserve them.

I have another suggestion. You can look at the retn opcode. A C-convention function just pops the return address off the stack into eip and leaves the stack cleanup to the caller. So for the C calling convention the return-near opcode will usually have no immediate. However, this could be confused with functions that take no parameters and with some weird code. For example:



strlenW:
pop eax
pop edx
push eax
xor ecx,ecx
_repeat:
mov eax,[edx+ecx*2]
add ecx,1
test eax,0ffffh
jz _out
add ecx,1
test eax,0ffff0000h
jnz _repeat
_out:
mov eax,ecx ;xchg eax,ecx to save one byte
retn
Posted on 2003-06-09 07:13:25 by roticv
Sorry, I don't understand which RETN opcode you mean, so I don't know why you think RETN would be without an immediate for the C calling convention.
And what does this code do?
:confused:

Please, give me more details. Maybe I'm dumb :)
Posted on 2003-06-09 16:40:17 by MazeGen

1) In this case it is clear that the stack items are no longer needed:
call <function> ; call the function
mov <variable>, eax ; save the return value
add esp, <a number> ; restore the stack pointer, so it's C CALL
Not quite right. If the function is in an expression (example: total = toint(chr) + temp;), EAX will be used after the stack is restored.
push chr ; stack arg
call toint ; call function
add esp,4 ; restore stack
add eax,temp ; finish calculation
mov total,eax ; store
I have also seen this optimization in some compilers:
push arg3
push arg2
push arg1
call func1
push arg4
push arg3
push arg2
push arg1
call func2
add esp,7*4 ; clear arguments of both function calls
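
One way a stack watcher could account for this deferred cleanup, sketched in C. It is a simplification that assumes every PUSH since the last cleanup is a call argument (no saved registers in between) and that the disassembler has already reduced the code to a list of events; the types and the function name are invented for this example.

```c
#include <stddef.h>

enum ev_kind { EV_PUSH, EV_CALL, EV_ADD_ESP };

struct ev {
    enum ev_kind kind;
    int bytes;          /* size pushed or added to ESP; unused for EV_CALL */
};

/* Counts the calls whose arguments are released by a later ADD ESP that
   matches exactly the bytes still pending on the stack. */
int count_caller_cleaned_calls(const struct ev *e, size_t n)
{
    int pending_bytes = 0, pending_calls = 0, cleaned = 0;

    for (size_t i = 0; i < n; i++) {
        switch (e[i].kind) {
        case EV_PUSH:
            pending_bytes += e[i].bytes;
            break;
        case EV_CALL:
            pending_calls++;
            break;
        case EV_ADD_ESP:
            /* One ADD ESP may clear the arguments of several calls. */
            if (e[i].bytes == pending_bytes)
                cleaned += pending_calls;
            pending_bytes = 0;
            pending_calls = 0;
            break;
        }
    }
    return cleaned;
}
```

Fed the sequence from this post (three pushes, a call, four pushes, a call, add esp,28), it reports both calls as caller-cleaned. A mismatched ADD ESP is simply treated as inconclusive here.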
Posted on 2003-06-09 17:00:42 by tenkey
Well, the code is just a strlen for Unicode. It is just an example of code using this pattern:


function:
pop eax
pop reg ; first parameter
pop reg ; second parameter
push eax
...
retn

retn = return near (C3h)
A retn without an immediate means that the function does not clear the stack; it just pops the return address off the stack.
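
A small sketch of how a disassembler could read this off the return opcode, in C for brevity. The opcodes are the standard ones (C3 retn, C2 retn imm16, CB retf, CA retf imm16); the function name is made up. As said earlier, a result of 0 does not prove the C convention: a function without parameters, or one that pops its own arguments like the example, also ends in a plain retn.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: p points at a return instruction. Returns the number
   of argument bytes the callee itself removes (0 for a plain RETN/RETF), or
   -1 if the byte at p is not a return opcode. */
int callee_cleanup_bytes(const uint8_t *p, size_t len)
{
    /* C3 = retn, CB = retf: only the return address is popped */
    if (len >= 1 && (p[0] == 0xC3 || p[0] == 0xCB))
        return 0;

    /* C2 iw = retn imm16, CA iw = retf imm16: callee pops imm16 more bytes */
    if (len >= 3 && (p[0] == 0xC2 || p[0] == 0xCA))
        return p[1] | (p[2] << 8);

    return -1;
}
```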
Posted on 2003-06-10 04:16:30 by roticv
Hi,
for roticv:
Yes, I understand now. I forgot to emphasize that the problem is about FAR calls, where I can't go inside the called procedure. With CALL NEAR it is simple: the disassembler goes inside and watches the stack, including the RETN instruction. With CALL FAR the disassembler doesn't, because it runs at CPL 3 and not all FAR calls run at CPL 3 (I think WinAPIs are an example, but I have no direct experience), quite apart from the time it would cost.

For tenkey:
Thanks for new suggestions.
The first code example is OK; from the disassembler's point of view it doesn't matter whether the MOV <variable>, EAX instruction comes before the arguments are cleared or not.
The second example is exactly what I need. If you have more examples, please send them; I have to take them into consideration.

;)
Posted on 2003-06-10 15:24:40 by MazeGen