Hi,

Sorry if this has been asked before, but I'm having trouble understanding how MASM balances the stack.

I've been disassembling a couple of programs to see how MASM generated the code and I came across this oddity. The program I disassembled was Iczelion's Tutorial 3 WIN program.

Now I know that each time new data is added to the stack, the stack pointer is decremented. So it makes sense that undoing that would involve adding to the stack pointer. This much I understand; what I don't understand is why MASM added the amount that it did.

Here is the program that I was disassembling:

.386

.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

WinMain proto :DWORD,:DWORD,:DWORD,:DWORD

.data
ClassName db "SimpleWinClass",0
AppName db "Our First Window",0

.data?
hInstance HINSTANCE ?
CommandLine LPSTR ?
.code
start:
invoke GetModuleHandle, NULL
mov hInstance,eax
invoke GetCommandLine
mov CommandLine,eax
invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
invoke ExitProcess,eax

WinMain proc hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
LOCAL wc:WNDCLASSEX
LOCAL msg:MSG
LOCAL hwnd:HWND
mov wc.cbSize,SIZEOF WNDCLASSEX
mov wc.style, CS_HREDRAW or CS_VREDRAW
mov wc.lpfnWndProc, OFFSET WndProc
mov wc.cbClsExtra,NULL
mov wc.cbWndExtra,NULL
push hInstance
pop wc.hInstance
mov wc.hbrBackground,COLOR_WINDOW+1
mov wc.lpszMenuName,NULL
mov wc.lpszClassName,OFFSET ClassName
invoke LoadIcon,NULL,IDI_APPLICATION
mov wc.hIcon,eax
mov wc.hIconSm,eax
invoke LoadCursor,NULL,IDC_ARROW
mov wc.hCursor,eax
invoke RegisterClassEx, addr wc
INVOKE CreateWindowEx,NULL,ADDR ClassName,ADDR AppName,\
WS_OVERLAPPEDWINDOW,CW_USEDEFAULT,\
CW_USEDEFAULT,CW_USEDEFAULT,CW_USEDEFAULT,NULL,NULL,\
hInst,NULL
mov hwnd,eax
invoke ShowWindow, hwnd,SW_SHOWNORMAL
invoke UpdateWindow, hwnd
.WHILE TRUE
invoke GetMessage, ADDR msg,NULL,0,0
.BREAK .IF (!eax)
invoke TranslateMessage, ADDR msg
invoke DispatchMessage, ADDR msg
.ENDW
mov eax,msg.wParam
ret
WinMain endp

WndProc proc hWnd:HWND, uMsg:UINT, wParam:WPARAM, lParam:LPARAM
.IF uMsg==WM_DESTROY
invoke PostQuitMessage,NULL
.ELSE
invoke DefWindowProc,hWnd,uMsg,wParam,lParam
ret
.ENDIF
xor eax,eax
ret
WndProc endp
end start


The function that is balanced weird is WinMain.

The function starts out in the disassembly like this:


PUSH ebp
MOV ebp, esp
ADD esp, 0xB0


0xB0 is hexadecimal for 176. Now what I don't understand is why 176. After reading the first of Iczelion's tuts I was under the impression that you add to the stack pointer the number of bytes that you pushed onto the stack. In his example, after pushing 3 DWORD (I assume) parameters onto the stack, he added 12 to the stack pointer.

push [third_param] ; Push the third parameter
push [second_param] ; Followed by the second
push [first_param] ; And the first
call foo
add esp, 12 ; The caller balances the stack frame


Reading this I assumed that for each 4-byte parameter you add 4 to the stack pointer. So for 3 parameters, you add 12 (3 * 4), correct?

But the WinMain function only accepts 4 parameters, each of DWORD size. So 4 * 4 = 16, not 176. So why 176? I figured it must have something to do with the local variables that are declared at the beginning of the function. It turns out that can't be the case either.

Three variables are declared in the WinMain function, each of a certain type: WNDCLASSEX, MSG, HWND. I figured that these must add up to 176 bytes, or at least 160 bytes (176 - 16). WNDCLASSEX is about 48 bytes.

Here it is in BASIC Style Code:
WNDCLASSEX Structure
cbSize as DWORD
style as DWORD
lpfnWndProc as DWORD
cbClsExtra as DWORD
cbWndExtra as DWORD
hInstance as DWORD
hIcon as DWORD
hCursor as DWORD
hbrBackground as DWORD
lpszMenuName as DWORD
lpszClassName as DWORD
hIconSm as DWORD
WNDCLASSEX EndStructure


MSG is about 28 bytes:

POINT Structure
x as DWORD
y as DWORD
POINT EndStructure

MSG Structure
hwnd as DWORD
message as DWORD
wParam as DWORD
lParam as DWORD
time as DWORD
pt as POINT
MSG EndStructure


hwnd is a DWORD so that's 4 bytes.

Adding them all together gives about 80 bytes. That's half of what I'd need to reach 160, and even further from 176.
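
The tally above can be double-checked with a few lines of Python. This is only a rough sketch: it assumes every member listed above is a 4-byte DWORD with no padding, which should hold for 32-bit Windows, and the format strings and the locals_size helper are made up for illustration.

```python
import struct

# Field layouts as listed above, treating every member as a 4-byte DWORD.
# "<" selects standard sizes with no padding (an assumption for 32-bit code).
WNDCLASSEX_FMT = "<12I"   # cbSize .. hIconSm: 12 DWORDs
MSG_FMT = "<7I"           # hwnd, message, wParam, lParam, time, pt.x, pt.y
HWND_FMT = "<I"           # a single handle

def locals_size():
    """Return the total bytes needed for wc, msg and hwnd."""
    return sum(struct.calcsize(f) for f in (WNDCLASSEX_FMT, MSG_FMT, HWND_FMT))

print(locals_size(), hex(locals_size()))  # 80 0x50
```

So the three locals do come to 80 bytes, which is 50h in hexadecimal.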

So after trying my hardest to figure out what exactly I'm missing, I've decided to ask you guys. Again, I'm sorry if this has been asked before, but a search of the forum didn't turn up anything useful for me, so I had to start a new topic.

Any help anyone can give me as to why and how the stack should be balanced here will be greatly appreciated. Thank you.
Posted on 2004-09-07 01:25:24 by Anon32
Your disassembler is wrong. It should be


.text:00401031 ; int __stdcall sub_401031(HINSTANCE hInstance)
.text:00401031 sub_401031 proc near ; CODE XREF: start+26p
.text:00401031
.text:00401031 hWnd = dword ptr -50h
.text:00401031 Msg = MSG ptr -4Ch
.text:00401031 var_30 = WNDCLASSEXA ptr -30h
.text:00401031 hInstance = dword ptr 8
.text:00401031
.text:00401031 push ebp
.text:00401032 mov ebp, esp
.text:00401034 add esp, 0FFFFFFB0h


However, I still do not know why there is an add esp, -0B0h. The "add" reserves space for local variables, but the local variables are not that big from what I can see. Alignment does not explain it either, since I have only heard of the stack needing to be aligned to a DWORD.

Anyway, the stdcall calling convention is usually used, in which the callee cleans up the stack. The caller does not need to balance the stack, so


push [third_param] ; Push the third parameter
push [second_param] ; Followed by the second
push [first_param] ; And the first
call foo
add esp, 12 ; The caller balances the stack frame

is not the normal case.

Anyway the stack is balanced at the end of the routine with


.text:00401114 leave
.text:00401115 retn 10h
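
To see that this prologue and epilogue pair really leaves the stack balanced, here is a rough Python sketch that just tracks ESP through the call. It assumes a stdcall function with four DWORD arguments and 50h bytes of locals, as in the listing; the function name and the starting ESP value are made up for illustration.

```python
def simulate_winmain_call(esp: int) -> int:
    """Track ESP through a stdcall call shaped like the WinMain listing."""
    esp -= 4 * 4                     # caller pushes four DWORD arguments
    esp -= 4                         # call pushes the return address
    esp -= 4                         # push ebp
    ebp = esp                        # mov ebp, esp
    esp = (esp - 0x50) & 0xFFFFFFFF  # add esp, 0FFFFFFB0h reserves the locals
    esp = ebp                        # leave: mov esp, ebp ...
    esp += 4                         # ... followed by pop ebp
    esp += 4 + 0x10                  # retn 10h: pop return address, drop 16 argument bytes
    return esp

start_esp = 0x0012FFC4               # hypothetical starting value
print(simulate_winmain_call(start_esp) == start_esp)  # True
```

The 10h in retn 10h is the 4 * 4 = 16 bytes of arguments, so the caller never needs its own add esp.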
Posted on 2004-09-07 02:34:38 by roticv
83 C4 B0 is ADD ESP,-50h. That's just the size of those three items. The disassembler is slightly in error.
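
The decoding can be sketched in Python, assuming the standard 83 /0 ib form of ADD, where the one-byte immediate is sign-extended to 32 bits. The helper name is made up for illustration.

```python
def decode_add_esp_imm8(code: bytes) -> int:
    """Decode `83 C4 ib` (ADD ESP, imm8) and return the signed immediate."""
    if len(code) != 3 or code[0] != 0x83 or code[1] != 0xC4:
        raise ValueError("not an ADD ESP, imm8 encoding")
    imm8 = code[2]
    # The CPU sign-extends the byte, so values with bit 7 set are negative.
    return imm8 - 0x100 if imm8 & 0x80 else imm8

imm = decode_add_esp_imm8(bytes([0x83, 0xC4, 0xB0]))
print(imm, hex(imm & 0xFFFFFFFF))  # -80 0xffffffb0
```

A disassembler that prints the byte as 0xB0 without sign-extending it would show +176 instead of -80.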
Posted on 2004-09-07 12:00:37 by Sephiroth3
Thanks Sephiroth3 and roticv!

Out of curiosity, what disassembler do you guys use?
Posted on 2004-09-08 05:06:15 by Anon32
ida
Posted on 2004-09-08 05:22:31 by roticv
83 C4 B0 is ADD ESP,-50h. That's just the size of those three items. The disassembler is slightly in error.

Don't worry about it; that's for human eyes only. Assemblers (e.g. MASM, TASM...) know how to handle this:

test.asm (My test source.)
.code

start:
add eax, 0ffffffb0h
add eax, -50h
add eax, 0ffb0h
add eax, 0b0h
add eax, 0ffffb000h


test.lst (MASM assembled)
00000000 .code

start:
00000000 2 83 C0 B0 add eax, 0ffffffb0h
00000003 2 83 C0 B0 add eax, -50h
00000006 2 05 0000FFB0 add eax, 0ffb0h
0000000B 2 05 000000B0 add eax, 0b0h
00000010 2 05 FFFFB000 add eax, 0ffffb000h
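
The choice between the short and long forms in the listing can be sketched in Python. This is only an approximation of what the assembler does, assuming it picks 83 C0 ib whenever the 32-bit value fits in a signed byte and falls back to 05 id otherwise; the function name is made up for illustration.

```python
import struct

def encode_add_eax(value: int) -> bytes:
    """Encode ADD EAX, imm the way the listing above shows it."""
    value &= 0xFFFFFFFF                      # wrap to 32 bits
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -128 <= signed <= 127:
        # Short form: 83 C0 ib, immediate sign-extended by the CPU
        return bytes([0x83, 0xC0, signed & 0xFF])
    # Long form: 05 id, full little-endian 32-bit immediate
    return b"\x05" + struct.pack("<I", value)

for v in (0xFFFFFFB0, -0x50, 0xFFB0, 0xB0, 0xFFFFB000):
    print(encode_add_eax(v).hex(" "))
```

Note that the listing prints the 32-bit immediate as one dword, while this sketch prints the bytes in memory order, so 0000FFB0 appears as b0 ff 00 00.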
Posted on 2004-09-08 08:27:34 by Kestrel
There is a magic rule for balancing the stack; it's like the law of gravity: what goes up must come down. However many bytes you use on the stack at procedure entry, you must correct for at procedure exit.

With a stack frame you swap the roles of ESP and EBP. You can use the technique MASM uses with LEAVE, as it is designed to defeat a stall on exit; or, if you don't use a stack frame, you can use the version of RET that corrects the stack by a set number of bytes.

Convention has it that you use EBP+ addresses for parameters and EBP- addresses for locals, and it generally works OK, but there is no hard and fast rule as to where you start your local addresses. If you bother to look at some of the later VC generated code, it often puts locals at ESP+ addresses as part of its optimisation.

As long as you observe the requirement for the stack to be the same on exit as entry, you can do more or less whatever you like.
Posted on 2004-09-08 19:48:39 by hutch--
Remember the various calling conventions: STDCALL has the called function clean the stack, C has the caller clean the stack. FASTCALL is evil, it puts some parameters in registers and others on the stack :)
Posted on 2004-09-08 20:22:21 by f0dder