Your size optimization has nothing to do
with our discussion of LOCAL wc vs. GLOBAL.
The exe is small for one reason only:
you didn't use a .data section, placing all the ASCIIZ constants
in the .code section instead.
You don't need to do it this way:
jmp lbl
someasciiz db 'Text',0
lbl:
It is a waste of clocks and size.
If you want to have them in the code section,
just declare them after the .code statement but BEFORE
the entry point:
txtvar db 'Some text',0
somedata dd 12345

This way you put the data into the code section but don't need to jump
over it.
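
A minimal sketch of that layout, not taken from the posted source. It assumes a default MASM32 install (the include and lib paths are assumptions), and the MessageBox call is only there so the data gets used:

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc     ; assumed MASM32 paths
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.code
; Data is emitted at the start of the code section, before the entry point,
; so execution never runs into it and no jmp is needed to skip it.
txtvar   db 'Some text',0
somedata dd 12345

start:                                  ; entry point named on the END line
    invoke MessageBox, 0, offset txtvar, offset txtvar, MB_OK
    invoke ExitProcess, 0

end start
```

The data here is only read, so the code section does not need to be made writable for this variant.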

OK, to our discussion:
I failed to compile your source using your makefile and inc
(paths were changed, of course).
It asked me for a proto,
so I added a line to the inc file
(it is your macro).
This way it compiled, but the size was 2048.
It might be a different LINK or whatever. I don't know yet.
However, it is irrelevant to the discussion; I'll show why later.

As I said, the size is small only because the .data section is not used,
so I think it was fair enough to place the WNDCLASSEX structure in
the code section as well.
For clarity of the experiment I didn't optimize anything (though
some places look strange: for example, you have 0 in esi
but don't use it every time, in LoadIcon and LoadCursor for example).

What I did:
I placed a WNDCLASSEX struct with predefined members in the code section:

; -------------------------------------------------------------------------
wc WNDCLASSEX <sizeof wc,CS_VREDRAW or CS_HREDRAW,offset WndProc,,,400000h,,,\
COLOR_BTNFACE+1,,offset szClassName,>


Then I commented out the declaration of the local wc
and all code that moves values to members that are already filled in wc.
I left only the code for members that need to be filled at runtime, using edi as
a pointer to wc:
mov edi, offset wc ; edi = pointer to wc
assume edi: ptr WNDCLASSEX
It costs 5 extra bytes for mov edi,imm32
but makes addressing the members as short as in your code
(locals are addressed by ebp+offset),
and it also saves the bytes that you spend loading the address of wc into eax:
invoke RegisterClassEx, addr wc ; addr wc = lea eax,wc / push eax

That's all I did. I didn't change anything else except for specifying the .code section
as writable in the makefile.
And I got an exe of the same size.
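
The steps above can be sketched roughly as follows. This is an illustration rather than the modified source itself: the MASM32 include paths, the trivial WndProc and the immediate exit after registration are all assumptions made to keep the example short, and 400000h is only valid as the instance handle when the exe loads at the default image base.

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc     ; assumed MASM32 paths
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

WndProc proto :DWORD,:DWORD,:DWORD,:DWORD

.code
szClassName db 'smallwin_Class',0

; Members known at design time are preset; hIcon, hCursor and hIconSm stay 0.
wc WNDCLASSEX <sizeof WNDCLASSEX,CS_VREDRAW or CS_HREDRAW,offset WndProc,,,400000h,,,\
               COLOR_BTNFACE+1,,offset szClassName,>

start:
    mov edi, offset wc                  ; edi = pointer to wc (5 bytes)
    assume edi: ptr WNDCLASSEX

    invoke LoadIcon, 0, IDI_ASTERISK
    mov [edi].hIcon, eax                ; 3-byte stores via edi+disp8
    mov [edi].hIconSm, eax

    invoke LoadCursor, 0, IDC_ARROW
    mov [edi].hCursor, eax
    assume edi: nothing

    invoke RegisterClassEx, edi         ; address already in edi, no lea needed
    invoke ExitProcess, 0               ; illustration only: no window is created

; Hypothetical minimal window procedure so that the example links.
WndProc proc hWnd:DWORD, uMsg:DWORD, wParam:DWORD, lParam:DWORD
    invoke DefWindowProc, hWnd, uMsg, wParam, lParam
    ret
WndProc endp

end start
```

Because the three handle members are written at runtime, the code section has to be linked as writable, for example with something like link /SUBSYSTEM:WINDOWS /SECTION:.text,RWE (the exact makefile line is an assumption).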

Now let's look inside both of them and extract the only data that matters
for our discussion.
In your case you fill all members but don't spend size on a predefined structure.
In my case I spend 48 bytes of the code section on a predefined structure
but don't spend size on code to fill members that are already filled at design time.

LOCAL wc approach:

00401016 |. BF 00004000 MOV EDI,SMALLWIN.00400000
5 bytes
0040103A |> C745 B0 30000000 MOV [LOCAL.20],30
00401041 |. C745 B4 03000000 MOV [LOCAL.19],3
00401048 |. C745 B8 FF104000 MOV [LOCAL.18],SMALLWIN.004010FF
0040104F |. 8975 BC MOV [LOCAL.17],ESI
00401052 |. 8975 C0 MOV [LOCAL.16],ESI
00401055 |. 897D C4 MOV [LOCAL.15],EDI
00401058 |. C745 D0 10000000 MOV [LOCAL.12],10
0040105F |. 8975 D4 MOV [LOCAL.11],ESI
00401062 |. C745 D8 1D104000 MOV [LOCAL.10],SMALLWIN.0040101D ; ASCII "smallwin_Class"
47 bytes
00401076 |. 8945 C8 MOV [LOCAL.14],EAX
00401079 |. 8945 DC MOV [LOCAL.9],EAX

6 bytes
00401089 |. 8945 CC MOV [LOCAL.13],EAX
0040108C |. 8D45 B0 LEA EAX,[LOCAL.20]

6 bytes
5+47+6+6 = 64 bytes

Predefined wc approach:
size of WNDCLASSEX = 30h = 48 bytes
004010A3 |. BF 40104000 MOV EDI,Smallwin.00401040 ;pointer to wc
5 bytes
fill icon handles
004010B4 |. 8947 18 MOV [DWORD DS:EDI+18],EAX
004010B7 |. 8947 2C MOV [DWORD DS:EDI+2C],EAX
6 bytes
fill cursor handle
004010C6 |. 8947 1C MOV [DWORD DS:EDI+1C],EAX
3 bytes
48+5+6+3= 62 bytes

64 in your code
62 in mine.

Plus, since the wc data in my code does not get spoiled, I can reuse it
for other purposes in other window code.
Posted on 2002-12-05 07:27:17 by The Svin

The macro for the prototype style I used is from version 7 of MASM32. The only reason I used the alternative prototype form was that I had this example on this machine.

I also use the linker from the win98ddk as it has more capacity than the later versions, library production being just one of them.

I posted the 1536 byte example for a reason: it is built with dynamic code only and, for its functionality, the alternative preloaded static data design does not build smaller. I think your approach is clever, but it comes at the expense of a writable code section and a single global structure that has to be overwritten each time it is used.

You can get an advantage if you only ever change the classname for a registered windows class but it falls off quickly when you must change window styles, icons, cursors, window memory, background colour etc ...

Now putting the embedded data before the start label is a good idea except that you lose the proximity to the code that uses it to save a single jump and the code is no smaller for doing so.

I have sold the view for some time that close range micro optimisation at the mnemonics level does not produce smaller exe files but it certainly can produce code that is hard to read and hard to maintain and even harder to extend into useful application code.

There are in fact places where you can take advantage of preset data and you can get an advantage in performance terms but loading the parameters into a WNDCLASSEX structure for a RegisterClassEx call is simply not one of them.

Posted on 2002-12-05 18:44:12 by hutch--
Further trivial size optimised code.


; --------------------------------
; zero fill structure in 12 bytes
; --------------------------------
lea edi, wc
xor eax, eax
mov ecx, 12
rep stosd

xor esi, esi ; store zero in ESI
mov edi, 400000h ; store EXE instance handle in EDI

mov wc.cbSize, sizeof WNDCLASSEX
mov wc.style, CS_VREDRAW or CS_HREDRAW ; the mov dword ptr [ebp-4Ch],3 in the listing
mov wc.lpfnWndProc, offset WndProc
mov wc.hInstance, edi
mov wc.hbrBackground, COLOR_BTNFACE+1
mov wc.lpszClassName, offset szClassName

invoke LoadIcon,esi,IDI_ASTERISK
mov wc.hIcon, eax
mov wc.hIconSm, eax

invoke LoadCursor,esi,IDC_ARROW
mov wc.hCursor, eax


00401031 8D7DB0 lea edi,[ebp-50h] ; 12
00401034 33C0 xor eax,eax
00401036 B90C000000 mov ecx,0Ch
0040103B F3AB rep stosd

0040103D 33F6 xor esi,esi

; -------------------------------------
; used after loading structure as well, 5 bytes
; -------------------------------------
0040103F BF00004000 mov edi,400000h ; 5

00401044 C745B030000000 mov dword ptr [ebp-50h],30h ; 7
0040104B C745B403000000 mov dword ptr [ebp-4Ch],3 ; 7
00401052 C745B8FE104000 mov dword ptr [ebp-48h],4010FEh ; 7
00401059 897DC4 mov [ebp-3Ch],edi ; 3
0040105C C745D010000000 mov dword ptr [ebp-30h],10h ; 7
00401063 C745D800104000 mov dword ptr [ebp-28h],401000h ; 7
0040106A 68047F0000 push 7F04h
0040106F 56 push esi
00401070 FF1534204000 call dword ptr [LoadIconA]
00401076 8945C8 mov [ebp-38h],eax ; 3
00401079 8945DC mov [ebp-24h],eax ; 3
0040107C 68007F0000 push 7F00h
00401081 56 push esi
00401082 FF1530204000 call dword ptr [LoadCursorA]
00401088 8945CC mov [ebp-34h],eax ; 3

--------------------------------------------------total = 64 bytes

For 64 bytes, you save the extra bytes in the following CreateWindowEx call with the instance handle already being in EDI.

None of these things make the final EXE size any smaller, and if you enable the equate in the one I posted to exclude code in the WndProc, it will build bigger to the tune of 512 bytes.

Posted on 2002-12-05 21:55:32 by hutch--