Hola,

When we have multiple calls to CreateWindowEx, most of us probably wrap it in a proc of its own and call that with only the needed params to keep the code readable.

Like this:

invoke Window, hwnd, x, y, length, width, styles, etc..

Window proc hwnd:DWORD, xPos:DWORD, yPos:DWORD, etc

But this gets translated into quite a lot of pushes, right ? Like:

push etc
push styles
push width
push length
push y
push x
push hwnd
call OwnProc

OwnProc:
push ..
push ..
push ..
push etc
push more pushes
call CreateWindowEx

What if we instead used a struct that we fill in, and just passed a pointer to it when calling our OwnProc?

Like:

CWStruct struct
x DWORD ?
y DWORD ?
hwnd DWORD ?
styles DWORD ?
hInstance DWORD ?
ClassName DWORD ?
etc DWORD ?
CWStruct ends

and fill it in once, then update only the members that change? Most of the time that will be just the x or y position.

Wouldn't that dramatically decrease the size of the executable if there are like 20 CheckBoxes needed ?

We could even reserve some extra space in our struct for the case where the window is owner-drawn, check for that style in our OwnProc, and react appropriately (like giving it its own WindowProc and similar).

Are there possibilities? What do you guys think ?
Posted on 2002-08-07 10:24:39 by JimmyClif
yes!

nice idea.
Posted on 2002-08-07 11:37:54 by stryker
The CreateWindowEx parameters still need to be PUSHed onto the stack. I save a couple of bytes by using a CREATESTRUCT, e.g.:



cwe CREATESTRUCT \
< 0,\ ; lpCreateParams
0,\ ; hInstance
0,\ ; hMenu
0,\ ; hwndParent
CW_USEDEFAULT,\ ; cy
CW_USEDEFAULT,\ ; cx
CW_USEDEFAULT,\ ; y
CW_USEDEFAULT,\ ; x
myStyle,\ ; style
myWindowName,\ ; lpszName
myClassName,\ ; lpszClass
myExStyle > ; dwExStyle


in combination with the following CreateWindowEx wrapper:



option prologue:none
option epilogue:none

CWE proc lpCreateStruct:DWORD

mov eax, [esp+1*4] ; EAX = lpCreateStruct
mov ecx, (SIZEOF CREATESTRUCT) shr 2 ; ECX = no. of DWORDs to push
lea eax, [eax + ecx*4] ; EAX = pointer past end of struct
neg ecx ; ECX = -ECX

@@: push [eax + ecx*4]
inc ecx
jne @B

call CreateWindowEx
ret 1*4

CWE endp



So there is only one PUSH command (executed in a loop). Further, the CREATESTRUCT can be re-used for many windows, changing only the required parameters.

Is that what you are after?
Posted on 2002-08-07 14:18:33 by Frank
Yes, I was wondering if I had just invented a whole new perspective on saving bytes ;) Apparently I did not :grin:

And I thought I might use something like this at the end:



mov edx,lpCreateStruct
mov eax,[edx].CREATESTRUCT.style
and eax,0Fh ;isolate the button-type field (for buttons)
cmp eax,BS_OWNERDRAW
jz @is_ownerdrawn
;Here another compare to see if it's a Static Window etc...
ret

Any better way to see if any of the many ownerdrawn styles is set ?


This is basically creating a whole wrapper around CreateWindowEx ;)

...and .. nice approach you have there :)
Posted on 2002-08-07 14:29:25 by JimmyClif
I think this should work alright:



mov edx,lpCreateStruct
test [edx].CREATESTRUCT.style,BS_OWNERDRAW or CBS_OWNERDRAWFIXED or ...
jnz @is_ownerdrawn


I'm not sure if some of the xx_OWNERDRAW flags overlap with other styles though...
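On the overlap question: as far as I know, the owner-draw styles are not all independent bit flags. In the Windows headers BS_OWNERDRAW (0Bh) and SS_OWNERDRAW (0Dh) are values of the low-order type field of button and static styles, while CBS_OWNERDRAWFIXED / LBS_OWNERDRAWFIXED (10h) and CBS_OWNERDRAWVARIABLE / LBS_OWNERDRAWVARIABLE (20h) are single bits whose values are reused by unrelated styles of other classes (20h is BS_LEFTTEXT on a button, for example). A single combined TEST can therefore report false positives, so the check probably has to depend on the window class. A minimal sketch, assuming EDX points to the CREATESTRUCT, that the field is called style as in Frank's comments, and that the wrapper already knows which class it is creating (all labels are hypothetical and the two targets are defined elsewhere):

```asm
; --- class is BUTTON: owner-draw is a value in the low 4 bits, not a flag ---
check_button:
    mov  eax, [edx].CREATESTRUCT.style
    and  eax, 0Fh                   ; keep only the button-type field
    cmp  eax, BS_OWNERDRAW          ; 0Bh
    je   is_ownerdrawn
    jmp  not_ownerdrawn

; --- class is COMBOBOX: here the two owner-draw styles are real bit flags ---
check_combobox:
    test [edx].CREATESTRUCT.style, CBS_OWNERDRAWFIXED or CBS_OWNERDRAWVARIABLE
    jnz  is_ownerdrawn
    jmp  not_ownerdrawn
```

This costs a few more bytes than one TEST, but it should not mistake, say, a left-text check box for an owner-drawn control.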

--Chorus
Posted on 2002-08-07 19:26:53 by chorus
Just for the fun of it, and because this thread's title starts with the word "Theory" -- we can simplify things further by defining a new structure:


CWESTRUCT STRUCT

dwExStyle dd ?
lpClassName dd ?
lpWindowName dd ?
dwStyle dd ?
x dd ?
y dd ?
nWidth dd ?
nHeight dd ?
hWndParent dd ?
hMenu dd ?
hInstance dd ?
lpParam dd ?

CWESTRUCT ends

and using it in combination with this CreateWindowEx wrapper:


option prologue:none
option epilogue:none

CWE proc lpCWESTRUCT:DWORD

mov eax, [esp + 1*4] ; EAX = lpCWESTRUCT
mov ecx, (SIZEOF CWESTRUCT) shr 2 ; ECX = no. of DWORDs to push

@@: dec ecx ; push CWESTRUCT onto stack (by value)
push [eax + ecx*4]
jne @B

call CreateWindowEx
ret 1*4

CWE endp

I don't think that repeated window creation can be done in fewer bytes.
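To make the re-use concrete, here is a rough sketch of how two check boxes might be created from one template. It assumes the CWESTRUCT and CWE definitions above plus the usual windows.inc constants; cbTemplate, the strings, hCheck1 and hCheck2 are made-up names, and the code fragment is meant to sit inside your own initialization routine:

```asm
.data
szButton   db "BUTTON", 0
szFirst    db "First option", 0
szSecond   db "Second option", 0

; Filled in once, in CWESTRUCT field order.
; hWndParent, hMenu (the control ID) and hInstance are left 0 here and
; are assumed to be stored into the template at run time.
cbTemplate CWESTRUCT <0, OFFSET szButton, OFFSET szFirst, \
                      WS_CHILD or WS_VISIBLE or BS_AUTOCHECKBOX, \
                      10, 10, 150, 20, 0, 0, 0, 0>

.data?
hCheck1    dd ?
hCheck2    dd ?

.code
    ; first check box: use the template as it is
    push OFFSET cbTemplate
    call CWE
    mov  hCheck1, eax

    ; second check box: change only the caption and the y position
    mov  cbTemplate.lpWindowName, OFFSET szSecond
    mov  cbTemplate.y, 35
    push OFFSET cbTemplate
    call CWE
    mov  hCheck2, eax
```

Each additional control then costs only the stores for the fields that differ, plus one PUSH and one CALL.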

For the OWNERDRAWN styles, I assume that chorus' solution would be the best.
Posted on 2002-08-08 09:37:17 by Frank
Frank, maybe this is a bit faster?



option prologue:none
option epilogue:none

CWE proc lpCWESTRUCT:DWORD

mov edx, [esp] ; return address
mov ecx, (SIZEOF CWESTRUCT) shr 2 ; ECX = no. of DWORDs to push
mov eax, [esp+1*4] ; EAX = lpCWESTRUCT (parameter-order layout, because the loop pushes from the last field down)
add esp, 2*4 ; balancing the stack

@@: push [eax + ecx*4-4]
dec ecx
jne @B
push edx
jmp CreateWindowEx
CWE endp


To avoid the loop you can use MMX.
Posted on 2002-08-08 10:20:32 by lingo12
How about something like this?



CreateWindowIndirect PROC uses ecx edi esi lpCWESTRUCT:DWORD
mov ecx,(sizeof CWESTRUCT) shr 2
lea edi,[esp-sizeof CWESTRUCT]
mov esi,lpCWESTRUCT

rep movsd
sub esp,sizeof CWESTRUCT
call CreateWindowEx
ret
CreateWindowIndirect ENDP


--Chorus
Posted on 2002-08-08 12:09:34 by chorus
Lingo12, your proc is great but does crash. The reason is that you have the return address above the 12 CreateWindowEx parameters where it should be below them. Look at this proc to see the difference:


option prologue:none
option epilogue:none

CWE proc lpCWESTRUCT:DWORD

pop edx ; EDX = return address
pop eax ; EAX = lpCWESTRUCT
mov ecx, (SIZEOF CWESTRUCT) shr 2 ; ECX = no. of DWORDs to push

@@: dec ecx
push [eax + ecx*4]
jne @B

push edx ; last thing to push is the return address :-)
jmp CreateWindowEx

CWE endp

Nevertheless, thank you for the idea of JMPing to CreateWindowEx instead of CALLing it. :alright:

Posted on 2002-08-08 12:16:10 by Frank
Chorus, I think your procedure should work.

However, with the USES ECX ESI EDI declaration, your proc adds three register PUSHes as well as three register POPs. They are not visible in the source code, but will be visible in a disassembly. That is six additional bytes over other solutions.

For the REP MOVSD, Agner Fog's Pentium optimization manual discusses a couple of conditions that need to be met to make it fast. One of them is that the count (ECX) should be >= 64, which is not the case here (the count here is 12). Therefore I guess that a loop is faster than REP MOVSD in the present case. Of course that's theory, I have not timed the various procedures that were proposed here.

IMHO, your solution is good, but may be better suited to tasks where larger amounts of data need to be stored on the stack.
Posted on 2002-08-08 12:45:50 by Frank
Frank, thanks for the correction.

chorus, OK with rep movsd:


option prologue:none
option epilogue:none

CreateWindowIndirect PROC lpCWESTRUCT:DWORD
mov ecx, [esp] ; ecx->return address
mov edx, esi ; save register esi
mov esi, [esp+1*4] ; esi-> source buffer
mov eax, edi ; save register edi
cld ; clear direction flag
sub esp, sizeof CWESTRUCT-4 ; prepare the stack
lea edi, [esp+4] ; edi->destination address
mov [edi-4], ecx ; ecx->return address
mov ecx, (sizeof CWESTRUCT) shr 2 ; ecx->counter (in DWORDs, not bytes)
rep movsd ; copy
mov edi, eax ; restore edi and esi
mov esi, edx ; registers
jmp CreateWindowEx ; call API
CreateWindowIndirect ENDP

It is better to use the stack safely and skip the slow CALL and RET 4.
Posted on 2002-08-08 13:34:14 by lingo12
This is what I was thinking instead of my original movsd version:



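option prologue:none
option epilogue:none

CWE PROC lpCWESTRUCT:DWORD
pop edx ; EDX = return address
pop eax ; EAX = lpCWESTRUCT
sub esp,sizeof CWESTRUCT ; room for the 12 parameters
push edx ; return address goes below the parameters
push edi ; save EDI
lea edi,[esp+2*4] ; EDI = start of the parameter area
mov ecx,(sizeof CWESTRUCT) shr 2 ; ECX = no. of DWORDs to copy
xchg esi,eax ; ESI = source, EAX = saved ESI
rep movsd
xchg esi,eax ; restore ESI
pop edi ; restore EDI
jmp CreateWindowEx
CWE ENDP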


Still bigger than the other ones posted. Besides, I think there are a couple of changes we can make to the last routine Frank posted that will make it slightly faster and smaller:



option prologue:none
option epilogue:none

CWE proc lpCWESTRUCT:DWORD

pop edx ; EDX = return address
pop eax ; EAX = lpCWESTRUCT
xor ecx,ecx
mov cl,(sizeof CWESTRUCT-4) shr 2 ; ECX = index of the last DWORD to push

@@: push [eax + ecx*4]
dec ecx
jns @B

push edx ; last thing to push is the return address :-)
jmp CreateWindowEx

CWE endp


Basically, I changed up the loop so we (hopefully) won't suffer an AGI stall from the dec ecx/push pair. Also, I replaced mov ecx,value with xor ecx,ecx / mov cl,value because it's one byte smaller ;)
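For reference, these are the sizes I would expect from the standard 32-bit encodings when loading the loop index. The push/pop pair is not something proposed in this thread, just another option that may be worth checking in a disassembly, and it assumes the assembler picks the short imm8 form of PUSH:

```asm
    mov  ecx, 11        ; 5 bytes (opcode + imm32)

    xor  ecx, ecx       ; 2 bytes
    mov  cl, 11         ; 2 bytes (opcode + imm8), 4 in total

    push 11             ; 2 bytes (opcode + sign-extended imm8)
    pop  ecx            ; 1 byte, 3 in total
```

The push/pop variant goes through the stack, so whatever it saves in size it may well lose in speed.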

--Chorus

<EDIT>
Removed the extra line. Thanks for spotting that, Frank ;) I overlooked it.
</EDIT>
Posted on 2002-08-08 13:39:33 by chorus
Excellent, guys! :grin:

Chorus, in the procedure with the loop that you optimized, there is one line left over ("mov ecx, ...") that you may wish to delete.
Posted on 2002-08-08 14:46:57 by Frank
Frank, I don't think so!
We are forgetting about structure initialization. First we must initialize the structure, and then call another procedure to copy the data onto the stack and call the API. That is slow; it would be better to put the data directly on the stack when we initialize it and jmp directly to the first API. IMHO, if we have more APIs set up on the stack, they will run automatically.
Posted on 2002-08-08 15:10:27 by lingo12
Just be careful - this stuff isn't threadsafe.
Posted on 2002-08-08 16:05:19 by f0dder
Lingo12, I guess it's a trade-off between speed and size.

The loop method initializes the stack with "push /dec ecx/jns @B". That is 6 bytes altogether in the .CODE section.

The manual methods "invoke CreateWindowEx, params 1 to 12" and "push param1/push param2/.../call CreateWindowEx" involve 12 PUSHes to initialize the stack, each single one of them at the cost of 1 to 6 bytes in the .CODE section (depending on what exactly is being pushed: 1 byte for a register, up to 6 for a global variable). That is much more than 6 bytes altogether.

More importantly, each time you use one of the manual methods, you have to do all of the twelve PUSHes. So your costs in terms of bytes in the .CODE section increase each time you use the manual methods. Not so with the loop method: here you just change one or two parameters, then call the same old loop procedure once more. (NOTE: this is of course slightly simplified, for illustrative purposes.)

Conclusion: If program size doesn't matter or if the program does not have many windows/controls, then the manual methods may be preferable. But if the goal is to deliver a tiny program that has a lot of controls, then the loop method has its place.
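To put rough numbers on the manual method, here is one hand-written call with the size I would expect for each instruction. This is only a sketch: hInst, hWndMain, the strings and the control ID 1001 are made-up names and values, and it assumes the assembler picks the shortest encodings and that CreateWindowEx is reached through a CALL rel32:

```asm
    push 0                       ; lpParam        2 bytes (imm8)
    push hInst                   ; hInstance      6 bytes (global variable)
    push 1001                    ; hMenu = ID     5 bytes (imm32)
    push hWndMain                ; hWndParent     6 bytes (global variable)
    push 20                      ; nHeight        2 bytes
    push 150                     ; nWidth         5 bytes (does not fit in imm8)
    push 35                      ; y              2 bytes
    push 10                      ; x              2 bytes
    push WS_CHILD or WS_VISIBLE or BS_AUTOCHECKBOX ; dwStyle  5 bytes
    push OFFSET szSecond         ; lpWindowName   5 bytes
    push OFFSET szButton         ; lpClassName    5 bytes
    push 0                       ; dwExStyle      2 bytes
    call CreateWindowEx          ;                5 bytes (call rel32)
                                 ; total:        52 bytes
```

With the loop method, storing an immediate into a field of a global template should take 10 bytes (mov mem, imm32), and the PUSH OFFSET plus CALL to the wrapper another 10. Changing two fields therefore comes to about 30 bytes per control, and every further field that differs adds 10 more, so the gain depends on how many fields actually change between controls.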
Posted on 2002-08-08 17:57:55 by Frank