Hmm... I was thinking... why did you wipe my whole post? Can I only think for about 2 minutes? (Is there a time limit for writing a post?) :S :(



I'm not going to rewrite the whole post...

OK... OK, I'll write what I remember (this is... crazy).

OK, the question is: why can an unaligned operation on the stack cause things like the variables, addresses, or strings passed to a MessageBox to display incorrectly?

Unaligned data only costs more clock cycles, but what about the stack? The stack itself can handle unaligned operations, so why can't Windows? Is it a bug? A feature? A what? Is E.T. in my computer? :)

Yep, yep... OK, I shortened my post a little. I think it was a timing thing: the page seemed to reload by itself and erased my original post :S

OK, here is the correct exe; the "incorrect" exe is in the next post. As far as I can see they are equal except for the unaligned stack... but I thought that would only cost more cycles? What do you think?

Have a nice day... :)

correct exe
Posted on 2003-02-12 01:42:53 by rea
As you can read, the only difference is the unaligned stack...

Did I say "have a nice day"? lol, then have nice thoughts, ideas... Don't you have the word "ideas" in your language? You have "idea" :) I think ;)

c ya, bytes.

Ah yes, this is the "supposedly" incorrect one
Posted on 2003-02-12 01:45:40 by rea
It is very difficult to get hold of you :)

Try to have a look at this post:

http://www.asmcommunity.net/board/index.php?topic=10687


I will be on IRC (PTnet) at 2:00 at night.
Posted on 2003-02-14 09:48:02 by Nguga
Okay, I debugged your code, hgb. It was very compact, I think because you didn't register any classes. I was pretty impressed by the enter instruction, which I haven't used before. Anyway, I did a debug comparison of the .break .if statement against the cmp eax,0 / jle .nogo syntax, and I found that there really isn't a great deal of extra code being run. As for the need to jump to a near address, I can't see that taking up much extra time either. Perhaps it does; I'll have to look at both of your examples first. There is one thing, though: when I used .break, the debugger showed or eax,eax. I have heard that this isn't as efficient as cmp because of the architecture of Intel CPUs. From what I remember, the cmp instruction is stored almost on top of the conditional jumps. Are there any tools you can use to do a timing comparison? I wouldn't mind finding out whether or is a better option than cmp before a conditional jump.

cheers

Oh, and also, what did you use to code those examples?


The code output by the debugger came out like this:


This is the basic program, which uses the .break syntax.
Notice that at address 004010F9 it uses an OR instruction. I don't know why.

004010EA |> 6A 00 /PUSH 0 ; /MsgFilterMax = 0
004010EC |. 6A 00 |PUSH 0 ; |MsgFilterMin = 0
004010EE |. 6A 00 |PUSH 0 ; |hWnd = NULL
004010F0 |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C] ; |
004010F3 |. 50 |PUSH EAX ; |pMsg
004010F4 |. E8 5F000000 |CALL <JMP.&USER32.GetMessageA> ; \GetMessageA
004010F9 |. 0BC0 |OR EAX,EAX
004010FB |. 74 14 |JE SHORT basic.00401111
004010FD |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C]
00401100 |. 50 |PUSH EAX ; /pMsg
00401101 |. E8 76000000 |CALL <JMP.&USER32.TranslateMessage> ; \TranslateMessage
00401106 |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C]
00401109 |. 50 |PUSH EAX ; /pMsg
0040110A |. E8 43000000 |CALL <JMP.&USER32.DispatchMessageA> ; \DispatchMessageA
0040110F |.^EB D9 \JMP SHORT basic.004010EA

This is the code for the practice example I did with the cmp instruction. Notice that at address 004010F9 it uses the cmp
instruction. There is also another jmp short at 00401110. This could be where the code may be slower, i.e. if a short jump takes extra clock cycles. As far as I can see, the cost of the error checking is taking this short jmp. Because it sits inside a loop, though, that cost could add up to a lot of extra time as the program runs.

004010EA |> 6A 00 /PUSH 0 ; /MsgFilterMax = 0
004010EC |. 6A 00 |PUSH 0 ; |MsgFilterMin = 0
004010EE |. 6A 00 |PUSH 0 ; |hWnd = NULL
004010F0 |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C] ; |
004010F3 |. 50 |PUSH EAX ; |pMsg
004010F4 |. E8 8B000000 |CALL <JMP.&USER32.GetMessageA> ; \GetMessageA
004010F9 |. 83F8 00 |CMP EAX,0
004010FC |. 7E 14 |JLE SHORT a_practi.00401112
004010FE |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C]
00401101 |. 50 |PUSH EAX ; /pMsg
00401102 |. E8 A7000000 |CALL <JMP.&USER32.TranslateMessage> ; \TranslateMessage
00401107 |. 8D45 B4 |LEA EAX,DWORD PTR SS:[EBP-4C]
0040110A |. 50 |PUSH EAX ; /pMsg
0040110B |. E8 6E000000 |CALL <JMP.&USER32.DispatchMessageA> ; \DispatchMessageA
00401110 |. EB 17 |JMP SHORT a_practi.00401129
00401112 |> 74 13 |JE SHORT a_practi.00401127
00401114 |. 6A 00 |PUSH 0 ; /Style = MB_OK|MB_APPLMODAL
00401116 |. 68 34304000 |PUSH a_practi.00403034 ; |Title = "No Window Handle"
0040111B |. 68 1C304000 |PUSH a_practi.0040301C ; |Text = "Incorrect Parent Handle"
00401120 |. 6A 00 |PUSH 0 ; |hOwner = NULL
00401122 |. E8 6F000000 |CALL <JMP.&USER32.MessageBoxA> ; \MessageBoxA
00401127 |> EB 05 |JMP SHORT a_practi.0040112E
00401129 |> 8B45 BC |MOV EAX,DWORD PTR SS:[EBP-44]
0040112C |.^EB BC \JMP SHORT a_practi.004010EA
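
For anyone comparing the two listings: they do not only differ in OR versus CMP, they also branch differently. GetMessage returns 0 for WM_QUIT and -1 on error, so OR EAX,EAX / JE leaves the loop only on 0, while CMP EAX,0 / JLE followed by the JE at 00401112 separates 0 from negative values and shows the MessageBox for the latter. Below is a minimal C sketch of that branch logic only. The enum and function names are made up for illustration and are not part of either program.

```c
/* What the loop body does with the value GetMessage left in EAX.
   These names are hypothetical, chosen only for this sketch. */
enum action { DISPATCH, QUIT, REPORT_ERROR };

/* Mirrors OR EAX,EAX / JE: only a zero return leaves the loop,
   so a -1 error return falls through to Translate/Dispatch. */
static enum action classify_or_je(int ret)
{
    return (ret == 0) ? QUIT : DISPATCH;
}

/* Mirrors CMP EAX,0 / JLE, then JE on the same flags:
   zero quits quietly, negative values take the MessageBox path. */
static enum action classify_cmp_jle(int ret)
{
    if (ret <= 0) {
        return (ret == 0) ? QUIT : REPORT_ERROR;
    }
    return DISPATCH;
}
```

So the extra instructions in the second listing are the error path, and they are only reached when GetMessage returns 0 or a negative value. The one extra instruction on the normal path is the JMP SHORT at 00401110.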
Posted on 2003-12-01 03:56:48 by Phase Verocity
From what I remember, all a jmp instruction does is put an address into the IP register. When instructions are run, the CPU always goes to the next instruction at the address in the IP register. Because of this, I cannot see it taking more clock cycles than the jmp instruction itself. So I guess what it comes down to is finding out how many cycles the jmp instruction uses. I can't imagine it is many, seeing as it is one of the most commonly used instructions.
Posted on 2003-12-01 04:08:11 by Phase Verocity
As for which is faster, I don't know, but as for which is shorter: or eax, eax is shorter than cmp eax, 0.
To learn a little about timing, you can search for "profiler" on this board.

I use NASM, and ALINK as the linker; for alternatives see atalink and others.


Yes, I don't register any class; I only call the procedure inside my app. In the first example it does an aligned subtraction from ESP (this is something like reserving space for local variables) that is a multiple of 4. In the second example I do the same, but the value subtracted is not a multiple of 4. If you run the examples, the two are otherwise in fact equal.

The only real difference is the number I subtract from ESP: in the second example it is 2E hex, or 46 decimal, and 46/4 = 11.5. That misalignment causes something I consider strange.
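
To make the arithmetic concrete, here is a small C sketch that simulates the subtraction from ESP and checks the low two bits of the result. The starting ESP value and the aligned size 0x2C used in the usage note are assumptions picked for illustration; only 0x2E comes from the example above.

```c
#include <stdint.h>

/* Simulates `sub esp, locals` on a 32-bit stack pointer value. */
static uint32_t reserve_locals(uint32_t esp, uint32_t locals)
{
    return esp - locals;
}

/* Bytes past the previous 4-byte boundary; 0 means DWORD-aligned. */
static uint32_t misalignment4(uint32_t esp)
{
    return esp & 3u;
}

/* Rounds a locals size up to the next multiple of 4. */
static uint32_t round_up4(uint32_t size)
{
    return (size + 3u) & ~3u;
}
```

Starting from a hypothetical aligned ESP of 0x0012FFC0, reserving 0x2C bytes keeps the low two bits at 0, while reserving 0x2E leaves ESP two bytes off a DWORD boundary. Rounding 0x2E up with round_up4 gives 0x30, which would keep the stack aligned.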


Have a nice day or night.
Posted on 2003-12-02 14:01:01 by rea
hgb,
As for which is faster, I don't know, but as for which is shorter: or eax, eax is shorter than cmp eax, 0.

OR EAX,EAX is a two-byte instruction, but it has to write the result back into EAX. That will probably make it take longer. CMP EAX,0 is a three-byte instruction with no write-back. CMP EAX,EBP, where EBP=0, is a two-byter with no write-back. TEST EAX,EAX is a two-byter with no write-back. You make the judgement. Ratch
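
As a side note on why these three are interchangeable before a conditional jump, the C sketch below models the ZF, SF, CF and OF flags each one leaves behind. It is only a simplified model written for this thread (the struct and function names are invented, and AF and PF are ignored), not a full description of the CPU.

```c
#include <stdint.h>

/* The status flags that conditional jumps such as JE and JLE read. */
struct flags { int zf, sf, cf, of; };

/* CMP EAX,0 computes EAX - 0: no borrow and no signed overflow. */
static struct flags flags_after_cmp_zero(uint32_t eax)
{
    uint32_t result = eax - 0u;
    struct flags f = { result == 0u, (int)(result >> 31), 0, 0 };
    return f;
}

/* OR EAX,EAX and TEST EAX,EAX both clear CF and OF and set
   ZF and SF from the value of EAX, which is left unchanged. */
static struct flags flags_after_or_self(uint32_t eax)
{
    uint32_t result = eax | eax;
    struct flags f = { result == 0u, (int)(result >> 31), 0, 0 };
    return f;
}

/* Returns 1 when both flag sets would steer every jump identically. */
static int same_flags(struct flags a, struct flags b)
{
    return a.zf == b.zf && a.sf == b.sf && a.cf == b.cf && a.of == b.of;
}
```

In this model the flags match for every value of EAX, so the choice between the three comes down to encoding size and the write-back mentioned above.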
Posted on 2003-12-02 21:21:39 by Ratch

Unaligned data only costs more clock cycles, but what about the stack? The stack itself can handle unaligned operations, so why can't Windows? Is it a bug? A feature? A what? Is E.T. in my computer?

There's probably some code that, for some reason (probably efficiency?), depends on the stack being 4-byte aligned. I'd say it's a reasonable assumption to make under a 32-bit OS :)
Posted on 2003-12-03 03:59:17 by f0dder