I'd like to "re-heat" this good old question, even if it possibly cannot be answered 100 percent exactly ...

I found some threads explaining that it is still worth using "xor eax, eax" instead of "mov eax, 0".

My question: is it also worth using "xor eax, eax" and "push eax" instead of "push 0"? It seems that the opcode for push 0 is short enough already.
Posted on 2004-01-05 08:13:12 by beaster
xor eax, eax is 2 bytes, while mov eax,0 is 5 bytes. In terms of size, it is an optimisation. :grin:

But push 0 is 2 bytes and push reg is 1 byte. Therefore it would only be considered a size optimisation if you need to push 0 onto the stack more than 2 times (with exactly 2 pushes the two versions are the same size).
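
For reference, a rough sketch of the byte counts being compared, assuming 32-bit code and that eax is free to clobber. The encodings in the comments are the usual ones (some assemblers emit 31 C0 rather than 33 C0 for the xor; the size is the same):

```asm
; zeroing a register
mov  eax, 0          ; B8 00 00 00 00   5 bytes
xor  eax, eax        ; 33 C0            2 bytes

; three zeros pushed as immediates: 3 * 2 = 6 bytes
push 0               ; 6A 00            2 bytes
push 0               ; 6A 00
push 0               ; 6A 00

; three zeros pushed through a register: 2 + 3 * 1 = 5 bytes
xor  eax, eax        ; 33 C0            2 bytes
push eax             ; 50               1 byte
push eax             ; 50
push eax             ; 50
```

With only two pushes both variants come to 4 bytes, so the register version starts to win at the third push.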
Posted on 2004-01-05 08:23:55 by roticv
I think for older CPUs, xor eax,eax is better, but for Athlon or later I think I read that it's better to use an immediate for mov and push, because otherwise you'll be messing with the status word, and this will take more time (to lock the status word, write, ...)
Posted on 2004-01-05 08:24:28 by Ultrano
I am expecting to get a COM+ interface to the status word soon enough :tongue:

Soon "modern" CPUs are expected to:
- never check for flags/results of arithmetic operations
- never jump or call or return
- never call / return on conditions (everybody knows that is 8-bit crap :D since the Z80 has it and the P4 does not)
- never shift bits left/right, only up/down :P
- never read video memory
- use very long and nice looking instructions that take forever to write like:
- use very simple and easy to understand mnemonics like: PKUSWTRNKGDWR XMM17 for "advanced" features
- do not use FPU and real numbers (integer rocks but only on 64+bits)
- do not increment as this is a C++ feature
and finally:
- never read/write memory
- do not use registers
- actually never do anything, but do this very fast

Basically be designed by "born yesterday" young lawyers
Posted on 2004-01-05 09:29:42 by BogdanOntanu
Never heard of a Status Word. Perhaps you mean the System Flags Register (EFLAGS). MOVes and PUSHes do not change the flags. Ratch
Posted on 2004-01-05 09:31:57 by Ratch
xor eax, eax + multiple pushes is a size optimization, but I doubt many people really feel like doing this by hand, except when working under extreme conditions (<=4k intros might be such a case, but I doubt people would bother for 64k... hell, afaik fr08 was coded in C/C++).

While some of the older tricks are bad on newer CPUs (something like using 2*<inc reg> instead of <add reg,2>), iirc the xor trick is still fine on P4... but consult the P4 optimization guide or ask BitRAKE ^_^

When coding asm, I still tend to use the XOR when zeroing a reg, but I don't bother doing silly micro-optimizations globally. When coding for speed, I dust off the P4 optimization guide (hell, I have other things to memorize than a bunch of optimization rules ;)), and play around.

I don't really care about Athlon optimization - it's good enough at running generic code anyway, while the P4 benefits pretty majorly from "proper" code - especially SSE/SSE2 code :)
Posted on 2004-01-05 10:09:40 by f0dder
MOVes and PUSHes do not change the flags

Yes, that's the purpose of not using xor eax,eax. Anyway, this can save only half a cycle I guess, and well, as f0dder says, it's silly.
"the flags" /me slaps on the forehead lol too much FPU for me :grin:
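
To illustrate the flags point, a small sketch (the register choice is arbitrary): xor rewrites the arithmetic flags while mov leaves them alone, so only mov is safe between a compare and the instruction that consumes its result.

```asm
; works: mov does not touch EFLAGS
cmp  ecx, edx        ; CF = 1 if ecx < edx (unsigned)
mov  eax, 0          ; flags from the cmp survive
setb al              ; eax = 1 if ecx < edx, else 0

; broken: xor clears CF (and sets ZF) before setb reads it
cmp  ecx, edx
xor  eax, eax        ; result of the cmp is lost here
setb al              ; al = 0 regardless of the compare
```

If the xor form is wanted anyway, the zeroing can simply be moved in front of the cmp.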
Posted on 2004-01-05 11:30:09 by Ultrano
I'm not an optimization nut, and don't know all too much about the "rules", but whenever I can, I will try to compute a register's value instead of setting the register to a constant, if it's within a few bytes.

xor eax, eax
dec eax

instead of:

mov eax, 0FFFFFFFFh

As well, for long routines, I will sometimes set aside a register as the "Zero" reg (since nulls come up a lot with Win32 APIs) and use this register instead.
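
A sketch of the zero-register idea, assuming MASM-style syntax, 32-bit stdcall, and a hypothetical szFileName string defined elsewhere. ebx is chosen here because Win32 API calls preserve it (along with esi, edi and ebp):

```asm
xor  ebx, ebx              ; ebx = 0 for the rest of the routine
                           ; (save/restore ebx if your own caller expects it preserved)

; CreateFileA(szFileName, GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL)
; arguments are pushed right to left
push ebx                   ; hTemplateFile         = NULL
push ebx                   ; dwFlagsAndAttributes  = 0
push 3                     ; dwCreationDisposition = OPEN_EXISTING
push ebx                   ; lpSecurityAttributes  = NULL
push ebx                   ; dwShareMode           = 0
push 80000000h             ; dwDesiredAccess       = GENERIC_READ
push offset szFileName     ; lpFileName (hypothetical label)
call CreateFileA           ; handle comes back in eax, ebx is still 0
```

Each push ebx is 1 byte where push 0 would be 2, and the register stays zero across the call, so later calls in the same routine can reuse it.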

Just 2 Cents.... ;)

Posted on 2004-01-05 23:16:25 by NaN
NaN, as far as I have understood, that is one of the 'bad' things to do, speedwise. "or eax, -1" (the 1-byte immediate version) is the same size, and probably better. Oh well. *shuts my mouth and waits for bitrake*
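
For comparison, the three ways of loading -1 mentioned so far, with their sizes assuming 32-bit code (this only shows size; which one is faster on a given CPU would have to be measured):

```asm
mov  eax, 0FFFFFFFFh   ; B8 FF FF FF FF   5 bytes

xor  eax, eax          ; 33 C0            2 bytes
dec  eax               ; 48               1 byte, 3 bytes for the pair

or   eax, -1           ; 83 C8 FF         3 bytes, sign-extended 8-bit immediate
```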
Posted on 2004-01-06 00:37:23 by f0dder