Which is faster: mov or push / pop?

For example:

mov eax, 4 => opcode : b8 04 00 00 00 => 5 bytes

push 4 => opcode : 6a 04
pop eax => opcode : 58
=> 3 bytes
On the other hand:

mov eax, ebx => opcode : 8b c3 => 2 bytes

push ebx => opcode : 53
pop eax => opcode : 58
=> 2 bytes

So for the same result, eax = 4, it is 5 bytes with mov and 3 bytes with push / pop,
and eax = ebx is 2 bytes whichever solution is used...

So the question is: which is faster, mov or push / pop?

Another question: I am a newbie at optimization, so where can I find documents, websites, etc. that show all (or at least a lot) of the possible optimizations?


PS: Sorry for my English
Posted on 2003-01-22 16:55:38 by DarkEmpire
Mov is certainly faster.

Check out Agner Fog's site
Posted on 2003-01-22 17:02:51 by Knightmare
Do not assume that one opcode combination, taken in isolation, is automatically faster than another. It is very hardware dependent and it depends on the surrounding code sequence.

Pipeline considerations, cache considerations, instruction loop length, branching, etc. all need to be taken into account. If you are writing an algorithm where speed matters, be prepared to try alternating between different opcode combinations and benchmark them to tell the difference.

PUSH/POP has the advantage of not tying up a register, which often helps in an algorithm that runs faster with an extra variable held directly in a register.

Directly after an instruction that cannot be paired, it just does not matter, and you can often fill the hole with another instruction to reduce the loss.

Try to think of code design this way: instead of looking at loop code in isolation,

; looped code here
dec ecx
jnz label

Think of it as an EIP sequence,

; looped code here
dec ecx
jnz label
; looped code here
dec ecx
jnz label
; looped code here
dec ecx
jnz label

etc ....

This is what the processor sees and this is what you have to get through the dual pipelines of later processors.


Posted on 2003-01-22 17:14:16 by hutch--
Wasn't he talking about using a push and a pop to only move a register to another?
Posted on 2003-01-22 17:19:58 by Knightmare
"Push register
A push register instruction generates 3 µops. The first one (port 4) is a store instruction, reading the register. The second µop (port 3) generates the address, reading the stack pointer. The third µop (port 0 or 1) subtracts the word size from the stack pointer, reading and modifying the stack pointer.

Pop register
A pop register instruction generates 2 µops. The first µop (port 2) loads the value, reading the stack pointer and writing to the register. The second µop (port 0 or 1) adjusts the stack pointer, reading and modifying the stack pointer.

If you have consecutive POP instructions then you may break them up to reduce the number of µops:
POP ECX / POP EBX / POP EAX ; can be changed to:
MOV ECX,[ESP] / MOV EBX,[ESP+4] / MOV EAX,[ESP+8] / ADD ESP,12

The former code generates 6 µops, the latter generates only 4 and decodes faster. Doing the same with PUSH instructions
is less advantageous because the split-up code is likely to generate register read stalls unless you have other instructions
to put in between or the registers have been renamed recently. Doing it with CALL and RET instructions will interfere with
prediction in the return stack buffer. Note also that the ADD ESP instruction can cause an AGI stall in earlier processors." by A.Fog

mov [esi+4], eax ; 2 micro-ops
mov eax, [esi+4] ; 1 micro-op

push eax ; 3 micro-ops
pop eax ; 2 micro-ops

"29. List of instruction timings and micro-op breakdown for PPro, PII and PIII
Table 29.1 Integer instructions" by A.Fog
Posted on 2003-01-22 22:12:39 by lingo12
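
To tie the byte counts from the first post to the micro-op counts quoted above, here is a small tally for the eax = ebx case. It is only a rough sketch: the byte sizes come from the opcode listings in the question, the push and pop micro-op counts come from the A. Fog excerpt, and the figure of 1 micro-op for a register-to-register mov is an assumption taken from the same PPro/PII/PIII timing table, not something quoted in this thread. The names COST and total are made up for the illustration.

```python
# Per-instruction figures for PPro / PII / PIII.
# "bytes": from the opcode listings in the first post.
# "uops":  push/pop from the A. Fog excerpt; the 1 micro-op for
#          mov reg, reg is an ASSUMPTION (same timing table, not quoted here).
COST = {
    "mov eax, ebx": {"bytes": 2, "uops": 1},
    "push ebx":     {"bytes": 1, "uops": 3},
    "pop eax":      {"bytes": 1, "uops": 2},
}


def total(sequence):
    """Return (encoded size in bytes, micro-op count) for a list of instructions."""
    size = sum(COST[instr]["bytes"] for instr in sequence)
    uops = sum(COST[instr]["uops"] for instr in sequence)
    return size, uops


mov_cost = total(["mov eax, ebx"])
push_pop_cost = total(["push ebx", "pop eax"])

print("mov eax, ebx       (bytes, uops):", mov_cost)
print("push ebx / pop eax (bytes, uops):", push_pop_cost)
```

On these figures both sequences are the same size, but the push / pop pair costs several times as many micro-ops and also goes through memory. A micro-op count is only a proxy, so as said earlier in the thread, the real answer for a given processor still comes from benchmarking the code in context.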