Here is the code to multiply a number by 10

mul10_version1:
mov eax, 2 ; Number to multiply

mov ecx, 10 ; bytes: b9 0a 00 00 00
mul ecx ; bytes: f7 e1

Is the code below smaller and faster, or am I wrong?

mul10_version2:
mov eax, 2 ; Number to multiply

shl eax, 1 ; eax = eax*2 bytes: d1 e0
lea eax, [eax+eax*4] ; eax = eax*5 bytes: 8d 04 80
; => eax = 20

Does version 2 use the U and V pipelines correctly, with shl in U and lea in V? If so, does that mean version 2 costs 1 cycle? (shl r32,imm = 1 cycle and lea = 1 cycle, so shl in U plus lea in V = 1 cycle.) Sorry for the question, but I am a newbie with the U and V pipelines...

thanks

Is there a smaller or faster solution?
Posted on 2003-02-01 22:44:06 by DarkEmpire
I remember reading somewhere, maybe in Agner Fog's optimization docs, that add eax, eax is better than shl eax, 1. Don't ask me why.

Bye,
Posted on 2003-02-02 14:01:34 by El_Choni
It will most likely be faster than using mul, but the two instructions won't pair. The second instruction depends on the result of the first one, so there's no way you can execute them at the same time.

Thomas
Posted on 2003-02-02 14:25:34 by Thomas

The code is faster than mul.
But of course, it cannot be done in 1 cycle because
the second instruction depends on the first.

As for size, both shl eax,1 and add eax,eax are 2-byte
instructions.
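
For what it's worth, the suggestions in this thread can be put together. This is only a rough sketch and has not been timed: the label mul10_version3 is made up for illustration, and the byte values are the usual 32-bit encodings (some assemblers emit 03 c0 for the add). The lea goes first and add eax, eax takes the place of the shl.

```asm
mul10_version3:
mov eax, 2 ; Number to multiply

lea eax, [eax+eax*4] ; eax = eax*5 bytes: 8d 04 80
add eax, eax ; eax = eax*2 bytes: 01 c0
; => eax = 20
```

The add still depends on the result of the lea, so these two cannot pair either. The multiply itself is 5 bytes here, against 7 for the mov ecx, 10 / mul ecx pair, and unlike mul it leaves ecx and edx untouched.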