Hello all, my first post here so please be gentle :)

I'm having trouble encoding the MOV instruction properly. It seems that when EBP is used as a base in an effective address, things go a little crazy. The manuals on sandpile.org are a little confusing to say the least, and have "#1" in the place I would expect EBP to be (on the SIB byte page).

I tried assembling with MASM and then disassembling to check out the opcodes, but it still doesn't make sense. It seems that something totally different comes out.


mov ebx,[ebp+ecx*2] ;8B5C4D00

But surely 5Ch is Reg+DWord with EDI as the destination? And what's that extra 00h for?

When I'm encoding other effective addresses that don't use EBP as the base, it comes out fine - so this must be a special condition (the sandpile pages would agree with me).

Thanks for any help,
Will.
Posted on 2004-07-15 06:22:04 by
Forgive me, it has been a long time, so...

The first thing to do is look at 4Dh in binary. It is 01001101b, which means that the scale is 2, the index is ecx and the base is ebp. But if mod = 01b and the base is ebp, then it is encoded as ebp + signed byte.

Since when is 5Ch reg+dword? It is only reg+dword if mod is 10b, but here mod is 01b (5Ch = 01011100b).
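
To make the bit splitting concrete, here is a minimal Python sketch (not part of the original posts) that pulls the 2-3-3 bit fields out of the ModR/M byte 5Ch and the SIB byte 4Dh. The helper name `split_fields` and the `REGS` list are made up for illustration; the register numbering is the standard 32-bit one.

```python
# Register numbering used by the ModR/M and SIB fields (32-bit names).
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_fields(byte):
    """Split a ModR/M or SIB byte into its 2-bit, 3-bit and 3-bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

mod, reg, rm = split_fields(0x5C)     # ModR/M byte from 8B 5C 4D 00
ss, index, base = split_fields(0x4D)  # SIB byte from 8B 5C 4D 00

print(f"ModR/M 5Ch: mod={mod:02b} reg={reg:03b} ({REGS[reg]}) r/m={rm:03b}")
print(f"SIB    4Dh: scale={1 << ss} index={REGS[index]} base={REGS[base]}")
```

With mod = 01 the destination comes out as ebx, not edi, and r/m = 100 says that a SIB byte follows.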
Posted on 2004-07-15 07:55:59 by roticv
Hi ChiefRB,

Ignoring prefix bytes, an instruction is encoded in at most 5 fields: opcode, ModR/M, SIB, displacement and immediate. The first 3 fields are bytes (for this MOV the opcode is a single byte); the last 2 fields are bytes, words or doublewords. As for the instruction
mov ebx,[ebp+ecx*2]

the encoded form is the following:


Opcode   | ModR/M      | SIB           | Displacement
=========|=============|===============|=============
1000101w | mod reg r/m | ss index base | xxxxxxxx
---------|-------------|---------------|-------------
10001011 | 01 011 100  | 01 001 101    | 00000000

w: 1 - Word/DoubleWord operation
mod: 01 - EBP+disp8, the opcode contains also a displacement byte
(it's not possible to use only EBP)
reg: 011 - BX/EBX register
r/m: 100 - [--][--], the ModR/M is followed by a SIB byte
ss: 01 - scale the index by 2
index: 001 - index is ECX register
base: 101 - base is EBP register
and at the end the disp8 byte with 0, because there is no displacement.
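
As a cross-check of the field values listed above, a small Python sketch (not from the original post; the helper name `pack_233` is made up) can pack them back into bytes and reproduce the encoding MASM produced:

```python
def pack_233(top, mid, low):
    """Pack a 2-bit, 3-bit and 3-bit field into one byte (ModR/M or SIB layout)."""
    return (top << 6) | (mid << 3) | low

opcode = 0b10001011                   # 1000101w with w = 1
modrm = pack_233(0b01, 0b011, 0b100)  # mod=01, reg=EBX, r/m=100 (SIB follows)
sib = pack_233(0b01, 0b001, 0b101)    # ss=01 (scale 2), index=ECX, base=EBP
disp8 = 0x00                          # the "dummy" zero displacement

encoded = bytes([opcode, modrm, sib, disp8])
print(encoded.hex().upper())          # prints 8B5C4D00
```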
Posted on 2004-07-15 08:03:21 by bszente
roticv - Ah yes, I missed the leading 0 on there. Thanks :)

bszente - Thanks for the explanation!

I'm actually encoding the instruction myself, and with registers other than EBP it works fine. This part:
mod: 01 - EBP+disp8, the opcode contains also a displacement byte
(it's not possible to use only EBP)
is I suppose what my question was - why isn't it possible to just use EBP? Why must we have a single displacement byte?

The more I look into MOV the more of these odd little exceptions to the rule I find. For example:


mov eax, [addr]
has its own opcode (I would guess for speed, since this is a common operation). My routine to encode MOV is becoming gargantuan :)
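
To illustrate the two forms, here is a rough Python sketch (not from the original post) that builds both encodings of mov eax, [addr] in 32-bit mode: the dedicated accumulator opcode A1 and the general 8B /r form with a ModR/M byte of 05. The function names and the address value are hypothetical.

```python
import struct

def mov_eax_moffs(addr):
    """Short accumulator form: opcode A1 followed by the 32-bit address."""
    return bytes([0xA1]) + struct.pack("<I", addr)

def mov_eax_modrm(addr):
    """General form: opcode 8B, ModR/M 05 (mod=00, reg=EAX, r/m=101 = disp32 only)."""
    return bytes([0x8B, 0x05]) + struct.pack("<I", addr)

addr = 0x00403000  # hypothetical address, for illustration only
print(mov_eax_moffs(addr).hex().upper(), len(mov_eax_moffs(addr)), "bytes")
print(mov_eax_modrm(addr).hex().upper(), len(mov_eax_modrm(addr)), "bytes")
```

Both byte sequences load the same dword; the A1 form is one byte shorter.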
Posted on 2004-07-15 08:16:09 by
It is just like how xchg eax, edx can be encoded as either 2 bytes or 1 byte.
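
A tiny Python sketch (not from the original post; the variable names are made up) spells out the two xchg encodings: the one-byte 90+r accumulator form and the two-byte 87 /r form with a register-to-register ModR/M byte.

```python
EDX = 2  # register number of EDX

short_form = bytes([0x90 + EDX])         # 90+r: XCHG EAX, r32
long_form = bytes([0x87, 0b11_010_000])  # 87 /r: mod=11, reg=EDX, r/m=EAX

print(short_form.hex().upper())
print(long_form.hex().upper())
```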
Posted on 2004-07-15 09:18:46 by roticv
The only way to use the EBP register as base in scaled-index-plus-base addressing is to add that "dummy" 0 displacement.
In fact only the form MOV register, [EBP+index*scale+displacement] can be encoded, and the displacement is 0 in your case. The explanation is the following:

1. The ModR/M byte's reg field should be 011 for EBX. This is ok.
2. The ModR/M byte's r/m field should be 100 to allow the use of the SIB byte, that is for the scaled-index-based addressing, what you are using.
3. The SIB byte's three fields are simple: ss is 01 (scale 2) and index is 001 (ECX), but if you take a look at the SIB table you will see that there isn't a value dedicated to EBP as base. Instead you will see the #1 column with value 101. The notes at the bottom say that to use EBP as base, the ModR/M byte's mod field must be 01 or 10. With mod 00, base 101 means there is no base register at all, only a direct DWord displacement. You wouldn't use 10 either, because then you would have to emit four zero bytes (a DWord) instead of a single zero byte. That's the explanation of the "useless" 0 byte.
4. So the ModR/M byte's mod field should be 01, followed by the "dummy" 0 byte. With the other base registers you can use 00 for the mod field, and in that case you don't have to emit any displacement bytes.
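
The four steps above can be sketched as a small Python encoder. This is only an illustration under stated assumptions, not a full MOV encoder: it handles just the 32-bit form mov dest, [base + index*scale + disp] with a SIB byte, and the names `REG`, `SCALE` and `encode_mov_load` are made up for this example.

```python
import struct

REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}

def encode_mov_load(dest, base, index, scale, disp=0):
    """Encode mov dest, [base + index*scale + disp] using the SIB form."""
    if index == "esp":
        raise ValueError("ESP cannot be used as an index register")
    if disp == 0 and base != "ebp":
        # mod=00: no displacement bytes (not available with EBP as base)
        mod, disp_bytes = 0b00, b""
    elif -128 <= disp <= 127:
        # mod=01: one signed displacement byte; EBP with disp 0 lands here
        mod, disp_bytes = 0b01, struct.pack("<b", disp)
    else:
        # mod=10: four displacement bytes
        mod, disp_bytes = 0b10, struct.pack("<i", disp)
    modrm = (mod << 6) | (REG[dest] << 3) | 0b100  # r/m=100: SIB follows
    sib = (SCALE[scale] << 6) | (REG[index] << 3) | REG[base]
    return bytes([0x8B, modrm, sib]) + disp_bytes

print(encode_mov_load("ebx", "ebp", "ecx", 2).hex().upper())  # prints 8B5C4D00
print(encode_mov_load("ebx", "eax", "ecx", 2).hex().upper())  # prints 8B1C48
```

The EBP case falls through to the disp8 branch, which is exactly where the extra 00 byte comes from; with EAX as base the same instruction is one byte shorter.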

And of course, as you said, there are several instructions that have a shorter form when they use the accumulator register (EAX). That's why it is recommended to use the accumulator as much as possible.
Posted on 2004-07-15 10:28:53 by bszente
That makes sense bszente, thanks again. Back to encoding! :)
Posted on 2004-07-15 10:32:31 by