Addressing Modes
Building memory addresses
We often work with blocks of data spanning several memory addresses. For example, a dword is stored in memory as four consecutive bytes. We pick one of the addresses as a reference point - the base address. This is usually the lowest address of the data block. (For an example where this is not true, see the stack frame in The Stack.)
If we have a large data block (say a 20-byte data structure), but only need to look at a few bytes of data embedded in it, then we can locate the data by adding an offset or a displacement to the base address. Thus the calculated address (or effective address) is the sum of a base address and a displacement.
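The base-plus-displacement idea can be sketched outside assembly. The following Python model is only an illustration: the memory array, the addresses 0x1000 and 0x1008, and the displacement of 12 are all made-up values, not taken from any real program.

```python
import struct

# Simulated memory. All addresses below are hypothetical illustration values.
MEMORY_START = 0x1000            # address of the first simulated byte
memory = bytearray(64)

BASE = 0x1008                    # base address of a 20-byte data structure
FIELD_DISPLACEMENT = 12          # the dword we want sits 12 bytes into it


def store_dword(address, value):
    """Write a 32-bit little-endian value at an absolute address."""
    struct.pack_into("<I", memory, address - MEMORY_START, value)


def load_dword(address):
    """Read a 32-bit little-endian value from an absolute address."""
    return struct.unpack_from("<I", memory, address - MEMORY_START)[0]


# Effective address = base address + displacement.
effective_address = BASE + FIELD_DISPLACEMENT
store_dword(effective_address, 0xDEADBEEF)
field_value = load_dword(BASE + FIELD_DISPLACEMENT)
```

The dword occupies four consecutive bytes starting at the effective address, lowest byte first, which is why the helpers use the little-endian "<I" format.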
Varying the address
Both the base address and displacement can be constant. The assembler will combine these two into a single value, the direct address.
Either the base or the displacement (or both) can be varied. We do this on the x86 by loading a register with the variable part.
A register loaded with a base address acts as a base register. This is the basis for indirect and based addressing.
A register loaded with a displacement is often called an index register. This is the basis for array or indexed addressing.
Scaling the index
In a high-level language (HLL), an array index with a value n is used to access the n-th array item (or element). At the machine level, we need to convert this index into a displacement. We do this by multiplying the index by the item size. (See All About Arrays for added details.) This computation is called scaling. The x86 has the built-in ability to scale the value of one register (by 1, 2, 4, or 8) when computing the effective address.
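As a rough sketch of scaling, the Python functions below turn an array index into a displacement and then into an element address. The function names and the example base address are hypothetical; the only fact taken from the text is that the hardware scale factor is limited to 1, 2, 4, or 8.

```python
ALLOWED_SCALES = (1, 2, 4, 8)    # scale factors the x86 can apply by itself


def scaled_displacement(index, item_size):
    """Convert an array index into a byte displacement (index * item size)."""
    if item_size not in ALLOWED_SCALES:
        # Other item sizes need an explicit multiply before addressing.
        raise ValueError("item size cannot be used as a hardware scale factor")
    return index * item_size


def element_address(base, index, item_size):
    """Address of array[index] for an array starting at `base`."""
    return base + scaled_displacement(index, item_size)


# Item 3 of a dword array at the hypothetical address 0x2000.
dword_item_address = element_address(0x2000, 3, 4)
```

This corresponds to what an operand such as dword_data[ecx*4] asks the CPU to do, with ecx holding the index.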
The CPU doesn't care
The x86 is capable of adding together three numbers (one constant, two variable) to create an address. The CPU doesn't care which number is the base address. It only cares that the final value is a valid address. In the case of the LEA instruction, the address doesn't need to be valid at all, because LEA only computes the address and never accesses memory. This is why you may see code that performs non-address arithmetic with the LEA instruction.
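To make the three-number sum concrete, here is a small Python model of the 32-bit effective address calculation. It is a sketch, not a CPU emulator: the function names are invented for illustration, and the result is wrapped to 32 bits on the assumption that we are in 32-bit mode.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base=0, index=0, scale=1, disp=0):
    """Model of base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32


def times_five(x):
    """What `lea eax,[eax+eax*4]` computes: x + x*4, no memory access."""
    return effective_address(base=x, index=x, scale=4)


# With a scale of 1 the two registers are interchangeable: the CPU
# cannot tell which one we think of as the "base".
same = effective_address(base=10, index=20) == effective_address(base=20, index=10)
```

The times_five function shows the LEA trick mentioned above: the "address" is just the number 5*x, and it never has to point at anything.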
Doing it in assembly
Constant (static) base only -- this is also known as "direct addressing"
MASM: mov eax,dword_data
FASM: mov eax,[dword_data]
HLA: mov( dword_data, eax );
Constant (static) base + constant displacement -- this is also "direct addressing"
MASM: mov eax,dword_data+4
FASM: mov eax,[dword_data+4]
HLA: mov( dword_data[4], eax );
Constant (static) base + scaled index
MASM: mov eax,dword_data[ecx*4]
FASM: mov eax,[dword_data+ecx*4]
HLA: mov( dword_data[ecx*4], eax );
Constant (static) base + double indexing
MASM: mov eax,dword_data[ebx+ecx*4]
FASM: mov eax,[dword_data+ebx+ecx*4]
HLA: mov( dword_data[ebx+ecx*4], eax );
Variable base only -- this is also known as "indirect addressing"
MASM/FASM: mov eax,[ebx]
HLA: mov( [ebx], eax );
Variable base + constant displacement -- this is also known as "based addressing"
MASM: mov eax,4[ebx] or mov eax,[ebx+4]
FASM: mov eax,[ebx+4]
HLA: mov( [ebx+4], eax );
Variable base + scaled index
MASM/FASM: mov eax,[ebx+ecx*4]
HLA: mov( [ebx+ecx*4], eax );
Variable base + scaled index + constant displacement
MASM: mov eax,24[ebx+ecx*4] or mov eax,[ebx+ecx*4+24]
FASM: mov eax,[ebx+ecx*4+24]
HLA: mov( [ebx+ecx*4+24], eax );
Consider two ways to load the dword at address ebx*4 into eax:
A.
xor eax, eax
mov eax, [ebx*4+eax]
or
B.
mov eax, [ebx*4]
Most people expect A to be the longer of the two, but it is not. A is 5 bytes (xor eax, eax = 33C0; mov eax, [ebx*4+eax] = 8B0498), while B is 7 bytes (mov eax, [ebx*4] = 8B049D 00000000). The reason lies in the SIB byte: the only encoding that uses a scaled index with no base register also requires a 4-byte displacement, so B has to carry four zero bytes. To understand this fully, you need to study the instruction encoding format (the ModR/M and SIB bytes).
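The byte counts can be checked with a small encoder sketch in Python. This is a deliberately limited illustration, not an assembler: the function name is hypothetical, it handles only `mov r32, [base + index*scale]` with opcode 8B and a SIB byte, and it refuses the special cases (esp as index, ebp as base) that need extra handling.

```python
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}


def encode_mov_load_sib(dest, index, scale, base=None):
    """Encode `mov dest, [base + index*scale]` using a SIB byte."""
    if index == "esp":
        raise ValueError("esp cannot be used as an index register")
    # ModR/M: mod=00, reg=dest, r/m=100 (meaning a SIB byte follows).
    modrm = (0b00 << 6) | (REG[dest] << 3) | 0b100
    scale_index = (SCALE_BITS[scale] << 6) | (REG[index] << 3)
    if base is None:
        # base field 101 with mod=00 means "no base, disp32 follows".
        sib = scale_index | 0b101
        return bytes([0x8B, modrm, sib]) + (0).to_bytes(4, "little")
    if base == "ebp":
        raise ValueError("ebp as base needs a displacement; not covered here")
    sib = scale_index | REG[base]
    return bytes([0x8B, modrm, sib])


XOR_EAX_EAX = bytes([0x33, 0xC0])    # encoding quoted in the text

variant_a = XOR_EAX_EAX + encode_mov_load_sib("eax", "ebx", 4, base="eax")
variant_b = encode_mov_load_sib("eax", "ebx", 4)
```

Calling variant_a.hex() and variant_b.hex() gives the byte strings to compare against the ones quoted above; the four trailing zero bytes in variant_b are the mandatory displacement.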
