Prefixes.

Open the testopcode app.
Type these MNEMONICS:
or eax,-1
mov ecx,edx
and ebx,ebp
Then insert OPCODES below the typed commands.
Do it this way:
the first byte is 66,
then type the opcode of or eax,-1
(you can see it on the line where you entered the or eax,-1 mnemonic),
press Enter and move the cursor to the nearest NOP.
Then create 2 more opcodes the same way, built from
66 + the opcode of each of the other two commands.

As you can see, the instructions generated this way are
practically identical to the first 3 mnemonics, except that they
work with 16-bit registers:
or ax,-1
mov cx,dx
and bx,bp
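
Here is a small sketch of what you just did by hand, written in Python only for illustration (it is not part of testopcode). The three byte sequences are the encodings assumed for the 32-bit mnemonics; your assembler may pick an equivalent form, so check them against what you actually see on screen. The names with_66 and hexdump are made up for this example.

```python
# Assumed encodings of the three 32-bit instructions from the text.
# An assembler may choose an equivalent form (for example 89 D1 for mov ecx,edx).
OPCODES_32 = {
    "or eax,-1":   bytes([0x83, 0xC8, 0xFF]),
    "mov ecx,edx": bytes([0x8B, 0xCA]),
    "and ebx,ebp": bytes([0x23, 0xDD]),
}

OPERAND_SIZE_PREFIX = 0x66


def with_66(opcode: bytes) -> bytes:
    """Return the same opcode with a leading 66 prefix byte."""
    return bytes([OPERAND_SIZE_PREFIX]) + opcode


def hexdump(data: bytes) -> str:
    """Format bytes the way a debugger shows them, e.g. '66 8B CA'."""
    return " ".join(f"{b:02X}" for b in data)


for name, opcode in OPCODES_32.items():
    print(f"{hexdump(opcode):<10} {name:<14} -> {hexdump(with_66(opcode))}")
```

Only one byte is added in front; the rest of each opcode stays exactly as it was.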

66 is a prefix.
Prefixes are the first logical block in the sequence of
blocks used to build an opcode.
Let's refresh our memory on those blocks and their order:
1. Prefixes
2. Code
3. byte mod r/m
4. byte sib
5. offset in command
6. imm. operand.
Remember also that not necessarily all blocks are used in a particular opcode,
but every opcode has a CODE block, and the order of the blocks never changes.
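
The fixed order of blocks can be sketched as a tiny Python record. This is only a model for illustration: the class name Opcode and its field names are assumptions of this example, and the sample bytes are the 66 83 C8 FF encoding assumed for or ax,-1.

```python
from dataclasses import dataclass


@dataclass
class Opcode:
    code: bytes              # 2. CODE, the only mandatory block
    prefixes: bytes = b""    # 1. prefixes (zero or more bytes)
    modrm: bytes = b""       # 3. byte mod r/m
    sib: bytes = b""         # 4. byte sib
    offset: bytes = b""      # 5. offset in command
    imm: bytes = b""         # 6. immediate operand

    def encode(self) -> bytes:
        """Join the blocks in their fixed order; absent blocks add nothing."""
        if not self.code:
            raise ValueError("every opcode needs a CODE block")
        return (self.prefixes + self.code + self.modrm
                + self.sib + self.offset + self.imm)


# or ax,-1: prefix 66, code 83, mod r/m C8, immediate FF (assumed encoding)
or_ax = Opcode(code=b"\x83", prefixes=b"\x66", modrm=b"\xC8", imm=b"\xFF")
print(or_ax.encode().hex(" ").upper())
```

Blocks that are not used are simply empty, but those that are present always come out in the same order.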

The prefixes block is the easiest to understand, though it is unique in several senses.
1. It is the only block that may occur BEFORE the code block.
2. Every prefix is 1 byte in size.
3. There may be several prefixes in one opcode.

Let us illustrate (3).
Before that, let's look closely at prefix 66.
This prefix means "change the default operand size".
There can be only two default operand sizes:
16 bit and 32 bit.
In Win32 programming the default operand size is 32 bit.
When an instruction can be used with either 16 or 32 bit operands,
it is coded absolutely identically in both cases; the only difference
is whether prefix 66 leads the opcode.
Let's see an example where the operand size is not visible as an argument in the opcode
but can actually be different.
TYPE these mnemonics, each with a 1-byte opcode:
LODSB
LODSW
LODSD
Huh!
LODSW and LODSD have the same opcode. It is actually the same instruction,
but LODSW uses a word (which is NOT the DEFAULT OPERAND SIZE in Win32)
and LODSD uses a dword (which IS the DEFAULT OPERAND SIZE in Win32).
So in this case we need to put prefix 66h before the opcode,
and let us remember that by specifying it we are NOT telling the processor
"use WORD as operand";
we are saying
"use the operand size OPPOSITE to the DEFAULT":
if DEFAULT is DWORD, the processor seeing 66h uses WORD
elseif DEFAULT is WORD, the processor seeing 66h uses DWORD
Kind of a toggle.
Opcodes that use operands other than word\dword (bytes, qwords, etc.)
are coded the same whatever the current DEFAULT operand size is.
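
The toggle rule can be written down as a few lines of Python. This is a sketch of the rule as stated above, not of how the processor is built; the function name effective_operand_size and its parameters are assumptions of this example.

```python
def effective_operand_size(default_bits: int, has_66: bool,
                           byte_operand: bool = False) -> int:
    """Operand size in bits for an opcode, following the toggle rule.

    default_bits -- current DEFAULT operand size, 16 or 32
    has_66       -- True if prefix 66h leads the opcode
    byte_operand -- True for opcodes that work on bytes (not word/dword)
    """
    if default_bits not in (16, 32):
        raise ValueError("default operand size must be 16 or 32")
    if byte_operand:
        return 8                      # same coding under any default size
    if has_66:
        return 16 if default_bits == 32 else 32   # flip to the other size
    return default_bits


print(effective_operand_size(32, has_66=False))  # Win32 default
print(effective_operand_size(32, has_66=True))   # Win32 with 66h
print(effective_operand_size(16, has_66=True))   # 16-bit default with 66h
```

Note that 66h never names a size by itself; the answer always depends on what the default is.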

TYPE the MNEMONICS
mov al,0ff
mov al,cl
Look at the opcodes.
INSERT OPCODES below those two instructions,
using their opcodes with a leading 66 prefix.
66:B0 FF MOV AL,0FF
66:8AC1 MOV AL,CL
As you can see, nothing has changed in the mnemonics.

Keeping in mind that we can use the 66 prefix only with words and dwords,
we may assume that with an opcode that uses bytes as operands
we placed the 66 prefix in an inappropriate place.
So have we probably created an illegal opcode?
Well, not exactly.
Run the code
66:B0 FF MOV AL,0FF
66:8AC1 MOV AL,CL

As you can see, nothing bad has happened.
It worked the same way as it does with

B0 FF MOV AL,0FF
8AC1 MOV AL,CL

So the next thing we can understand is:
if a prefix that the processor meets cannot be applied to the following opcode,
it is ignored.
Let's test it with the next funny example.
Another type of prefix is the rep prefix, a prefix that is used
to make the processor repeat the instruction following it ecx (cx) times.
The opcode for inc eax is 40h.
Let's try to use prefix F3 (rep) to make this instruction repeat 3 times.
If you have td32.exe, this example is easier to run in it.
Open the testopcode app in td32.exe,
type in it
xor eax,eax
mov ecx,3
rep inc eax
then run it.
You can see two things:
1. The value in eax = 1. It means that prefix F3 didn't work and was ignored.
2. Nothing bad happened. No exceptions etc.
In OllyDbg you will have two problems.
1. If you type the mnemonic
rep inc eax
into it, it will tell you "unrecognized command".
But you are a super low-level machine coder, aren't you? :)
It can't stop you anymore, 'cause instead of
rep inc eax
you can insert the opcode
F3 40
2. OllyDbg still doesn't recognize it.
If you use F8 to step through the typed code, you will have a problem on
the F3 40 opcode. Try it.
But you can check it another way.
Set a breakpoint somewhere below (for example on the 3rd nop after the inserted
instructions).
To set a breakpoint, double-click on the line where you want to place it.
The address part of the line should become red.
(To remove it, double-click on it again. It works like a toggle.)
Then use F9 (run) to make the processor run through the line where you placed
the F3 40 opcode.
You can see the same results:
1. eax = 1
2. Nothing bad has happened; the processor just ignored the F3 prefix that was
placed before an inappropriate opcode.

IF A PREFIX CANNOT BE USED WITH AN OPCODE, THE PREFIX IS IGNORED.
Type a mnemonic that needs to be specified with 2 prefixes:
REP LODSW
66:F3:AD REP LODS

You can see 2 prefixes here: 66 (change default operand size) and F3 (rep).
Next note, then:
ONE INSTRUCTION CAN HAVE SEVERAL PREFIXES.
It's another unique thing about this 1st block - prefixes.
Any opcode may have only one CODE block, one mod r/m, one offset etc.,
but it may have several prefixes.
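
Separating the prefixes block from the rest of an opcode can be sketched like this. The table holds only the two prefixes met so far in this tutorial (x86 has others that are not covered here), and the names KNOWN_PREFIXES and split_prefixes are made up for the example.

```python
# Only the two prefixes discussed so far; other x86 prefixes are left out.
KNOWN_PREFIXES = {
    0x66: "change default operand size",
    0xF3: "rep",
}


def split_prefixes(opcode: bytes):
    """Split an opcode into (list of prefix bytes, remaining bytes).

    Every prefix is exactly 1 byte and all of them come before the CODE
    block, so we just walk from the front until a non-prefix byte appears.
    """
    i = 0
    while i < len(opcode) and opcode[i] in KNOWN_PREFIXES:
        i += 1
    return list(opcode[:i]), opcode[i:]


prefixes, rest = split_prefixes(bytes.fromhex("66F3AD"))   # REP LODSW
for p in prefixes:
    print(f"{p:02X} = {KNOWN_PREFIXES[p]}")
print("code starts with", f"{rest[0]:02X}")
```

The same walk works for F3 40: one prefix byte, then the CODE byte 40.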

Now a final note about prefix 66h.
You may wrongly assume that when you are in real mode the default operand size
is WORD, and when you are in protected mode it's DWORD.
Not quite so.
The only thing you can remember for sure: when you code a Win32 program
the DEFAULT size is DWORD.
So with any instruction in your Win32 app that operates on words you
are creating opcodes that are 1 byte bigger (the 66 prefix) and take 1 more clock
to execute. It's not always bad, but you need to get used to working out a math
model of your opcode size and an approximate speed estimate to
see if it's worth using words here. Any instruction with words will cost you
1 more byte and 1 more clock to decode the prefix.
As to how the default size is specified - it's specified by bit D in the segment
descriptor. In real mode it is always assumed to be 0. So in real mode the DEFAULT size
is always WORD indeed. In protected mode bit D might be 0 or 1 (in Win32 apps
it's 1).
SO
if (PROTECTED MODE && BIT D==1)
    AD = LODSD
    66 AD = LODSW
else
    AD = LODSW
    66 AD = LODSD
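
The same decision, as a runnable Python sketch that prints the meaning of AD and 66 AD for the three cases mentioned. The function name lods_mnemonic and its parameters are assumptions of this example; it models only the rule above, nothing else about the processor.

```python
def lods_mnemonic(protected_mode: bool, d_bit: int, has_66: bool) -> str:
    """Name of opcode AD, with or without a leading 66h prefix."""
    # Default is DWORD only in protected mode with bit D set.
    default_is_dword = protected_mode and d_bit == 1
    # Prefix 66h flips whatever the default is.
    uses_dword = default_is_dword != has_66
    return "LODSD" if uses_dword else "LODSW"


cases = [
    ("real mode",             False, 0),
    ("protected mode, D = 0", True,  0),
    ("protected mode, D = 1", True,  1),   # Win32 apps
]
for label, pm, d in cases:
    plain = lods_mnemonic(pm, d, has_66=False)
    prefixed = lods_mnemonic(pm, d, has_66=True)
    print(f"{label:<22} AD = {plain}   66 AD = {prefixed}")
```
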
So we know from previous tutorials that
some different opcodes may have the same name (mnemonic),
one opcode may have several names (mnemonics).
Now we see a type of opcode which may mean 2 different things depending on
some conditions.
For opcodes that can be used with the 66 prefix, you can see those
conditions and meanings above.

Next time we'll continue to learn about the 1st logical block of an OPCODE - prefixes.
Meanwhile I ask you to recommend to readers reference manuals on the x86 instruction
set where all mnemonics are described with their opcodes. And not just with opcodes,
but with opcodes structured into their logical parts - blocks.

I have many manuals, but none of them comes even close to meeting the
conditions above (including the Intel manuals).
As a working manual I have one paper reference book by the Russian author V.Yurov,
but it still has lots of errors, so it has been heavily corrected by me,
and we are continuously discussing those errors with the author by email;
he promises to fix them, but when that will happen nobody knows.
The rest of the references I have are even worse. I'll try to fix this reference problem
in my book, but I have no idea when it'll be finished.
Until that time, please recommend to readers some manual with an opcode reference
which is at least the best among the existing ones.

Learning the blocks other than prefixes needs to be done with that kind of reference:
opcodes structured in blocks.
Posted on 2002-11-20 04:20:54 by The Svin