In the previous tutorial we learnt some general things concerning
all operations with prefixes.
1. All prefixes are 1 byte in size.
2. One opcode may have several prefixes.
3. If a prefix cannot be used with the following opcode, it is usually
ignored by the processor.
As an example we studied the meaning and use of prefix 66h, which changes
the default operand size.
Now we just need to study the other prefixes and see if there is something
interesting about them that can affect the performance of our apps.
There are 5 types of prefixes:
1. Change default operand size (66h)
2. Change default address size (67h)
3. Rep. prefixes (F3h, F2h)
4. Segment specifying (also called change DEFAULT segment) (2Eh, 36h, 3Eh, 26h, 64h, 65h)
5. Bus lock (F0h)
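
As a rough illustration of the list above, here is a minimal Python sketch that peels leading prefix bytes off an instruction and names the group each one belongs to. The names PREFIX_GROUPS and split_prefixes are made up for this example, and the sketch only classifies bytes; it does not check whether a prefix makes sense for the opcode that follows.

```python
# Prefix bytes grouped as in the list above (hex values taken from the tutorial).
# PREFIX_GROUPS and split_prefixes are hypothetical names for this sketch.
PREFIX_GROUPS = {
    "operand size": {0x66},
    "address size": {0x67},
    "rep": {0xF3, 0xF2},
    "segment": {0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65},
    "bus lock": {0xF0},
}


def split_prefixes(code: bytes):
    """Return ([(prefix byte, group name), ...], remaining bytes)."""
    found = []
    i = 0
    while i < len(code):
        # Look up which group (if any) the current byte belongs to.
        group = next(
            (name for name, values in PREFIX_GROUPS.items() if code[i] in values),
            None,
        )
        if group is None:
            break  # first non-prefix byte: the opcode starts here
        found.append((code[i], group))
        i += 1
    return found, code[i:]


prefixes, rest = split_prefixes(bytes.fromhex("66678A00"))
print(prefixes, rest.hex())
```

Every prefix is one byte, so the loop simply advances one byte at a time until it meets a byte that is not in any group.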
Prefix 66h
We learnt it in the previous tutorial.

To test yourself, answer a question:
which of the following instructions will have the 66h prefix in a 32-bit app:
Then check your answer by typing those opcodes in a debugger.

Prefix 67h.
Change default ADDRESS size.

Type in a debugger
mov al,[eax]
and you can see
8A00 MOV AL,[EAX]
Now enter the same OPCODE with a leading byte 67.
67:8A00 MOV AL,[BX+SI]

1. You can see that the byte is now addressed by 16-bit registers.
In 32-bit addressing mode all the things that can specify an address
are 32 bits: 32-bit base, 32-bit index, 32-bit offset.
In 16-bit mode all of those are 16 bits.
2. You can also notice that the address part did not just change to
mov al,[ax]
the address components changed as well: instead of [eax] we now
see [bx+si].
Why not just [ax]?
We'll learn it in depth when we come to study the ModR/M and SIB blocks.
For now we can say that in 16-bit addressing mode we cannot use all the registers
as index and base that we can use in 32-bit addressing mode.
And the fields of the opcode that specify those addressing registers differ
between 16-bit and 32-bit addressing modes.
I doubt you will often (if ever) use 16-bit addressing in 32-bit apps.
Nevertheless we learn about it, 'cause the fewer black gaps in our knowledge
of opcodes the better.
Using 32-bit addressing mode in 16-bit apps, on the contrary, might be quite effective
as long as you make sure that the resulting address does not grow beyond offset FFFFh.
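
To make the 16-bit versus 32-bit difference concrete, here is a small Python sketch that decodes only the memory operand of opcode 8A (mov r8, r/m8) when the mod field is 00. The tables are the standard r/m meanings for mod = 00; the function name memory_operand and the default_bits parameter are assumptions of this example, and anything beyond this one case (displacements, SIB contents, other opcodes) is deliberately left out.

```python
# r/m meanings for mod = 00 in each addressing mode.
RM16 = ["[bx+si]", "[bx+di]", "[bp+si]", "[bp+di]", "[si]", "[di]", "[disp16]", "[bx]"]
RM32 = ["[eax]", "[ecx]", "[edx]", "[ebx]", "[sib]", "[disp32]", "[esi]", "[edi]"]


def memory_operand(code: bytes, default_bits: int = 32) -> str:
    """Decode the memory operand of `8A /r` with mod = 00 (sketch only)."""
    addr_bits = default_bits
    i = 0
    if code[i] == 0x67:
        # The address-size prefix switches to the other size: 16 <-> 32.
        addr_bits = 48 - addr_bits
        i += 1
    if code[i] != 0x8A:
        raise ValueError("this sketch only handles opcode 8A")
    modrm = code[i + 1]
    if modrm >> 6 != 0:
        raise ValueError("this sketch only handles mod = 00")
    rm = modrm & 7  # lowest three bits select the address form
    return (RM32 if addr_bits == 32 else RM16)[rm]


print(memory_operand(bytes.fromhex("8A00")))    # 32-bit code, no prefix
print(memory_operand(bytes.fromhex("678A00")))  # 32-bit code, prefix 67
```

The same r/m value 000 selects a different row in each table, which is why the register combination changes and not just the register width.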

Rep. prefixes F3,F2.

If you know about chain (string) operations (such as movs, scas, lods etc.)
you know when and how the prefixes rep\repe\repne are used.

You also know that some chain instructions are supposed to be used with
the rep prefix only (such as movsb, lodsb), and the chain operation is terminated
when the counter (e)cx = 0,
and some of them are used with either repe or repne, where execution can
be terminated not only when the counter reaches 0 but also when flag ZF does not
meet the condition specified in the prefix.
In other words execution with repne is terminated when ZF = 1
and with repe when ZF = 0.
And of course all of them will be terminated when ecx = 0, regardless of
whether ZF is 0 or 1 at that moment.
It is the picture you get when studying mnemonics only.
Reality is a little bit different.
1. You can see that there are 3 mnemonics for rep prefixes: rep, repe, repne.
But only 2 opcodes: F2, F3.
2. Actually you can use any of the 3 rep mnemonics with a command
that is supposed to be used with the rep prefix only.
In all three cases the instruction will work the same way.
That means that the code:
rep lodsb
repe lodsb
repne lodsb
works the same way: it will repeat the instruction lodsb ecx times, no matter
whether prefix F2 or F3 is specified.

Check it out:
TYPE the MNEMONICS right in the debugger:
xor eax,eax ;result is zero, so ZF = 1
mov esi,esp ;esi points to the stack
mov ecx,10
repe lodsb ;opcode for repe and rep is identical - F3
mov esi,esp
mov ecx,10
repne lodsb
Now trace it using F7, not F8.
You can see that both repe and repne work the same way.

The difference between rep (repe) and repne shows up
only when the chain instruction itself changes flags, as the scasb instruction does.
In this case the LAST BIT of the prefix is compared against the value of ZF.
If they are different, chain execution is terminated.
Write F3 and F2 in binary (11110011 and 11110010) and you can see what I mean.
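
The termination rule can be modelled in a few lines of Python. This is only a sketch of the rule as described here, not of real CPU behaviour in every detail: rep_count is a hypothetical helper, and zf_after_each is an assumed list giving the ZF value that each iteration of a flag-setting instruction (scasb, cmpsb) would leave behind.

```python
def rep_count(prefix: int, ecx: int, zf_after_each: list, sets_flags: bool) -> int:
    """Count how many times a chain instruction runs under prefix F2h or F3h."""
    done = 0
    while ecx != 0:
        ecx -= 1   # the counter is decremented on every iteration
        done += 1
        # Only flag-setting instructions look at ZF: stop when ZF differs
        # from the last bit of the prefix (F3 -> 1, F2 -> 0).
        if sets_flags and zf_after_each[done - 1] != (prefix & 1):
            break
    return done


# lodsb does not touch flags: both prefixes repeat ecx times.
print(rep_count(0xF3, 0x10, [], sets_flags=False))
print(rep_count(0xF2, 0x10, [], sets_flags=False))

# scasb sets ZF: here the third comparison is a match (ZF = 1).
zf = [0, 0, 1, 0]
print(rep_count(0xF2, 4, zf, sets_flags=True))  # repne stops at the match
print(rep_count(0xF3, 4, zf, sets_flags=True))  # repe stops at the first mismatch
```

Note that the counter check comes first, so with ecx = 0 the instruction is not executed at all, whichever prefix is used.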

Next time we will finish with prefixes and have a closer look
at opcodes that change EIP.

If you need an extra exercise: here is a very simple reference utility
on prefixes.
The text is displayed in Russian, but you can see the values of the prefixes.
Find the reference and replace the Russian text in the source with English.
Posted on 2002-11-20 04:21:52 by The Svin
I forgot the file for the last exercise :)
Posted on 2002-11-20 04:45:16 by The Svin
Hi The Svin :)
Nice tutorial :alright:

Slightly off-topic.. I'm very curious about the 5 bytes version of:

Another test for fast and smart lowlevel coder:
if PF=1
set all bits in eax to 1
else
set all bits in AX to 1 (don't change upper 16 bits in eax if not PF)

5 byte solution.
Who will be the fastest now?

because I, like iblis and bitRAKE, couldn't get it below 6 bytes, or 5 bytes assuming CF=1. I couldn't dedicate too much time to it, but I was really defeated, and thought it was not possible.. so I'd really like to see if it's possible, and how. :)
Posted on 2002-11-20 07:01:15 by Maverick
Was there a 5 byte solution with CF=1?
I haven't seen it.
What solution?
Posted on 2002-11-20 07:51:03 by The Svin
Well, my best one (not published, although I got interested in this thread, because it wasn't a valid solution) as well as iblis' one (but he showed the STC at the beginning), was:

00000000 7A01 jpe 0x3
00000002 6619C0 sbb ax,ax

Can it be done smaller, and/or without previous CF assumptions? From my reasonings this is not possible.. but I'd like++ to be wrong. :)
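
For readers who want to see why those five bytes do the job, here is a Python sketch of the two paths, assuming a 32-bit code segment. The trick, as far as the bytes show, is that the jump target 0x3 is the byte right after the 66 prefix, so the taken branch executes 19 C0 as sbb eax,eax while the fall-through path executes 66 19 C0 as sbb ax,ax. The function name run and its parameters are made up for this illustration.

```python
def run(eax: int, pf: int, cf: int) -> int:
    """Model `7A 01 / 66 19 C0` in a 32-bit code segment; returns the new eax."""
    borrow = cf & 1
    if pf:
        # jpe taken: execution lands on 19 C0, i.e. sbb eax,eax (32-bit).
        return (eax - eax - borrow) & 0xFFFFFFFF
    # jpe not taken: the 66 prefix applies, i.e. sbb ax,ax (16-bit),
    # so the upper half of eax is left untouched.
    ax = (0 - borrow) & 0xFFFF
    return (eax & 0xFFFF0000) | ax


print(hex(run(0x12345678, pf=1, cf=1)))
print(hex(run(0x12345678, pf=0, cf=1)))
print(hex(run(0x12345678, pf=1, cf=0)))  # without CF = 1 the result is 0
```

The last call shows why the solution only counts with the CF=1 assumption: sbb reg,reg yields all ones only when the carry is set.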
Posted on 2002-11-20 09:10:23 by Maverick
Good one :)
Posted on 2002-11-20 11:55:56 by The Svin
Yup, but still terribly insufficient. :grin:

Posted on 2002-11-20 12:46:38 by Maverick
I don't want the solution, I just want to know if there is any point in trying more :)
Posted on 2002-11-21 02:02:55 by micmic
Why not?
By trying more, Maverick discovered that there is a 5 byte solution
with CF=1.
Posted on 2002-11-21 04:49:36 by The Svin
iblis discovered it too, and posted it before me. ;)
Posted on 2002-11-21 04:55:12 by Maverick
I had figured that challenge was just a simple segue into the prefix tutorials, but it got way out of hand. :grin:
Posted on 2002-11-21 15:19:53 by iblis

Oke I got it.

88C0 is used for moving 8 Bit reg. ( Mov al,al)
89C0 is used for moving 16 Bit reg. (Mov ax,ax)
6689C0 for moving 32 Bit reg. (Mov eax,eax)
6789C0 for moving Memory. (Mov eax,[eax])
6688C0 ...... ??
Posted on 2003-03-04 22:40:17 by realvampire
I know that scientica has carefully read the opcode tuts.
Could you, please, answer realvampire's question?
Posted on 2003-03-05 07:09:58 by The Svin
Ok, I'll give it a try. :)

Opcode: Assembly:
88 C0 mov al,al
89 C0 mov (e)ax,(e)ax ; See note1
66 89 C0 mov (e)ax,(e)ax ; See note1
67 89 C0 mov (e)ax,(e)ax ; See note2
66 88 C0 mov al,al ; See note3

If we're in a 16-bit segment then 89 C0 will use word sized registers (mov ax,ax) and 66 89 C0 will then use dword sized registers (mov eax,eax). The 66h is a prefix that overrides the default operand size (dword or word). In a 16-bit segment 66h will tell the CPU to use dword sized registers for the next instruction, if supported by the instruction (see note3).
But if we're in a 32-bit segment then 89 C0 will use dword sized registers (mov eax,eax), and then 66 89 C0 will use word sized registers (mov ax,ax).

Summary on note 1: The 66h prefix overrides the default operand size: in a 16-bit segment 66h will cause the prefixed instruction to use dword sized operands, and in a 32-bit segment 66h will cause the prefixed instruction to use word sized operands. In Windows the default operand size is 32-bit for standard PE exes; I think .com files use 16-bit.
(And the 66h prefix also adds/costs an extra clock cycle at execution, right Svin?)

Note 2: The 67h prefix is used just like the 66h prefix, but 67h overrides the address size instead of the operand size. In a register-to-register form like 89 C0 there is no memory address, so 67h changes nothing there.
If you want to execute a {mov eax,[eax]} then you must change the mode part of the SIB and change the instruction byte to 8B, thus a dereference of eax will be encoded like this:
8B 00 ; (8B=1000 1011) SIB = 00 000 000

Note 3: In this case the 66h prefix will be ignored; the only thing that will happen is that the instruction takes a clock cycle longer to execute and the code is one byte larger than needed. 66h overrides the default operand size, which is dword or word size, and since there is no word or dword sized operand here, only byte size, there is nothing for 66h to do.
So 66 88 C0 will do the same as 88 C0.
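
The three notes can be condensed into one small Python function. It is a sketch under two assumptions: the opcode has the "w" bit as its lowest bit (true for 88/89, not for every opcode), and the only prefix that may appear is a single leading 66. The name operand_bits and the default_bits parameter are invented for this example.

```python
def operand_bits(code: bytes, default_bits: int) -> int:
    """Operand size in bits for an 88/89-style instruction.

    default_bits is the default operand size of the code segment (16 or 32).
    """
    has_66 = code[0] == 0x66
    opcode = code[1] if has_66 else code[0]
    if opcode & 1 == 0:
        # w = 0: byte operands, so 66h has nothing to change.
        return 8
    # w = 1: "full size", which 66h switches to the other size (16 <-> 32).
    return 48 - default_bits if has_66 else default_bits


for text in ("88C0", "89C0", "6689C0", "6688C0"):
    code = bytes.fromhex(text)
    print(text, operand_bits(code, 16), operand_bits(code, 32))
```

The loop prints the operand size for each encoding from the table in both a 16-bit and a 32-bit segment.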

:o I hit the post button too soon when writing my answer, so I've completed the post in "edit mode". (Too used to using the tab key... :\ )
Posted on 2003-03-05 08:30:06 by scientica
Oh, very good!
I'd just add some additional explanation about
bit "W" in the code field, which changes the operand size
between byte size and "full size".
And also a slightly more precise explanation of when
the "default operand size" is dword and when it is word.
Posted on 2003-03-05 10:18:30 by The Svin
:grin: Thanks a lot to you both. Hope you're not bored answering my question.

I've done some experiment.

xor ax,ax
mov es,ax
mov esi,eax

That one crashes and doesn't work.

xor al,1
xor ax,ax
mov es,ax
mov esi,eax

This one works, but it reboots Windows. So I now understand that even if we use a prefix, it won't work unless I set the CPU mode to Pmode.

I've read all of your opcode tutorials. Good job.

8B 00 ; (8B=1000 1011) SIB = 00 000 000

Scientica, explain each bit to me, especially the 8B. Does B have a special meaning?

:eek: 0x90 is used for both xchg ax,ax and nop :confused: My head hurts... Does xchg ax,ax have any 2-byte code?
Posted on 2003-03-05 10:32:33 by realvampire
realvampire wrote something like this :)
:grin: Thanks a lot to you both. Hope you're not bored answering my question.

You're welcome! :)
Boring? :confused: No not at all, it's quite satisfying to help others (or at least try). :)
I've done some experiment.

Great, that's how we (or at least I) learn lots of things. :)

xor ax,ax
mov es,ax
mov esi,eax

That one crashes and doesn't work.

Well, it didn't crash for me (I just typed the instructions into olly and ran it, no problems). Don't know why it crashes for you.

xor al,1
xor ax,ax
mov es,ax
mov esi,eax

This one works, but it reboots Windows. So I now understand that even if we use a prefix, it won't work unless I set the CPU mode to Pmode.

Well, LMSW is a privileged instruction. I didn't dare run it while writing this answer, in case it reboots my computer too...
If you want to set the PE bit then I'd suggest you use cr0 directly instead, since SMSW and LMSW exist for compatibility with the Intel (80)286.

Scientica, explain each bit to me, especially the 8B. Does B have a special meaning?

:eek: 0x90 is used for both xchg ax,ax and nop :confused: My head hurts... Does xchg ax,ax have any 2-byte code?

Well, 8B is in binary 1000 1011; the second bit from the right is the d(irection)-bit that is explained in tutorial number 7, and the rightmost bit is the w-bit (also explained in that post).

Well, xchg eax,eax basically does nothing, it's a no-op (eax=eax, very complex instruction and computation ;)). Nop is 90h and does nothing except delay execution by one cycle and add an extra byte to the code. As The Svin mentions in one of his tuts, one opcode can have more than one mnemonic, so "nop" and "xchg eax,eax" do exactly the same because they're the same opcode; as we say in Sweden, "kärt barn har många namn" (~= a loved child has many names). And we may say nop is a well used instruction: it can for instance be used when aligning code sections (if we aligned a code section with zeroes we would have problems, since 00 00 is "add [eax],al"; to make it easier to understand I cut this out of an opcode ref: 00 /r is ADD r/m8,r8, where "/r" means that a ModR/M byte follows, holding both a reg field and an r/m field).
A little note: as you might know, eax, the accumulator register, is somewhat special; it has some opcodes to itself, meaning that for instance xchg eax,r32 is 90 +r (the lower three bits of 90h are a 3-bit reg field), so if you want to xchg eax,ecx then you should type 91h. But if you want to do xchg ecx,edx then you would have to type 87 D1 (87 /r is XCHG r32,r/m32).
The accumulator is, as you see, favoured in opcodes. Of course you can write 87 C0 if you like, but that isn't very appealing (why use a two byte instruction when you can use a one byte one? Well, maybe if you want to do some SMC (self modifying code) or write some protection scheme (which may use SMC), since C0 is the "start" of some shifting opcodes). Not all instructions favour the accumulator in this way, but some do.

Posted on 2003-03-05 14:53:08 by scientica

xor ax,ax
mov es,ax
mov esi,eax
Obviously, that's going to crash. GDT entry 0 is not usable.

xor al,1
xor ax,ax
mov es,ax
mov esi,eax

This one works, but it reboots Windows. So I now understand that even if we use a prefix, it won't work unless I set the CPU mode to Pmode.

But the processor is already running in protected mode. That would put it in real mode and of course it won't do any good since the patches made to MS-DOS are still there, and the code segment is still 32-bit. Besides, the PSP contains protected mode addresses and such. You should always exit Windows using the ExitWindowsEx function instead, so that all drivers will be unloaded correctly and MS-DOS returned to its normal state. But you can always use instruction prefixes, regardless of execution mode.
Posted on 2003-03-05 15:43:53 by Sephiroth3
0x90 is used for both xchg ax,ax and nop

You said you'd read all the opcode tuts.
But the very first tuts discuss your question in detail.
There is a discussion about the relation between
the opcode as the "real thing" and mnemonics as "names for real things".
And that one real thing
- May have different names.
- May have no name at all.
- And several different things may have the same name.
It is as in real life and human language that is trying to
describe it.
As an example there are words about "nop" and "xchg eax,eax"
as two names for the same thing.
For example I can write in my source
svin equ 90h
And now I can write db svin instead of nop.
Does it make a new thing?
It just creates a new NAME for an old thing.
The processor doesn't know names and doesn't care about them.
It understands only the real thing, the opcode, which in this case is 90h.
90h in binary =
10010 - exchange eax
000 - with reg # 000 (eax)
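
The same split can be written as a tiny Python sketch: the upper five bits identify the one-byte "exchange with eax" form and the lower three bits pick the register. The table REG32 uses the usual 32-bit register numbering; the function name decode_xchg_short is made up for this example.

```python
# Register numbers 000..111 in 32-bit operand size.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_xchg_short(opcode: int) -> str:
    """Decode the one-byte form 90h+r as a mnemonic string."""
    if opcode >> 3 != 0b10010:
        raise ValueError("not a 90h+r opcode")
    return f"xchg eax,{REG32[opcode & 7]}"


print(decode_xchg_short(0x90))  # the byte also known as nop
print(decode_xchg_short(0x91))
```

The decoder has no idea that 90h is usually called nop; it only sees the bits, which is exactly the point made above.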

There are educational programs in this thread in which you can
see it all in bits, hex and mnemonics, and train your understanding with
test tabs and dialogs.
Posted on 2003-03-05 15:57:28 by The Svin
8B 00 ; (8B=1000 1011) SIB = 00 000 000

scientica, there is no SIB
00 000 000 - is the ModR/M byte
00 - mod = memory operand in the r/m field, specified by a register pointer only
000 - reg operand
000 - reg as memory pointer operand.

The SIB byte is an extension of the r/m field; the presence of
the SIB byte is indicated when the r/m field has the value 100.
8B with SIB present would look in binary like
1000 10dw :** *** 100 :ss iii bbb
where ** *** are any bits except 11 as the highest two,
and ss iii bbb are the bits of the SIB.
The SIB byte is present only when the ModR/M byte is present too.
Thus the smallest size of an opcode with SIB would be 3 bytes:
At least one byte for the code block
One byte for ModR/M
And one byte for SIB.
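
Here is a short Python sketch of that rule, assuming 32-bit addressing (in 16-bit addressing there is no SIB byte at all). The function name split_modrm is hypothetical; it only splits the byte into its three fields and reports whether a SIB byte must follow.

```python
def split_modrm(modrm: int):
    """Split a ModR/M byte into (mod, reg, rm, has_sib) for 32-bit addressing."""
    mod = modrm >> 6          # highest two bits
    reg = (modrm >> 3) & 7    # middle three bits
    rm = modrm & 7            # lowest three bits
    # A SIB byte follows only for a memory operand (mod != 11) with r/m = 100.
    has_sib = mod != 0b11 and rm == 0b100
    return mod, reg, rm, has_sib


print(split_modrm(0x00))  # the 00 000 000 byte from 8B 00: no SIB
print(split_modrm(0x04))  # r/m = 100 with mod = 00: SIB follows
print(split_modrm(0xC4))  # r/m = 100 but mod = 11: register operand, no SIB
```

With has_sib true, the shortest possible instruction is the three bytes listed above: code, ModR/M, SIB.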
Your second post for realvampire looks even better than the first.
It's good to have you here :)
Posted on 2003-03-05 16:31:18 by The Svin