ambiguity at Intel software developer's manual, Volume 3A:
System Programming Guide


3.4.5 Segment Descriptors
....
D/B (default operation size/default stack pointer size and/or upper bound) flag
.................................................................................................
Executable code segment. The flag is called the D flag and it
indicates the default length for effective addresses and operands
referenced by instructions in the segment. If the flag is set, 32-bit
addresses and 32-bit or 8-bit operands are assumed; if it is clear,
16-bit addresses and 16-bit or 8-bit operands are assumed.
.................................................................................................

Please write an assembly code to show the use of default operation size. in 32 bits default operation size is 32 or 8, but why it is not 16?!!!


Posted on 2010-05-19 06:15:25 by logicman112

Please write an assembly code to show the use of default operation size. in 32 bits default operation size is 32 or 8, but why it is not 16?!!!


An example is not necessary. Read up on opcode prefixes 0x66 and 0x67, as well as general instruction encodings... they explain everything.
Posted on 2010-05-19 10:05:43 by SpooK
The x86 CPUs have a long legacy. 8-bit operations have their own, a little bit different encoding, while 16/32 bit operations have exactly the same encoding. The default operand size and default address size determine whether a given 16/32 bit encoding is in fact an instruction acting on a 16- or 32-bit address / operand. You may want to use 32-bit operations and 16-bit operations in the same segment / page. In that case you prefix an instruction with either address size prefix or operand size prefix.

Prefixes work by "changing the default". Since there are only 2 possible sizes, the "default" one and the "other" one, you need only 1 prefix for address and 1 for operand.

It goes like this:
1) If the default size (operand/address) is 16-bit and a given instruction doesn't have any prefixes, it's a 16-bit operation.
2) If the default size (operand/address) is 16-bit and a given instruction does have a prefix (operand or address or both), it's an operation acting on a 32-bit operand, address, or both.
3) If the default size (operand/address) is 32-bit and a given instruction doesn't have any prefixes, it's a 32-bit operation.
4) If the default size (operand/address) is 32-bit and a given instruction does have a prefix (operand or address or both), it's an operation acting on a 16-bit operand, address, or both.
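
The four cases above can be sketched as a rough Python model. This is only an illustration, not anything taken from the manual: the function name effective_sizes and the constant names are made up here, and the model covers only the 16/32-bit forms (8-bit forms have their own opcodes and ignore the operand-size prefix).

```python
OPERAND_SIZE_PREFIX = 0x66  # toggles the operand size for one instruction
ADDRESS_SIZE_PREFIX = 0x67  # toggles the address size for one instruction


def effective_sizes(d_flag, prefixes=b""):
    """Return (operand_bits, address_bits) for one non-8-bit instruction.

    d_flag   -- D bit of the code segment descriptor (0 or 1)
    prefixes -- the prefix bytes that precede the opcode
    """
    default = 32 if d_flag else 16
    other = 16 if d_flag else 32
    operand = other if OPERAND_SIZE_PREFIX in prefixes else default
    address = other if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address


if __name__ == "__main__":
    for d in (0, 1):
        for p in (b"", b"\x66", b"\x67", b"\x66\x67"):
            print("D=%d prefixes=%-4s ->" % (d, p.hex() or "--"),
                  effective_sizes(d, p))
```

Each prefix only flips its own size to the "other" value, which is why one prefix byte per kind is enough.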

A prefix takes 1 byte, so it's clear that if your code is mostly 32-bit and has only few 16-bit operations, you want your default size to be 32-bit and put prefixes only before those few 16-bit operations. And if your code is mostly 16-bit, you do exactly the opposite. Otherwise you're wasting memory for unnecessary prefixes.

Most (all?) modern 32-bit OSes run their code in segments whose default operand and address sizes are 32-bit.

Very similar rules apply to the 64-bit REX prefix.
Posted on 2010-05-19 15:32:50 by ti_mo_n
Moreover, the ModR/M byte is interpreted quite differently under 16-bit and 32-bit addressing (as SpooK said). The D bit in the code segment descriptor and/or the opsize/addrsize prefixes essentially select one of two similar but not identical opcode sets. A subtle difference exists regarding the B bit in the ss segment descriptor (it affects implicit sp/esp usage by instructions like push/call/enter).

The whole issue (as I see it) is about backward compatibility. The 32-bit extension of the 80286 opcode set, introduced by the 80386, was made in such a way that existing 16-bit PM code can be executed hassle-free in a 16-bit code segment (D==0). The 64-bit extension, in turn, was based on the 32-bit opcodes (several encodings were made unavailable, notably the single-byte inc/dec r32, which were turned into the various REX prefixes).
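
To make the inc/dec versus REX point concrete, here is a small Python sketch of how the bytes 40h..4Fh are read in each mode. The function name classify_4x is invented for this illustration, and the register names assume the default 32-bit operand size in the non-64-bit case.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def classify_4x(opcode, long_mode):
    """Describe a byte in the range 40h..4Fh.

    long_mode=False: 32-bit code segment, the byte is a one-byte inc/dec.
    long_mode=True:  64-bit mode, the byte is a REX prefix (0100WRXB).
    """
    if not 0x40 <= opcode <= 0x4F:
        raise ValueError("only 40h..4Fh are covered by this sketch")
    if long_mode:
        w = (opcode >> 3) & 1
        r = (opcode >> 2) & 1
        x = (opcode >> 1) & 1
        b = opcode & 1
        return "REX prefix W=%d R=%d X=%d B=%d" % (w, r, x, b)
    mnemonic = "inc" if opcode < 0x48 else "dec"
    return "%s %s" % (mnemonic, REG32[opcode & 7])


if __name__ == "__main__":
    for byte in (0x40, 0x48, 0x4B):
        print(hex(byte), "|", classify_4x(byte, False), "|", classify_4x(byte, True))
```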
Posted on 2010-05-20 15:21:23 by baldr
Thank you very much for the explanations.

I do not want to use prefixes. My question is about the Intel manual. It talks about assumption of 8-bits or 32-bits operands when D flag is set.

Why default operand size can be 8 or 32 ONLY? why not 16?

MOV [BX], AX

How many bits are transferred here? 8 or 16 bits(or 32)? if 16, so Intel manual is wrong?

Besides please write an assembly code that instruction uses a default operand size!
Posted on 2010-05-22 01:47:33 by logicman112
Default operand (and address) size can be either 16 or 32 bit. 8-bit operations have a little different encoding so there is no problem with them.

MOV [BX], AX is a 16-bit operand, 16-bit address instruction.

Since AX is 16-bit, the instruction performs a 16-bit memory write operation.


Its encoding in DOS is: 0x89 0x07
Its encoding in modern Windows is: 0x66 0x89 0x03 (note the prefix)  *

*(Actually, in modern Windows it'll be MOV [EBX], AX, which is why the ModR/M byte changes from 0x07 to 0x03.)
Posted on 2010-05-22 03:04:30 by ti_mo_n

mov [bx], ax


is not the default operand size, if the D bit is set. The operation transfers 16 bits, but will require prefixes, whether you want to use them or not. Since neither the address size nor the operand size is the default (with the D bit set), you'll need one of each - 66h and 67h.

An example of code using the default sizes (with the D bit set):


mov [ebx], eax


Why not 16? Because the D bit is set. That's what it does.

Best,
Frank

Posted on 2010-05-22 03:17:59 by fbkotler
hello Frank,

According to what you wrote the following assembly adds prefixes to the instruction format when D bit is set right?:

MOV  [BX], AX

How about :
MOV  [EBX], AH

Does assembler add any prefix in this case? Because Intel says, default operand size is 32 or 8.

Another question is that:

Are some prefixes added at run time? because assembler does not know whether D bit is set or not and after program is placed in memory it becomes clear....
Posted on 2010-05-22 04:40:04 by logicman112
MOV  [EBX], AH
Does assembler add any prefix in this case? Because Intel says, default operand size is 32 or 8.

In this case, the operand size is 8-bit, and the address size is 32-bit. The address is held in ebx.

The full form of this instruction is:
MOV [DS:EBX], AH
32-bit address, 8-bit operand
This instruction performs an 8-bit write operation to the 32-bit address formed from segment register DS and general-purpose register EBX.

Are some prefixes added at run time? because assembler does not know whether D bit is set or not and after program is placed in memory it becomes clear....

Nope. When you assemble a piece of code, you explicitly state which platform you want it assembled for. This sets the default values for blocks of code. Additionally, many assemblers allow you to manually play with parameters of any block of your code, but unless you know what you're doing, just stick to the default.
Posted on 2010-05-22 06:11:47 by ti_mo_n
"... but unless you know what you're doing, just stick to the default." Agreed, or if you want to "experiment", perhaps...

Here's some code that is intended merely to be assembled (with Nasm's "-f bin", or equivalent) and examined, not run:


; generate code that expects D-bit to be clear:

bits 16

    mov [bx], ax ; address size: default ; operand size: default

    mov [bx], ah ; address size: default ; operand size: default

    mov [ebx], eax ; address size: non-default ; operand size: non-default

; if we expected both parts of this to actually run,
; we would arrange for the D-bit to be set, at this point.
;
; lgdt
; mov eax, cr0
; or al, 1
; mov cr0, eax
; jmp CODE32_DESC: go_32
;
; or so (more to it than this)

; generate code that expects D-bit to be set:

bits 32

go_32:

    mov [ebx], eax ; address size: default ; operand size: default

    mov [ebx], ah ; address size: default ; operand size: default

    mov [bx], ax ; address size: non-default ; operand size: non-default
;-----------------------


If we want to "see" this as the CPU would, with the D-bit clear, we can disassemble it with ndisasm's 16-bit mode:


00000000  8907              mov [bx],ax
00000002  8827              mov [bx],ah
00000004  67668903          mov [ebx],eax
00000008  8903              mov [bp+di],ax
0000000A  8823              mov [bp+di],ah
0000000C  67668907          mov [edi],eax


Notice that the last three lines are wildly incorrect! Now we disassemble it with ndisasm's "-b32" mode, to "see" this as the CPU would in 32-bit mode (D-bit set):


00000000  8907              mov [edi],eax
00000002  8827              mov [edi],ah
00000004  67668903          mov [bp+di],ax
00000008  8903              mov [ebx],eax
0000000A  8823              mov [ebx],ah
0000000C  67668907          mov [bx],ax


Now the first three lines are wrong, and the last three are correct. (ndisasm has no convenient way to switch bitness in mid-disassembly) :(
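
For anyone without ndisasm at hand, the two views of the same bytes can be approximated with a tiny Python decoder. It is only a sketch under narrow assumptions: it handles just MOV r/m, r (opcodes 88h and 89h) with ModR/M mod=00 and no displacement or SIB byte, and the name decode_mov_store and the lookup tables are invented for this example.

```python
# r/m meanings for ModR/M mod=00; None marks forms this sketch skips
# (disp16 in 16-bit addressing, SIB and disp32 in 32-bit addressing).
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", None, "bx"]
RM32 = ["eax", "ecx", "edx", "ebx", None, None, "esi", "edi"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_store(code, bits):
    """Decode one MOV r/m, r as seen in a 16- or 32-bit code segment."""
    other = 16 if bits == 32 else 32
    operand = address = bits
    i = 0
    while code[i] in (0x66, 0x67):       # consume size-override prefixes
        if code[i] == 0x66:
            operand = other
        else:
            address = other
        i += 1
    opcode, modrm = code[i], code[i + 1]
    if opcode not in (0x88, 0x89) or (modrm >> 6) != 0:
        raise ValueError("outside the scope of this sketch")
    reg, rm = (modrm >> 3) & 7, modrm & 7
    if opcode == 0x88:                   # 8-bit form: its own opcode, no prefix needed
        src = REG8[reg]
    else:                                # 16/32-bit form: size comes from D bit + 66h
        src = (REG16 if operand == 16 else REG32)[reg]
    base = (RM16 if address == 16 else RM32)[rm]
    if base is None:
        raise ValueError("addressing form not handled by this sketch")
    return "mov [%s],%s" % (base, src)


if __name__ == "__main__":
    chunks = ["8907", "8827", "67668903", "8903", "8823", "67668907"]
    for bits in (16, 32):
        print("as a %d-bit segment:" % bits)
        for chunk in chunks:
            print("  %-10s %s" % (chunk, decode_mov_store(bytes.fromhex(chunk), bits)))
```

The opcode and ModR/M bytes never change between the two passes; only the assumed default size does, and that alone flips both the register width and the addressing table.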

That at least shows where the prefixes occur. A fully "working" example would be possible, but not the kind of thing you can do in either Windows or Linux... the D-bit is set, and we aren't allowed to touch it... Would require a lot of rebooting to test it, and until it's right, it'll reboot on its own a lot, too! :)

Such an example would probably include a sequence like:


mov ax, DATA32_DESC ; selector of a data descriptor (name is hypothetical); a code selector cannot be loaded into ss
mov ds, ax
mov es, ax
mov ss, ax
...


This is interesting(?), since a "mov" with a segment register as destination does not require a prefix, even with the D-bit set - it is inherently a 16-bit instruction - perhaps an exception to what you're reading... (not so for a segreg as src!)

Best,
Frank

Posted on 2010-05-22 21:53:33 by fbkotler
Thank you so much Frank for a good detailed and practical explanation. 

Do you think adding prefix is only for memory operands? If you have a word register MOV in 32 bits and D=1 (MOV ax, bx), does assembler add prefix?

You said some instructions are not correct in your examples like:
mov [edi],eax

Why? they seem OK.

Posted on 2010-05-22 22:44:35 by logicman112
Yeah, but we said "[bx]"!!!

The assembler would add a prefix for "mov ax, bx", yeah. Perhaps "more important" for "mov ax, 1" vs "mov eax, 1" - same opcode (except for the prefix), but the first uses 2 bytes to store the "1", and the second uses 4 bytes. If the assembler were to "get it wrong" and not put a prefix on the "mov ax/eax, imm16/32" opcode, but store the "1" as two bytes (as would happen if we told the assembler - or it defaulted to - 16 bits), the CPU, expecting the default operand size, would "gobble up" an extra two bytes - presumably intended to be the next opcode and ??? - and attempt to execute whatever comes next as an opcode. Results uncertain, but not as intended.

This example would only involve the operand size override prefix (66h), since there is no address involved.
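
A rough Python sketch of that "mov ax/eax, imm" case, to show both the encoding and the "gobbling" effect. The helper names encode_mov_acc_imm and consumed_length are made up for this illustration, and only opcode B8h (the AX/EAX form) is modeled.

```python
def encode_mov_acc_imm(operand_bits, value, segment_bits):
    """Encode MOV AX/EAX, imm (opcode B8h) for a 16- or 32-bit code segment."""
    # The 66h prefix is emitted only when the wanted size is not the default.
    prefix = b"" if operand_bits == segment_bits else b"\x66"
    return prefix + b"\xb8" + value.to_bytes(operand_bits // 8, "little")


def consumed_length(code, segment_bits):
    """Number of bytes the CPU would consume for a B8h MOV starting at code[0]."""
    other = 16 if segment_bits == 32 else 32
    operand_bits = segment_bits
    i = 0
    if code[i] == 0x66:
        operand_bits = other
        i += 1
    if code[i] != 0xB8:
        raise ValueError("only opcode B8h is covered by this sketch")
    return i + 1 + operand_bits // 8


if __name__ == "__main__":
    right = encode_mov_acc_imm(16, 1, segment_bits=32)   # assembled for D=1
    wrong = encode_mov_acc_imm(16, 1, segment_bits=16)   # assembled as if D=0
    print(right.hex(), "-> CPU with D=1 consumes", consumed_length(right, 32), "bytes")
    print(wrong.hex(), "-> CPU with D=1 consumes", consumed_length(wrong, 32), "bytes")
```

In the second case the assembler stored three bytes but a CPU running with D=1 consumes five, so the two bytes that follow are swallowed as part of the immediate.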

Haven't you got an assembler so you can try this stuff and see what it does? They're not expensive. :)

Best,
Frank

Posted on 2010-05-22 23:34:28 by fbkotler
Thank you for the reply. I want to use 'as' assembler under linux. I have not yet started using it.

if we have the following code(D=1) while assembler does not add prefix 66H:
MOV  AX, [EBX]

How many bytes are taken from memory(1 or 4 )? Because the default operand sizes in 32 bits are 8 or 32 bits. If it takes 1 byte, it can zero extend to have a 16 bit, if it takes 32 bits, only low word can be copied to AX!!

What is the default size of an immediate in 32 bit code segments? Are they like other operands? (8 or 32)
Posted on 2010-05-23 02:03:26 by logicman112
What are you waiting for? Afraid you'll wear it out? :)

t.s

.global _start

.text
_start:

movw (%ebx), %ax

movw $1, %ax
movl $1, %eax

movl $1, %eax
int $0x80


as -o t.o t.s
objdump -d t.o



t.o:    file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
  0: 66 8b 03            mov    (%ebx),%ax
  3: 66 b8 01 00          mov    $0x1,%ax
  7: b8 01 00 00 00      mov    $0x1,%eax
  c: b8 01 00 00 00      mov    $0x1,%eax
  11: cd 80                int    $0x80


Try it, you'll like it! :)

Best,
Frank

Posted on 2010-05-23 03:19:09 by fbkotler
Thanks a lot for the sample code. I've got the answer now.  8)

Posted on 2010-05-23 04:15:27 by logicman112