org 100h

mov ah,9
mov dx,hello
int 21h
int 20h

hello db 'Hello world!',24h

I assemble the above source to a .com file using fasm (and other assemblers) and get the following hex bytes:
B409 BA09 01CD 21CD 2048 656C 6C6F 2077 6F72
6C64 2124
I know B4 09 is mov ah,9
BA 09 01 is mov dx,hello
CD 21 CD 20 is int 21h; int 20h
and the remaining bytes are the string, terminated by 24h

But I do not know how the assembler turns mov dx,hello into BA 09 01.
In terms of the machine code structure
(Prefixes, Opcode, ModR/M, SIB, Displacement, Immediate),
could anyone give me a detailed explanation of how mov dx,hello is converted into BA 09 01? In the Intel manual, is it the mov r16,imm16 form? If not, which one is it?

If possible, please write to me directly through
Posted on 2004-10-03 21:57:56 by helloxuyihua
If you really want to know:

Assemblers and Loaders by David Salomon. First pointed out by numitor, thx.

Linkers and Loaders

My only other suggestion: you can read the fasm source, or the nasm source. Search specifically for a function (or label) called assemble or something like that, then trace from there through the functions and variables (or blocks of memory) it uses.
Posted on 2004-10-03 22:17:13 by rea
The offset of the string is 109h (because of the org 100h)
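To see where 109h comes from, here is a minimal Python sketch that adds up the sizes of the instructions in front of the label. The names label_offset and instruction_sizes are made up for illustration, and the sizes are simply read off the hex dump in the question; a real assembler works them out while encoding each instruction.

```python
# A .com file is loaded at offset 100h, which is what "org 100h" tells the assembler.
ORG = 0x100

# Byte sizes of the instructions that come before the `hello` label,
# taken from the hex dump in the question (hypothetical table for illustration).
instruction_sizes = [
    ("mov ah,9", 2),      # B4 09
    ("mov dx,hello", 3),  # BA lo hi
    ("int 21h", 2),       # CD 21
    ("int 20h", 2),       # CD 20
]

def label_offset(org, sizes):
    """Return the address of a label placed right after the listed instructions."""
    return org + sum(size for _, size in sizes)

hello = label_offset(ORG, instruction_sizes)
print(hex(hello))  # 0x109
```

The four instructions take 2 + 3 + 2 + 2 = 9 bytes, so the string starts at 100h + 9 = 109h.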

In 16-bit mode, the opcode for moving an immediate to a register is

1011 w reg, followed by the immediate, which here is word-sized

w is 1 since you are using a word.
reg is 010 since you are using dx

ax - 000b
cx - 001b
dx - 010b
bx - 011b
and so on

So you get

1011 1010b (BAh) followed by 109h
The immediate word is stored low byte first (little-endian), so the instruction becomes BA 09 01. Simple as that. Refer to the Intel manual for more information, or search for "opcode" in the algo section of the forum for more on 32-bit encoding.
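The steps above can be sketched in a few lines of Python. This is only an illustration of the bit layout, not how fasm does it internally; encode_mov_r16_imm16 and REG16 are names invented for this example. In the Intel manual this form is listed as MOV r16, imm16 (opcode B8+rw), so no ModR/M or SIB byte is involved.

```python
import struct

# 3-bit register numbers for the 16-bit general-purpose registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def encode_mov_r16_imm16(reg, imm):
    """Encode `mov r16, imm16`: opcode byte 1011 w reg, then the word low byte first."""
    opcode = 0b1011_0000 | (1 << 3) | REG16[reg]  # w = 1 selects a word operand
    return bytes([opcode]) + struct.pack("<H", imm & 0xFFFF)  # "<H" = little-endian 16-bit

code = encode_mov_r16_imm16("dx", 0x109)
print(code.hex(" ").upper())  # BA 09 01
```

Changing the register only changes the low three bits of the first byte, so mov ax,109h would come out as B8 09 01.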
Posted on 2004-10-04 00:21:26 by roticv