How does the div opcode work

I mean how does it do division? I've been curious about this...

I decided one day I'd try to "fake" division -- but I'm still in the dark (I was trying to do it for floating point, but I'm just curious about integer divion right now)...

I know you can divide by bit shifting... Is there a more complete ways?

I guess I'm asking because "div" on a .486 takes about 40 clocks to complete so how does it use them?

Sliver
Posted on 2002-01-20 15:25:48 by Sliver
mov eax, 12
mov ecx, 3
div ecx

eax is the quotient
ecx is the divisor

the answer will be in eax, if there is a remainder it will be in edx
Posted on 2002-01-20 15:50:45 by stryker
ok you failed to read the original post

this isn't a "help me I don't no how to divide" :)

I want to know HOW DIV works!

ie.

how does it divivde 123 / 45, etc

Sliver
Posted on 2002-01-20 16:03:19 by Sliver
If I remember well, it is explained in the Intel docs... (with a shema... you know... where they always use those "<-" characters everywhere ^^)

Maybe the best explaination is in a math book, though... :rolleyes:
Posted on 2002-01-20 16:19:00 by JCP
Oops!!! he! he! he! :) I don't know but my guess is it does more than using DIV, probably it does an ADD and SUB instruction...using MUL is adding itself how many times(in short MUL is just plain ADD).

E.G.

(1 x 3 = 3) == (1 + 1 + 1 == 3)

So DIV might be a combination of that.
Posted on 2002-01-20 16:23:09 by stryker
Oh I know...Just a guess

12/4 == 3 remainder 0

is the same as 12 - 3 - 3 - 3 - 3, maybe it counts how many times it subtracted by 3 then if it is less than 3 then the remaining value is the remainder.

15/2 == 7.5(using a calculator)

15 - 2 = 13
13 - 2 = 11
11 - 2 = 9
9 - 2 = 7
7 - 2 = 5
5 - 2 = 3
3 - 2 = 1

since 1 < 2. Counting how many times we subtracted:: 7 times
remainder 1... there you go...

eax == 7
edx == 1

which is correct. when using it this way:

mov eax, 15
mov ecx, 2
div ecx

of course, it will not output 7.5 because DIV isn't a reliable instruction :)

to correct that, save eax(7) then mov eax, edx

StartOfLoop:

eax now contains 1
multiply by 10 == 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
which makes eax == 10

10 - 2 = 8
5 - 2 = 6
6 - 2 = 4
4 - 2 = 2
2 - 2 = 0

counting how many times we subtracted:: 5 times
check if it last values is == 0. if it's not save the value of eax then do the loop again until the last value is 0.

pop those values.

concatenating both makes it 7.5

check out DivToAscii procedure(Just Asking:: part 2 thread)
http://www.asmcommunity.net/board/index.php?topic=3039
Posted on 2002-01-20 16:32:06 by stryker
umberg6007 that's very interesting... now I just have t figure out how they do it for floating point

Readiosys do you know where in the docs it can be found? It'd help me out alot in my learning this

Sliver

any otherr ideas on integer or floating point division would be appreciated
Posted on 2002-01-20 17:21:26 by Sliver
Oops!!! i forgot to answer 123/45

using this code:
mov eax, 123
mov ecx, 45
div ecx

I'm sure eax == 2 and edx == 33

123 - 45 == 78 ... count == 1

is the last value < ecx(45) if it is, count will be our quotient(eax) and the last remaining value will be 78(edx). if its not. continue...

78 - 45 == 33 ... count == 2

is the last value < ecx(45) if it is, count will be our quotient(eax) and the last remaining value will be 33(edx). if its not. continue...

since 33 < 45. eax == 2, edx == 33, which is correct in the sense of using the code above. But its isn't the correct answer, which is
2.7333333333333333333...

...same step as above but treat this part as your remainder

to correct that save the value of eax which is 2
mov eax, edx ;eax == 33
multiply eax by 10 == 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33
which makes eax == 330

330 - 45 == 285
...

In the long run you'll eventually get 7 and a non terminating 3.

:)
Posted on 2002-01-20 17:28:08 by stryker
Here's a code snippet to do it. In the old days, for certain values,
a software approach could be faster than using div... probably there
exists methods that are a lot better than my little five-minute hack ;)



; divide EAX by EBX. Result in EAX, modulus in EDX.
xor edx, edx
dec edx
@@divloop:
inc edx
sub eax, ebx
jns @@divloop
add eax, ebx
xchg eax, edx ; yeah yeah this is slow
Posted on 2002-01-20 17:32:30 by f0dder
As for floating-point, things get pretty messy. I think NASM has a
complete set of IEEE (or whatever) compliant software floating-point
routines... check it out.
Posted on 2002-01-20 17:33:11 by f0dder
This is the file that'll solve the problem you can control how many digits for the floating point...please read the disclaimer at the end of the text file. I'm still working on a new one that will output correctly... So basically DIV is just SUB and MUL is just ADD. Hey you can still use fdiv...fdivp :)

This one has still some bugs.
Posted on 2002-01-20 17:38:01 by stryker
This is an efficient method when in optimized asm:
static int try_sub(int * a, int * b) {

char ok;

__asm__ __volatile__("movl (%1),%%eax ; subl %%eax,(%2)\n\t"
"movl 4(%1),%%eax ; sbbl %%eax,4(%2)\n\t"
"movl 8(%1),%%eax ; sbbl %%eax,8(%2)\n\t"
"movl 12(%1),%%eax ; sbbl %%eax,12(%2)\n\t"
"setae %%al":"=a" (ok):"c" (a),"d" (b));
return ok;
}

static void div64(int * a, int * b, int * c) {
int tmp[4];
int i;
unsigned int mask = 0;

c += 4;
for (i = 0 ; i<64 ; i++) {
if (!(mask >>= 1)) {
c--;
mask = 0x80000000UL;
}
tmp[0] = a[0]; tmp[1] = a[1];
tmp[2] = a[2]; tmp[3] = a[3];
if (try_sub(b,tmp)) {
*c |= mask;
a[0] = tmp[0]; a[1] = tmp[1];
a[2] = tmp[2]; a[3] = tmp[3];
}
shift_right(b);
}
}
...and here is a detailed example:
; div64: Divide two int64_ts, returning either the quotient or the remainder


; This is the only complicated routine. The basic strategy is to do long
; division. I take the most significant bit of the dividend (the "partial
; dividend") and compare it to the divisor. If it's greater, the most
; significant bit in the quotient is 1 and the divisor is subtracted from the
; partial dividend. If not, the bit is 0. The partial dividend is shifted left
; 1 bit, joined by the next most significant bit from the dividend. The process
; is repeated until I run out of dividend bits. Whatever's left is the remainder.

; This process is optimized a bit in the following ways. I can't possibly get
; any 1 bits in the quotient until I've shifted in enough of the dividend so
; that it most significant bit is in the same bit position as the most
; significant bit of the divisor. The number of bits this will take is easily
; determined using the bsr opcode. While I'm doing this, it's easy to check if
; the divisor is zero (I return 0) or 1 (I return the dividend as the quotient
; or 0 as the remainder) or if the dividend is 0 (I return 0) or if the divisor
; is bigger than the dividend (I return 0 for the quotient or the dividend as
; the remainder). Once I've shifted the dividend the number of bits I
; determined, I can proceed to loop through the rest.

; This algorithm loops a maximum of 63 times. This is expensive if the dividend is
; less than 64 times the divisor, but quite efficient otherwise. Since the point
; of this is to deal with extremely large numbers, it should be worth it.

; This routine returns the quotient or remainder, depending on the value of the
; first parameter when it's called. It assumes that it was called like this:
;
; push WANT_QUOTIENT or WANT_REMAINDER
; call div64
;
; Therefore, the dividend and divisor and the place to return the result must
; already be on the stack (from the call by the C program.) This means that the
; original caller's return address and the immediate caller's ebp are still on
; the stack and should be ignored.

div64 proc syscall private result:dword, ignore:qword, outint:pint64, in1:int64_t, in2:int64_t

; MASM 6.0 adds the following due to the PROC statement
; push ebp
; mov ebp, esp

; To divide two 64-bit numbers, any negative operands must be made
; postive first. If an odd number are made postive, the result must
; be made negative
lea eax, in1 ; get address of first int64_t
push eax
call pre
add esp, 4
push esi

; Initialize the quotient to 0 and the remainder to 0
mov quot.lowint, 0
mov quot.highint, 0


; Determine how many bits to shift initially
bsr eax, in2.highint ; find the most sig 1 bit in divisor
jz checklow ; if none, check the low word
add eax, 32 ; if found, it's in the high word
jmp dividend ; continue
checklow:
bsr eax, in2.lowint ; find the most sig 1 bit in divisor
jz div_done ; if none then we're dividing by 0
cmp eax, 0 ; is the divisor 1?
jne dividend ; if not, continue
mov ebx, in1.highint ; if yes, we're dividing by 1 so
mov quot.highint, ebx ; ... the quotient ...
mov ebx, in1.lowint ; ....... is ...
mov quot.lowint, ebx ; ........ the dividend
jmp div_done
dividend:
bsr ebx, in1.highint ; find the most sig 1 bit in dividend
jz checklow2 ; if none, check the low word
add ebx, 32 ; if found, it's in the high word
jmp divide ; continue
checklow2:
bsr ebx, in1.lowint ; find the most sig 1 bit in dividend
jz div_done ; if none then we're dividing 0

; We know know where the first 1 bit is in the divisor (eax) and the dividend
; (ebx). eax is at least 1 (if it's 0 then the divisor is 1). If eax is greater
; than ebx then the divisor is greater than the dividend. If they're equal then
; the most significant bits are already lined up. Otherwise, the number of
; bits we can shift initially is 64 - ebx (to get to the first 1 bit in the
; dividend) + eax (to get that bit over to where the first 1 bit is in the
; divisor). The first bit of the quotient will be ebx - eax. This is also the
; number of times to go thru the subtraction loop - 1.
divide:
sub ebx, eax ; is the divisor greater than the dividend?
; also the first quotient bit
mov eax, in1.lowint ; [ get dividend ...
mov edx, in1.highint ; ... in eax, edx]
jae shift ; if not, continue
mov esi, eax ; if yes, the remainder ...
mov edi, edx ; ... is the dividend
jmp div_done

; We now need to shift the 64-bit dividend into a 64-bit partial dividend. Use
; eax and edx for the dividend and esi and edi for the partial dividend.
; Calculate the number of bits to shift in ecx. Decrement since one bit will be
; shifted at the beginning of the suntract loop. I may need to shift more than
; 31 bits, which shld won't do. So if ecx > 31, subtract 32 and manually shift
; 32 bits. At the end, esi and edi will have the remainder.
shift:
mov esi, 0 ; to hold the current ...
mov edi, 0 ; ... partial dividend
mov ecx, 63 ; get the number of bits ...
sub ecx, ebx ; ... to shift
cmp ecx, 31 ; too many for shld?
jbe shift2 ; if not, continue
sub ecx, 32 ; 32 bits ...
mov esi, edx ; ... will be ...
mov edx, eax ; ...... shifted
mov eax, 0 ; ...........manually
shift2:
shld edi, esi, cl ; shift ...
shld esi, edx, cl ; ...... the ...
shld edx, eax, cl ; .......... dividend
shl eax, cl

; Now loop ebx times,
subtract:
mov ecx, ebx ; get number of times to loop
inc ecx ; turn a bit number into a number of bits
shift_bit:
shld edi, esi, 1 ; shift ...
shld esi, edx, 1 ; ...... the ...
shld edx, eax, 1 ; .......... dividend
shl eax, 1
cmp edi, in2.highint ; is the partial dividend >= divisor?
ja subtract2 ; if yes, continue
jb next_bit ; if not, quotient bit is 0
cmp esi, in2.lowint ; is the partial dividend >= divisor?
jb next_bit ; if not, quotient bit is 0
subtract2:
sub edi, in2.highint ; if yes, ...
sbb esi, in2.lowint ; ... subtract the divisor
push ecx ; remember the count
dec ecx ; back to bit position
cmp ecx, 31 ; bit position greater than 31?
jbe quotient ; if no, continue
sub ecx, 32 ; can only shift up to 31 bits
quotient:
mov ebx, 1 ; bit position 0
shl ebx, cl ; rotate the bit into the right
; position for the quotient
pop ecx ; get count
cmp ecx, 32 ; bit position greater than 31?
ja quotient_high ; if yes, use high word
or quot.lowint, ebx ; add bit to quotient low word
jmp next_bit ; continue
quotient_high:
or quot.highint, ebx ; add bit to quotient high word
next_bit:
loop shift_bit

; VisualAge C++ wants returned structures to be placed in memory pointed
; to by the implicit first parameter
div_done:
cmp result, WANT_REMAINDER ; caller wants remainder?
je put_result ; if yes, continue
mov esi, quot.lowint ; get ...
mov edi, quot.highint ; ... quotient
jmp put_result
put_result:
mov ebx, outint
mov (int64_t ptr [ebx]).lowint, esi
mov (int64_t ptr [ebx]).highint, edi

; I have to change the sign of the quotient if only one of the operands was
; negative. I have to change the sign of the remainder if the dividend was
; negative.
pop esi ; get the saved number of neg inputs
push ebx ; parameter for post
cmp result, WANT_REMAINDER ; caller wants remainder?
je remainder ; if yes, check dividend's sign
test esi, 1 ; need to change quotient's sign?
jz div_ret ; if not, all done
call post ; change the sign
jmp div_ret ; all done
remainder:
test esi, 02h ; dividend negative?
jz div_ret ; if not, all done
or esi, 1 ; indicate to change the sign
call post ; change the sign
div_ret:
add esp, 2 ; correct for post parameter
ret

; MASM 6.0 replaces RET statement with the following
; leave
; ret 00000h
div64 endp
Credit goes to:
Bob Pesner

PC Dialogs Inc.
305 w. 98 St., #5BN
New York, NY 10025
212-663-3459
[email]bpesner@panix.com[/email]
( Full Post )
Division is slow - has always been slow. Multiply when you can!
Posted on 2002-01-20 18:28:08 by bitRAKE