How does the div opcode work

I decided one day I'd try to "fake" division -- but I'm still in the dark (I was trying to do it for floating point, but I'm just curious about integer divion right now)...

I know you can divide by bit shifting... Is there a more complete ways?

I guess I'm asking because "div" on a .486 takes about 40 clocks to complete so how does it use them?

Sliver
Posted on 2002-01-20 15:25:48 by Sliver
mov eax, 12
mov ecx, 3
div ecx

eax is the quotient
ecx is the divisor

the answer will be in eax, if there is a remainder it will be in edx
Posted on 2002-01-20 15:50:45 by stryker
ok you failed to read the original post

this isn't a "help me I don't no how to divide" :)

I want to know HOW DIV works!

ie.

how does it divivde 123 / 45, etc

Sliver
Posted on 2002-01-20 16:03:19 by Sliver
If I remember well, it is explained in the Intel docs... (with a shema... you know... where they always use those "<-" characters everywhere ^^)

Maybe the best explaination is in a math book, though... :rolleyes:
Posted on 2002-01-20 16:19:00 by JCP
Oops!!! he! he! he! :) I don't know but my guess is it does more than using DIV, probably it does an ADD and SUB instruction...using MUL is adding itself how many times(in short MUL is just plain ADD).

E.G.

(1 x 3 = 3) == (1 + 1 + 1 == 3)

So DIV might be a combination of that.
Posted on 2002-01-20 16:23:09 by stryker
Oh I know...Just a guess

12/4 == 3 remainder 0

is the same as 12 - 3 - 3 - 3 - 3, maybe it counts how many times it subtracted by 3 then if it is less than 3 then the remaining value is the remainder.

15/2 == 7.5(using a calculator)

15 - 2 = 13
13 - 2 = 11
11 - 2 = 9
9 - 2 = 7
7 - 2 = 5
5 - 2 = 3
3 - 2 = 1

since 1 < 2. Counting how many times we subtracted:: 7 times
remainder 1... there you go...

eax == 7
edx == 1

which is correct. when using it this way:

mov eax, 15
mov ecx, 2
div ecx

of course, it will not output 7.5 because DIV isn't a reliable instruction :)

to correct that, save eax(7) then mov eax, edx

StartOfLoop:

eax now contains 1
multiply by 10 == 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
which makes eax == 10

10 - 2 = 8
5 - 2 = 6
6 - 2 = 4
4 - 2 = 2
2 - 2 = 0

counting how many times we subtracted:: 5 times
check if it last values is == 0. if it's not save the value of eax then do the loop again until the last value is 0.

pop those values.

concatenating both makes it 7.5

http://www.asmcommunity.net/board/index.php?topic=3039
Posted on 2002-01-20 16:32:06 by stryker
umberg6007 that's very interesting... now I just have t figure out how they do it for floating point

Readiosys do you know where in the docs it can be found? It'd help me out alot in my learning this

Sliver

any otherr ideas on integer or floating point division would be appreciated
Posted on 2002-01-20 17:21:26 by Sliver
Oops!!! i forgot to answer 123/45

using this code:
mov eax, 123
mov ecx, 45
div ecx

I'm sure eax == 2 and edx == 33

123 - 45 == 78 ... count == 1

is the last value < ecx(45) if it is, count will be our quotient(eax) and the last remaining value will be 78(edx). if its not. continue...

78 - 45 == 33 ... count == 2

is the last value < ecx(45) if it is, count will be our quotient(eax) and the last remaining value will be 33(edx). if its not. continue...

since 33 < 45. eax == 2, edx == 33, which is correct in the sense of using the code above. But its isn't the correct answer, which is
2.7333333333333333333...

...same step as above but treat this part as your remainder

to correct that save the value of eax which is 2
mov eax, edx ;eax == 33
multiply eax by 10 == 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33 + 33
which makes eax == 330

330 - 45 == 285
...

In the long run you'll eventually get 7 and a non terminating 3.

:)
Posted on 2002-01-20 17:28:08 by stryker
Here's a code snippet to do it. In the old days, for certain values,
a software approach could be faster than using div... probably there
exists methods that are a lot better than my little five-minute hack ;)

``````
; divide EAX by EBX. Result in EAX, modulus in EDX.
xor  edx, edx
dec  edx
@@divloop:
inc  edx
sub  eax, ebx
jns  @@divloop
xchg  eax, edx ; yeah yeah this is slow
``````
Posted on 2002-01-20 17:32:30 by f0dder
As for floating-point, things get pretty messy. I think NASM has a
complete set of IEEE (or whatever) compliant software floating-point
routines... check it out.
Posted on 2002-01-20 17:33:11 by f0dder
This is the file that'll solve the problem you can control how many digits for the floating point...please read the disclaimer at the end of the text file. I'm still working on a new one that will output correctly... So basically DIV is just SUB and MUL is just ADD. Hey you can still use fdiv...fdivp :)

This one has still some bugs.
Posted on 2002-01-20 17:38:01 by stryker
This is an efficient method when in optimized asm:
``````static int try_sub(int * a, int * b) {
char ok;

__asm__ __volatile__("movl (%1),%%eax ; subl %%eax,(%2)\n\t"
"movl 4(%1),%%eax ; sbbl %%eax,4(%2)\n\t"
"movl 8(%1),%%eax ; sbbl %%eax,8(%2)\n\t"
"movl 12(%1),%%eax ; sbbl %%eax,12(%2)\n\t"
"setae %%al":"=a" (ok):"c" (a),"d" (b));
return ok;
}

static void div64(int * a, int * b, int * c) {
int tmp[4];
int i;

c += 4;
for (i = 0 ; i<64 ; i++) {
c--;
}
tmp[0] = a[0]; tmp[1] = a[1];
tmp[2] = a[2]; tmp[3] = a[3];
if (try_sub(b,tmp)) {
a[0] = tmp[0]; a[1] = tmp[1];
a[2] = tmp[2]; a[3] = tmp[3];
}
shift_right(b);
}
}``````
...and here is a detailed example:
``````; div64: Divide two int64_ts, returning either the quotient or the remainder

; This is the only complicated routine. The basic strategy is to do long
;  division. I take the most significant bit of the dividend (the "partial
;  dividend") and compare it to the divisor. If it's greater, the most
;  significant bit in the quotient is 1 and the divisor is subtracted from the
;  partial dividend. If not, the bit is 0. The partial dividend is shifted left
;  1 bit, joined by the next most significant bit from the dividend. The process
;  is repeated until I run out of dividend bits. Whatever's left is the remainder.

; This process is optimized a bit in the following ways. I can't possibly get
;  any 1 bits in the quotient until I've shifted in enough of the dividend so
;  that it most significant bit is in the same bit position as the most
;  significant bit of the divisor. The number of bits this will take is easily
;  determined using the bsr opcode. While I'm doing this, it's easy to check if
;  the divisor is zero (I return 0) or 1 (I return the dividend as the quotient
;  or 0 as the remainder) or if the dividend is 0 (I return 0) or if the divisor
;  is bigger than the dividend (I return 0 for the quotient or the dividend as
;  the remainder). Once I've shifted the dividend the number of bits I
;  determined, I can proceed to loop through the rest.

; This algorithm loops a maximum of 63 times. This is expensive if the dividend is
;  less than 64 times the divisor, but quite efficient otherwise. Since the point
;  of this is to deal with extremely large numbers, it should be worth it.

; This routine returns the quotient or remainder, depending on the value of the
;  first parameter when it's called. It assumes that it was called like this:
;
;                         push  WANT_QUOTIENT or WANT_REMAINDER
;                         call  div64
;
;  Therefore, the dividend and divisor and the place to return the result must
;  already be on the stack (from the call by the C program.) This means that the
;  original caller's return address and the immediate caller's ebp are still on
;  the stack and should be ignored.

div64   proc    syscall private result:dword, ignore:qword, outint:pint64, in1:int64_t, in2:int64_t

; MASM 6.0 adds the following due to the PROC statement
;       push    ebp
;       mov     ebp, esp

; To divide two 64-bit numbers, any negative operands must be made
;  postive first. If an odd number are made postive, the result must
lea     eax, in1                    ; get address of first int64_t
push    eax
call    pre
push    esi

; Initialize the quotient to 0 and the remainder to 0
mov     quot.lowint, 0
mov     quot.highint, 0

; Determine how many bits to shift initially
bsr     eax, in2.highint            ; find the most sig 1 bit in divisor
jz      checklow                    ; if none, check the low word
add     eax, 32                     ; if found, it's in the high word
jmp     dividend                    ; continue
checklow:
bsr     eax, in2.lowint             ; find the most sig 1 bit in divisor
jz      div_done                    ; if none then we're dividing by 0
cmp     eax, 0                      ; is the divisor 1?
jne     dividend                    ; if not, continue
mov     ebx, in1.highint            ; if yes, we're dividing by 1 so
mov     quot.highint, ebx           ; ... the quotient ...
mov     ebx, in1.lowint             ; ....... is ...
mov     quot.lowint, ebx            ; ........ the dividend
jmp     div_done
dividend:
bsr     ebx, in1.highint            ; find the most sig 1 bit in dividend
jz      checklow2                   ; if none, check the low word
add     ebx, 32                     ; if found, it's in the high word
jmp     divide                      ; continue
checklow2:
bsr     ebx, in1.lowint             ; find the most sig 1 bit in dividend
jz      div_done                    ; if none then we're dividing 0

; We know know where the first 1 bit is in the divisor (eax) and the dividend
;  (ebx). eax is at least 1 (if it's 0 then the divisor is 1). If eax is greater
;  than ebx then the divisor is greater than the dividend. If they're equal then
;  the most significant bits are already lined up. Otherwise, the number of
;  bits we can shift initially is 64 - ebx (to get to the first 1 bit in the
;  dividend) + eax (to get that bit over to where the first 1 bit is in the
;  divisor). The first bit of the quotient will be ebx - eax. This is also the
;  number of times to go thru the subtraction loop - 1.
divide:
sub     ebx, eax                    ; is the divisor greater than the dividend?
;  also the first quotient bit
mov     eax, in1.lowint             ; [ get dividend ...
mov     edx, in1.highint            ; ... in eax, edx]
jae     shift                       ; if not, continue
mov     esi, eax                    ; if yes, the remainder ...
mov     edi, edx                    ; ... is the dividend
jmp     div_done

; We now need to shift the 64-bit dividend into a 64-bit partial dividend.  Use
;  eax and edx for the dividend and esi and edi for the partial dividend.
;  Calculate the number of bits to shift in ecx. Decrement since one bit will be
;  shifted at the beginning of the suntract loop. I may need to shift more than
;  31 bits, which shld won't do. So if ecx > 31, subtract 32 and manually shift
;  32 bits. At the end, esi and edi will have the remainder.
shift:
mov     esi, 0                      ; to hold the current ...
mov     edi, 0                      ; ... partial dividend
mov     ecx, 63                     ; get the number of bits ...
sub     ecx, ebx                    ; ... to shift
cmp     ecx, 31                     ; too many for shld?
jbe     shift2                      ; if not, continue
sub     ecx, 32                     ; 32 bits ...
mov     esi, edx                    ; ... will be ...
mov     edx, eax                    ; ...... shifted
mov     eax, 0                      ; ...........manually
shift2:
shld    edi, esi, cl                ; shift ...
shld    esi, edx, cl                ; ...... the ...
shld    edx, eax, cl                ; .......... dividend
shl     eax, cl

; Now loop ebx times,
subtract:
mov     ecx, ebx                    ; get number of times to loop
inc     ecx                         ; turn a bit number into a number of bits
shift_bit:
shld    edi, esi, 1                 ; shift ...
shld    esi, edx, 1                 ; ...... the ...
shld    edx, eax, 1                 ; .......... dividend
shl     eax, 1
cmp     edi, in2.highint            ; is the partial dividend >= divisor?
ja      subtract2                   ; if yes, continue
jb      next_bit                    ; if not, quotient bit is 0
cmp     esi, in2.lowint             ; is the partial dividend >= divisor?
jb      next_bit                    ; if not, quotient bit is 0
subtract2:
sub     edi, in2.highint            ; if yes, ...
sbb     esi, in2.lowint             ; ... subtract the divisor
push    ecx                         ; remember the count
dec     ecx                         ; back to bit position
cmp     ecx, 31                     ; bit position greater than 31?
jbe     quotient                    ; if no, continue
sub     ecx, 32                     ; can only shift up to 31 bits
quotient:
mov     ebx, 1                      ; bit position 0
shl     ebx, cl                     ; rotate the bit into the right
;  position for the quotient
pop     ecx                         ; get count
cmp     ecx, 32                     ; bit position greater than 31?
ja      quotient_high               ; if yes, use high word
or      quot.lowint, ebx            ; add bit to quotient low word
jmp     next_bit                    ; continue
quotient_high:
or      quot.highint, ebx           ; add bit to quotient high word
next_bit:
loop    shift_bit

; VisualAge C++ wants returned structures to be placed in memory pointed
;  to by the implicit first parameter
div_done:
cmp     result, WANT_REMAINDER      ; caller wants remainder?
je      put_result                  ; if yes, continue
mov     esi, quot.lowint            ; get ...
mov     edi, quot.highint           ; ... quotient
jmp     put_result
put_result:
mov     ebx, outint
mov     (int64_t ptr [ebx]).lowint, esi
mov     (int64_t ptr [ebx]).highint, edi

; I have to change the sign of the quotient if only one of the operands was
;  negative. I have to change the sign of the remainder if the dividend was
;  negative.
pop     esi                         ; get the saved number of neg inputs
push    ebx                         ; parameter for post
cmp     result, WANT_REMAINDER      ; caller wants remainder?
je      remainder                   ; if yes, check dividend's sign
test    esi, 1                      ; need to change quotient's sign?
jz      div_ret                     ; if not, all done
call    post                        ; change the sign
jmp     div_ret                     ; all done
remainder:
test    esi, 02h                    ; dividend negative?
jz      div_ret                     ; if not, all done
or      esi, 1                      ; indicate to change the sign
call    post                        ; change the sign
div_ret:
add     esp, 2                      ; correct for post parameter
ret

; MASM 6.0 replaces RET statement with the following
;       leave
;       ret    00000h
div64   endp``````
Credit goes to:
``````Bob Pesner
PC Dialogs Inc.
305 w. 98 St., #5BN
New York, NY 10025
212-663-3459
[email]bpesner@panix.com[/email]``````
( Full Post )
Division is slow - has always been slow. Multiply when you can!
Posted on 2002-01-20 18:28:08 by bitRAKE