I'm writing a restoring division algorithm (for S&G mostly) and have run into a problem. First, here's my code:

My problem is that the 15th line (js DIV1_STEP2B) always jumps to step 2B - maybe I'm using js wrong... I don't know. As a result, the Dividend is always returned as the answer. Obviously I'm don't something wrong, but I can't figure out what it is. Any help would be appreciated.

Spara

```
```

DIV1 proc Dividend:DWORD, Divisor:DWORD

; Remainder register - ebx:eax

; Divisor register - edx

mov eax, Dividend ; Remainder (right)

xor ebx, ebx ; Remainder (left)

mov ecx, 32 ; Counter

mov edx, Divisor ; Divisor

shld ebx, eax, 1 ; Shift the Remainder register left 1 bit

DIV1_STEP1:

sub ebx, edx ; Subtract the Divisor register from the left half of the Remainder

js DIV1_STEP2B ; Test Remainder

; If the Remainder is positive (the subtraction was a success)

shld ebx, eax, 1 ; Shift the Remainder register left 1 bit

or al, 01h ; Set the rightmost bit to 1

jmp DIV1_STEP3

DIV1_STEP2B:

; If the Remainder is negative (the subtraction failed)

add ebx, edx ; Restore the original value to the left half of the Remainder

shld ebx, eax, 1 ; Shift the Remainder register left 1 bit

DIV1_STEP3:

loop DIV1_STEP1 ; Repeat 32 times

shr ebx, 1 ; Shift the right half of the Remainder right 1 bit

ret

DIV1 endp

My problem is that the 15th line (js DIV1_STEP2B) always jumps to step 2B - maybe I'm using js wrong... I don't know. As a result, the Dividend is always returned as the answer. Obviously I'm don't something wrong, but I can't figure out what it is. Any help would be appreciated.

Spara

Some important things to note:

- I don't know what this code is suppose to do.

- SHLD only shifts the first (destination) operand.

- if the algo operates on unsigned data use JC, for signed data use JL.

I believe you want to do this:

shld ebx, eax, 1

shl eax, 1

...rather than just the first instruction, but it is more efficient to use:

add eax, eax

adc ebx, ebx

I thought SHLD worked that way when I first tried to use it, now I don't use it at all.

Okay, I see - you are trying to divide by scaled sutraction.

Sorry, it is late and I was pulling a blank.

- I don't know what this code is suppose to do.

- SHLD only shifts the first (destination) operand.

- if the algo operates on unsigned data use JC, for signed data use JL.

I believe you want to do this:

shld ebx, eax, 1

shl eax, 1

...rather than just the first instruction, but it is more efficient to use:

add eax, eax

adc ebx, ebx

I thought SHLD worked that way when I first tried to use it, now I don't use it at all.

Okay, I see - you are trying to divide by scaled sutraction.

Sorry, it is late and I was pulling a blank.

Yep, do what he said. And you can remove a few instructions by doing this:

It doesn't do anything you can't do with the DIV instruction, but increasing the divisor size should be trivial.

```
DIV1 proc Dividend:DWORD, Divisor:DWORD
```

mov eax,Dividend

xor ebx,ebx

push 32

pop ecx

mov edx,Divisor

beginning:

add eax,eax

adc ebx,ebx

sub ebx,edx

jae one

add ebx,edx

one:

sbb eax,-1

loop beginning

ret

It doesn't do anything you can't do with the DIV instruction, but increasing the divisor size should be trivial.

Ok, I can see how

add eax, eax

adc ebx, ebx

works in place of

shld ebx, eax, 1

shl eax, 1

but is there a simmilar trick for shrd?

@Sephiroth3: Thanks for the hints... I probably would have done that eventualy, but right now I'm trying to follow an algorithm in a book as close as possible. I don't quite understand the "sbb eax, -1" statement, could you explain that?

Spara

EDIT: Replacing the SHLD made the function work perfectly. Thanks.

add eax, eax

adc ebx, ebx

works in place of

shld ebx, eax, 1

shl eax, 1

but is there a simmilar trick for shrd?

@Sephiroth3: Thanks for the hints... I probably would have done that eventualy, but right now I'm trying to follow an algorithm in a book as close as possible. I don't quite understand the "sbb eax, -1" statement, could you explain that?

Spara

EDIT: Replacing the SHLD made the function work perfectly. Thanks.

Ok, I can see how

add eax, eax

adc ebx, ebx

works in place of

shld ebx, eax, 1

shl eax, 1

but is there a simmilar trick for shrd?

rcr ebx, 1

...should do the trick. SHRD/SHLD is only really useful for multi-bit shifts where MMX can't be used. Actually, a branchless MMX/SSE version of this routine would be an interesting challenge (and maybe faster than other methods for numbers in the 64/128-bit range?).

I'm not so sure if parallelism is possible here, since you need the remainder of the high result to computer the low result...

Spara: The sbb eax,-1 will increase eax if the carry flag is clear.

Spara: The sbb eax,-1 will increase eax if the carry flag is clear.

I'm not so sure if parallelism is possible here, since you need the remainder of the high result to computer the low result...

Pardon my ignorance, but how could you write a branceless division algorithm? As far as I know, there are only a few division algorithms: Restoring/Nonrestoring, Booth's, and SRT. All of these use some sort of comparison and decision inside the loop. Can you explain how you would write a branchless version of any of the above? I'd be interested in writing it if I knew what you were talking about.

Spara

Spara

Well, I said it'd be a challenge. ;) MMX allows branchless choices using masks - I was thinking something along those lines. I have other work ATM or I'd code it up. So many ideas - so little time!