Everyone knows how expensive division instructions can be, especially on the x86. I found a neat article (though, fairly old) that has some good information on using alternate methods. These alternate methods are nothing new, and are not alien to me... but the biggest thing about that link, is that it provides some 32-bit constants for use in alternate methods... a handy reference.

Enjoy ;)
Posted on 2006-11-24 11:11:47 by SpooK
here is info from FASM board on this subject:

http://board.flatassembler.net/topic.php?t=5881
Posted on 2006-11-24 11:16:21 by vid
Agner Fog has explained the multiplication by reciprocal in his "Optimizing subroutines in Assembly language" manual rather completely at the "Integer division by a constant (all processors)" section. By far it's the fastest way I have found to for example divide a number by 10.
Posted on 2006-11-24 11:50:45 by XCHG
"Magic Divider" by The Svin does the trick for every division by constant.
Posted on 2006-11-24 12:29:29 by Ultrano
Thanks for the info :)

I've skimmed through Agner's manuals to grasp general ideas before, but there is so much information that it is hard to remember what all is in there :P
Posted on 2006-11-24 13:29:06 by SpooK
To the Ineffable All,

I call the above methods, "half division" algos.  That's because they don't give you a remainder.  Of course you can get the remainder by subtracting from the dividend, the product of the quotient and divisor. But do you save any time doing that?  :shock: Ratch
Posted on 2006-11-24 14:06:33 by Ratch
According to the link "Chapter 10: Integer Division by Constants" within the page I linked above, there are statistics for a particular example, obviously in favor of the method described. This method includes the remainder, such as you described.

I guess the best way to figure that out exactly, is to test and compare the said method to a standard DIV instruction ourselves :)
Posted on 2006-11-24 14:26:04 by SpooK
I always wondered: If such 'smart & quick' division methods exist, then why Intel/AMD won't make their 'div' instructions use these methods? If I can 'div' in 'X' cycles, then why not make a 'div' instruction which divides in 'X' cycles using this algorithm? I wouldn't be much more expensive, would it..?
Posted on 2006-11-24 19:01:57 by ti_mo_n

I always wondered: If such 'smart & quick' division methods exist, then why Intel/AMD won't make their 'div' instructions use these methods? If I can 'div' in 'X' cycles, then why not make a 'div' instruction which divides in 'X' cycles using this algorithm? I wouldn't be much more expensive, would it..?

You must realize that, in order to get the "magic number" required to replace the division by a multiplication, you must FIRST perform a division. THEN, the time required to perform that division PLUS a multiplication would take more time than the single division.

The only time that the multiplication by the reciprocal is time saving is when you either set a constant's reciprocal in the .data section OR you can use the reciprocal of a variable more than once. That knowledge CANNOT be built-in on a chip.

Another factor which must be considered is that some of the "magic numbers" suggested have been expanded to improve overall accuracy when used with large numbers. Those require the subsequent use of shifts to provide the correct answer and everybody knows that such instructions can be slow on some processors.

Raymond
Posted on 2006-11-24 20:10:02 by Raymond
Don't tell me that the magic numbers are the only fast-division method? I'm sure that smart mathematicians could develop something faster if they really wanted :P Or they could at least make some frequently used divisions faster..? What I mean is: Is it really so hard to make fast (and still cheap) division?
Posted on 2006-11-24 21:05:06 by ti_mo_n
ti_mo_n,

What I mean is: Is it really so hard to make fast (and still cheap) division?

As Raymond said, if you can pre-compute the inverses beforehand with a compiler or manual program, then no.  Otherwise it takes a intimate knowledge of number theory and hardware engineering to make the chip to do it. Ratch
Posted on 2006-11-24 22:09:10 by Ratch
Mathematically there is a quick division replacement
but computationally it is much slower because of the overhead
(getting the logs and the inverse log).

The exponential (inverse natural logarithm)
of the difference of the natural logarithms of the dividend and divisor.

20 = 100 / 5
20 = exp(ln(100) - ln(5))
Posted on 2006-11-26 17:55:24 by dsouza123

"Magic Divider" by The Svin does the trick for every division by constant.

That is a great little utility that I make constant use of, I have included it in this post for those who do not have it. I hope The Svin doesn't mind...

Posted on 2006-11-26 21:45:52 by donkey
do you all live in the dark ages?, chip is already builtin your x86 to perform RCP lookups for you, I timed a RCPPS followed by a MULPS 5.5 cycles and it equalivent to 4 lowprecisionfdivs, higherprecision you use RCP with newton-rapson
integer divs? there are floattoint conversion that converts two reciprocals for you at a time
Posted on 2006-11-27 03:09:44 by daydreamer
and please remind me of the precision of the methods you mentioned :) .
Posted on 2006-11-27 05:24:07 by Ultrano