I am constantly searching for efficient arbitrary precision math routines written in x86 assembler.
I work on the 3X+1 number theory problem a lot, so bignum adds and shifts are very
important to me. I'd sort of like to use asm code to enhance some C code I already have.
The C code wastes a trivia amount of CPU time...except for the parts I want to replace with
ASM code.

Best regards,
Mark Gibson
Posted on 2004-08-01 14:40:46 by DOKDOK
One of the things I'm thinking about, in addition to using ASM code is to put together custom hardware in ASIC.
In my experience, the only thing faster than assembler is doing it in hardware tailored to the task.

Any ideas?

Best regards,
Mark Gibson
Posted on 2004-08-01 14:44:58 by DOKDOK
Check out the GMP source code for x86 version covering several processors. They are fairly easy to code as well. The following routine calculates the longest sequence for starting numbers under one million.
	mov edx, 1000000

xor ebp, ebp ; max chain
xor ebx, ebx

next: dec edx
je done
cmp ebp, ebx
mov eax, edx
cmovc ebp, ebx ; save chain length
cmovc edi, edx ; save number
xor ebx, ebx
jmp _even
_odd:
lea eax, [eax][eax*2][2]
inc ebx
_even:
inc ebx
shr eax, 1
je next
jc _odd
jmp _even
done:
inc edi ; answer
The multiply by 3 and add can be done at <4 cycles per dword in the cache. The divide by 2 and test for end can be done with MMX at <2 cycles per dword.
Posted on 2004-08-01 15:14:29 by bitRAKE