i have some code that tests for prime. its extreemly faster then *simple* modulo comparing, but extremmly slower then a seive. i came up with it my self actually. took 10.4 hours to do 256^4 primes on my athlon xp 1600. ive yet to test it again on my new athlon 3000+. im ready for the next step. 64bits. only compare and division operations are done. and a square root. it will divide by every number under its square root. and compare the remainder to 0, if its 0 it isnt prime.

heres the currrent code. its inline assembler in a c++ prog.

what i really need it a mmx square root algo. doesnt have to be in mmx code, but someting that i can put into a mmx reg.i also need to know how to compare.

heres the currrent code. its inline assembler in a c++ prog.

what i really need it a mmx square root algo. doesnt have to be in mmx code, but someting that i can put into a mmx reg.i also need to know how to compare.

```
```

__asm {

mov ecx,5//ecx is our number to be tested

xor edi,edi

add edi,2

xor esi,esi//esi is our memory address

mov esi,eax

looper:

cmp ecx,1// had to add other wize it would wrap over from -1

je _endaa

xor ebx,ebx

push ecx

call fred_sqrt

mov ebp,eax//ebp is our sqroot

pop ecx

//prefetchtnta dword ptr [esi+ebx*4]

_mnl:

cmp dword ptr [esi+ebx*4],ebp

jnle nlcpr

xor edx,edx

mov eax,ecx

idiv dword ptr [esi+ebx*4]

or edx,edx//edx is our remainder/eax temp

je _nevn

inc ebx// index

jmp _mnl

nlcpr:

mov dword ptr [esi+edi*4],ecx

inc edi//our index to our prime

_nevn:

inc ecx

inc ecx

jmp looper

}

_endaa:

__asm mov prma_a,edi

a=0;

aFile = fopen ("primes.txt","at");

The x86-32 divide instruction is not sufficient for 64 bit numbers due to overflow - have to use a couple of them:

If you just want to know if a 64 bit number is prime then it might be better to use:

http://mathworld.wolfram.com/StrongPseudoprime.html

Have you read the Prime Pages? Despite my lacking knowledge of mathematics I was able to understand the explanations given. http://www.utm.edu/research/primes/prove/index.html)

Could adapt

Dr.Math suggests using Lucas tests to find if a number is prime - I haven't tried it, but Dr.Math gives a very easy explanation.

```
xor edx, edx
```

mov eax, BIGINT + 4

div Small_Prime

mov eax, BIGINT

div Small_Prime

; edx is remainder

This can be put into a loop to divide any large integer by a 32 bit prime. This thead might provide some enlightenment on the square root of large numbers.
If you just want to know if a 64 bit number is prime then it might be better to use:

http://mathworld.wolfram.com/StrongPseudoprime.html

Have you read the Prime Pages? Despite my lacking knowledge of mathematics I was able to understand the explanations given. http://www.utm.edu/research/primes/prove/index.html)

Could adapt

**Hagen**'s SPRP test to 64 bits.Dr.Math suggests using Lucas tests to find if a number is prime - I haven't tried it, but Dr.Math gives a very easy explanation.

The hints of bitRAKE are the essential way to go.

For 64 Bit numbers it's sufficient to test the number with SPP tests to all small primes 2,3,5,7,11,13,17,19,23.

If the number pass all these SPP tests then the number is prime.

This way is far faster and have far smaller complexity as your trial division approach. But it is recommended to test the number as first step to some small primes with trial division. I personaly use the first 32 primes for this trial division. This tests split out about 80% of all numbers. To make this test fast we don't need to do divisions. Instead we can use only multiplications if we use a "division" trough multiplication to SmallPrime^-1 mod 2^32.

In SSE2 there exists no fast way to do a universal division because SSE2 have no division instruction. Thus using SSE2 need first to translate the SPP test with division into one without Divisions. That's possible if the modular exponentation in the SPP tests use the Montgomery Trick. The Montgomery trick exchange on modular division with 2.5 Multiplication and 1.5 additions. If this is ready then we can go to use SSE2 to make the native algo. faster.

Currently a 32 Bit IsPrime() function using all above tricks implemented in IA32 assembler need with random numbers in range upto 2^31 - 2^32 about 800 clock cycles.

Using a sieve to get sequential all 205 Million primes upto 2^32 need only 13 seconds, about 0.1 clock cycles per number.

Hagen

For 64 Bit numbers it's sufficient to test the number with SPP tests to all small primes 2,3,5,7,11,13,17,19,23.

If the number pass all these SPP tests then the number is prime.

This way is far faster and have far smaller complexity as your trial division approach. But it is recommended to test the number as first step to some small primes with trial division. I personaly use the first 32 primes for this trial division. This tests split out about 80% of all numbers. To make this test fast we don't need to do divisions. Instead we can use only multiplications if we use a "division" trough multiplication to SmallPrime^-1 mod 2^32.

In SSE2 there exists no fast way to do a universal division because SSE2 have no division instruction. Thus using SSE2 need first to translate the SPP test with division into one without Divisions. That's possible if the modular exponentation in the SPP tests use the Montgomery Trick. The Montgomery trick exchange on modular division with 2.5 Multiplication and 1.5 additions. If this is ready then we can go to use SSE2 to make the native algo. faster.

Currently a 32 Bit IsPrime() function using all above tricks implemented in IA32 assembler need with random numbers in range upto 2^31 - 2^32 about 800 clock cycles.

Using a sieve to get sequential all 205 Million primes upto 2^32 need only 13 seconds, about 0.1 clock cycles per number.

Hagen

if you could help me convert IsPrime() into 64 bits that would be great, since i made a complete list of all primes 2 to 2^32