Okay, it's not always the case but it's quite high possibility.

I think this algorithm is much better than brute force since you can do some qualified guesses of digits. It's also good since you don't need any multiplications nor divisions if you work in base 2,4,8,etc. just shifts and adds. Don't care about the fact that the algorithm produces much text when written down.
Posted on 2002-05-17 05:21:10 by gliptic
hi

maybe eliminating some numbers will actually be slower than testing them, that' s the whole stuff...
and still we need a correct mathematical description of the algorithm. brad: giovanni said he had the algo, if he doesn' t have it, he should have said 'i have remarked something when multiplying some numbers, that might be used', but not claiming he can factor RSA-1024 with pens and papers in 6 months. there is no point in doing such claims. yet, if he now says he only saw some strange 'geometric patterns' or whatever you' ll call that, i' ll tell him, and tell you, i don' t see the point in this, since it can' t efficiently be written as an algorithm or whatever. mathematics are serious =)
Posted on 2002-05-17 12:05:22 by roy
The reason it can't be written down is because it's hard to see why this is working and why you should choose numbers which has the most digits right. Correct me if I'm wrong but isn't this the only algorithm that actually can predict the correct digits instead of testing them all. Doing a divide-with-primes-in-a-list with 576-bit numbers is somewhat slow so it is indeed faster to exclude as many as possible than to test them.

I've done a little research, and I came up with a procedure that can factor a 38-bit number (containing two prime factors of course) in 1.2 milliseconds on my 500 Mhz PIII. With a simple divide-with-all-odd-numbers method, it took 61 milliseconds. My procedure uses a fast way to divide when a quotient and remainder is known since the last division.

BTW. Here is a nice code to check if a number is divisable with 3 (untested):

;eax = number

@@:
shr eax,1
sbb eax,0
ja @B

;if eax == 0 then number % 3 == 0
Posted on 2002-05-21 00:37:40 by gliptic
Picking the one with the most correct digits does not predict the correct one. Your own example shows that.

There's a few optimizations that reduce the amount of multiprecision multiplication. But you're still using the trial-and-error strategy of finding the factors. Even Giovanni mentioned the need to backtrack (he called it "rolling back"). That is an admission that he can't predict the correct digits.
Posted on 2002-05-25 04:24:03 by tenkey
One of the unavoidable tasks is to write a program which does exactly what I'm doing by hand, and this project is an *attempt*. I am sure it has a lot of bugs and does not work yet as I wish, but I am on the road. I decided to this step, because I've seen a lot of programs by now, which are not satisfying at all. The main problem is the tendency to proceed with a deep analisis, which results always to extremely long calculations, or to grow on one factor, while the other stays small.

Programming this algo drives my friends mad :) We have had now five or six meetings, and with three of them I showed how this works. After hours of copy and paste from the calculator back and forwards to the documents, it seemed as if they got the idea, but they haven't. So I sat down and began this program, and I found, the BCDLib was a good tool, but only for showing numbers :) I stopped there and wrote a new BNLib, which is now to be tested.

The program is tremendously slow, because of the graphical "stack" and the unoptimized BCDLib, which is complicated to use due to some missing functions (which won't be added). The only advantage is to see what happens. It is truly buggy with little primes, but the 10147 sample works fine: 14 steps.

The next version will have implemented the code for the new BNLib, and further versions will contain the sliding bookmarks, the carry report and some pattern analisis, which are missing now. The actual version does *only* deep analisis, which is good for finding a good new ending.


Giovanni
Posted on 2002-05-26 13:54:34 by sch.jnn
You should compare the 14 step test with the result of the Quadratic Sieve or any of the other algorithms considered the fastest. Check out mathworld.com if you don't know what I mean.
Posted on 2002-05-27 00:28:00 by gliptic
For those still following along:
http://swox.com/gmp/manual/Exact-Division.html
Posted on 2002-06-01 14:08:19 by bitRAKE

For those still following along:
http://swox.com/gmp/manual/Exact-Division.html


Oh, l?-l? :) Just what I've been looking for. Thanks for the many other suggestions, too, especially to the Mathworld link.

Giovanni
Posted on 2002-06-03 02:22:02 by sch.jnn
Bitrake,
that was a very interesting link with some info that was new to me, but on closer analysis I noticed an omission and a contradiction that should be clarified. Both are probably due to oversimplification and the author assuming that the reader is already familiar with the particular implementations, though not yet at expert level.

I myself never saw this stuff, or those methods before, so it took me a while to figure it out. Having done so I felt it best to share it with you guys, to spare you the trouble of doing it for yourselves.

The omission.
-----------------
Nowhere on that page is the term "Modular Inverse" defined. As used in the example at the top of the page (digit-by-digit exact division) the definition implied is:

The modular inverse of a digit N is the digit M such that ((N*M) mod 10) == 1;

Note that this means that most digits do NOT have any modular inverse, and also that some digits are their own modular inverse...

The defined ones are 1,3,7,9 whose inverse digits are 1,7,3,9 respectively.

At first glance that may seem like a problem, but in fact it isn't. In fact all prime numbers above 5 will have one of those four digits as final digit. That is an inescapable consequence of how prime numbers are defined. An interesting fact is also that most of the above remains true in any other number base than 10. The only exceptions are the inverse digit values mentioned, as those will vary with the chosen base (except for the digit 1, which is always its own inverse). For example, in base 100h, the digits 0Bh and 0Dh have modular inverse digits 0A3h and 0C5h, respectively.

I think base 100h is ideal for optimized calculations, as each 'digit' is the same as one byte and much arithmetic can be done with unmodified source data from the memory arrays. It still allows use of a byte indexed table for the inverse digits, and both data storage and transfers can be better optimized than is possible for lower bases.

The contradiction:
----------------------
Further down on that page the term 'modular inverse' is also used about the hex number "0xAA..AAB". This is an INCORRECT usage of the term. What is meant here is that the quoted hex number is the normal inverse value of 3 in a fix-point format. That has nothing to do with the modulo based method at the top of the page.
Posted on 2002-06-04 04:52:48 by RAdlanor
AmkG, (and all others who have posted on the same subject)

I had forgotten your post of May 6th in this thread, until now when I reread the whole thing again.
(Sort of skimmed through it earlier, as I was too busy then.)
I realize that I'm restating some things that have been said before, but not in as general form as I'm trying to do here.

If we combine what you said about finding extended patterns in the final digits of prime numbers with what I said about the digit relationships in higher bases, then we arrive at some very interesting conclusions...

Regardless of the number base used, all prime numbers larger than the base value will always end in a 'digit' which does have a modular inverse 'digit' in the given base.
NB: I have not formally proven this, and I am not absolutely certain of where the valid range for the assumption begins. In fact I believe that the valid range starts even lower than the base value, in which case there's still no problem.

This means that extending the base automatically widens the final pattern. However, using bases larger than 65536 is likely to make implementation impractical, as normal CPU instructions may no longer suffice to calculate digit cross-products, and the tables for inverse digits may become huge and slow to compute.

Another interesting thought is to consider the case of a product of two primes, as expressed in a number base equal to one of those primes (either one will do). Those two number bases are the only ones in which such a product will have zero for the final 'digit'. That is a definite final pattern, but the problem is to find it without knowing either factor in advance. So this is just an interesting thought, but can hardly help in finding a solution. All it does is to illustrate that the base chosen matters significantly.

A third thing to consider is that other number bases are NOT equal in how easy they make recognition of primes. It will be easier if the number of digits with modular inverses is a lower percentage of the total number of digits. Here is a small table illustrating some of the relationships for a few bases:



Base_______: 001 002 003 004 005 006 007
Inversibles: 001 001 002 002 004 002 005
Percentage_: 100 050 067 050 080 033 071

Base_______: 008 009 010 011 016 030 210
Inversibles: 004 006 004 010 008 008 040
Percentage_: 050 067 040 091 050 027 019


This makes a pattern visible. Prime bases are the worst, and the best ones are those which are products of independent low digits. But 'independent' implies primes, so going for higher bases than 210 gives little improvement.

eg: Base 2310 gives 17.3%, which isn't much better than the 19% for base 210. The latter still lets you identify appx 81% of all numbers as non-prime just by testing a single byte (using a byte indexed constant table). Only the other 19% require tougher tests.

However, for heavy computation I still think base 256 is better...

AmkG (again), concerning your post of May 8th, dealing with factor lengths. If a product of two factors is 56 bits, then one of the two factors can always be expressed in 28 bits or less, regardless of the other factor... Thus the probability of different lengths doesn't matter, and your method should be reversed. So you should start at 28 bits and work your way down, not up.
(Because (2^56-2^28) is almost as much as (2^56))

Note that there is never any need to test longer factors than 28 bits (half the product length that is), as only one of the factors can be so large, and that can be found by division in testing a potential lower factor.

The one exception to that is if you're trying to build both factors by a kind of successive approximation method, with feedback, which is what I assume sch.jnn and bitRAKE (and others) are discussing.

Btw: I do realize that it isn't normal 'successive approximation' as it analyses only the low digits, expanding upwards gradually. I just don't know a good term for it.
Posted on 2002-06-04 12:59:07 by RAdlanor
How many iterations do the best methods need to solve the product 10147? Is 6 a good result (given the small product)?

And e.g. for the product 1198413947?
Posted on 2002-06-05 05:15:32 by Maverick
Here is the new BNLib. Sorry I'm extremely busy these days.

Giovanni

PS: Maverick, 6 iterations would be great!
Posted on 2002-06-05 17:45:45 by sch.jnn
grrrr, now that i've read this thread i'm hooked too, currently filling lots of pages with binary numbers,,,,,,
Posted on 2002-06-06 04:44:04 by Terab
any of you checked out Ulam's rose?

http://www.abarim-publications.com/plaatjes/Ulam's Rose.JPG

the above picture is appantely a Ulam spiral of primes up to 262000 or so

looks like kinda almost a pattern, but not :-)

http://www.abarim-publications.com/artctulam.html
Posted on 2002-06-06 16:45:36 by Terab
I am quite fascinated by this Ulam rose thing - never heard of it before

I wonder what it would look like in higher dimensions and then maybe cut planes out of it - maybe a clearer pattern?

Much like the way various things in physics combine in higher dimensions, (10th dimensional hyperspace theory)

now to write a program that does that for me, o for those of you working on the big number library - i *might* have found a faster multiply (still working on it)

This whole thing is now consuming waaaay too much time (and i'm enjoying every second :-)

so maybe there is a pattern......
Posted on 2002-06-07 01:58:37 by Terab
Certainly there is a pattern.. working on it myself too.

I invented my own (verifyed to be original) method for factorization.. gives 10147 in 6 steps, but it won't attack nothing like the big RSA numbers. There's much room for improvement though.. although the big RSA numbers seem really *big* whatever improvement one may do.

Working on it in my spare time, though. It's really very fascinating. The line I'm working on at the moment is very geometrical, and to define numbers in a different than the usual way.. I'm confident that that will show the patterns.
Posted on 2002-06-07 05:59:06 by Maverick
this is so much fun, wish a had a bit more free time to play with this :-)

gotta love a challenge and rsa 2048 is quite a big one :-)

if i get a chance this weekend i'll try to write a big number lib - i know there is one already on the board - but it seems like a really fun project
Posted on 2002-06-07 10:05:34 by Terab
two interesting links

http://www.loria.fr/~zimmerma/records/gnfs158

and

http://www.loria.fr/~zimmerma/records/factor.html


check em out......

imho even if you could factor 1 000 000 000 factors a second a 2^2048 would take a VERY long time, the only way therefor is to discover the pattern which is what is proposed here, this is a great challenge

and i am now soooooo addicted <somebody kiiiilllll me please :-)
Posted on 2002-06-08 04:25:03 by Terab
Hi Guys,

Terab:
I too have started on a Big Number lib, but 'the more, the merrier'... I'm using C++builder for my test program but with the BIGNUM stuff in almost pure asm, except for C function wrappers. (not calling wrappers, just C functions holding asm code)

That 'lib' is almost complete, except that I'm still not happy with the 'divide BIGNUM by BIGNUM' efficiency. (Still working on it.) When all is complete and tested, I'll extract the useful stuff from my test program and make a proper lib out of it.

Maverick:
I'm impressed by your 6 steps for 10147, but I guess it also depends on what one 'step' really contains. My current algo takes 15 steps for that one...

Btw: I am myself using a method that doesn't need to divide the modulus itself in each step. That is only done in the first step of a run, and all subsequent steps can use lower numbers. I'd like to develop a method without any division at all though...
Posted on 2002-06-10 23:59:52 by RAdlanor
RAdlanor: simple add and cmp of at a (very rare) maximum log2(Product)/2 size (i.e. 1024 bits on a 2048 bits key), plus another operation quite quick as well (not as quick and simple as the previous, though). Also, it doesn't have to compute the "big" numbers at the begin.. but does only 32 bits at the begin, then 64 bits, then 96 bits, etc.. and doesn't even have to reach log2(Product)/2 size, since the solution may be found sooner than that.

The problem is that those 6 steps will grow very much soon. I solved 64 bit products in very little time, but it clearly appears that 2048 would take much more than just a 2048/64 proportion.. this is the problem. A QuadriSieve or a NFS still behave much better on big numbers, but as soon as I've some free time I will test an idea I had which could skip a lot of tests that would anyway fail.. and make things faster. How much then will be the problem again.

I wish I could have more time to dedicate to this challenge, because as Terab said, it's really amazing stuff.. and even if I'll probably fail to find a better method than NFS, there are many interesting things I will learn in the process. Really worth the efforts in any case.
Also, as soon as I know that my method won't be better than NFS, I will release it to the public domain here (in the very improbable case than it's better than NFS I would do it a bit later ;) ).
Posted on 2002-06-11 02:51:18 by Maverick