I have almost finished the user interface, but I'm trying to debug a problem with the thread and timer before I start distributing it:
Posted on 2003-06-01 13:44:47 by bitRAKE
bitRAKE

And on the old CPU my loop code only took 20 cycles to execute with the data in the cache - that is 20 cycles per 16 digits. With memory latencies the loop time jumps to 70 cycles!


Is this the packed BCD version? How fast is the unpacked BCD version, in cycles per digit, both with the data cached and with it coming from memory?

And how long do the packed and unpacked BCD versions take to reach iteration 413280?

I have incorporated Load File, but not autosave. I don't expect to incorporate save every X seconds until I get out of console mode, but I'll work on autosave every X digits or iterations later this week - a minor addition.

I will start considering a packed BCD version by next week too, maybe.
Posted on 2003-06-03 04:29:56 by V Coder
; Ben Despres "

; I discovered a really neat fact about the checksums of the digits generated
; by flip-and-add... At any given step, the checksum of the result will equal
; twice the checksum of the last result minus a multiple of nine ( that
; multiple just happens to equal the number of carries ). At first I thought
; this may lead to a "proof" of Lychrelness, but I have given up that idea
; for now. *BUT*, it *does* lead to a really simple check of whether or not a
; given number could have resulted on a given iteration, including in some
; cases whether or not a number could ever have resulted from a given starting
; number.
;
; Basically, if Cx equals the MOD-9 sum at iteration X, with C0 meaning the
; same for the starting value:
;
; Cx = (C0*(2^(x MOD 6))) MOD 9.
;
; Actually, three distinct cycles exist, with only the longest using MOD 6...
; {0}, {3,6}, and {1,2,4,8,7,5}. 196 itself has the longest of those cycles
; (and no, unfortunately I could not find any patterns between cycle and
; converging-vs-nonconverging)."
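
The checksum rule quoted above can be checked numerically. The following is a minimal Python sketch, not code from Ben Despres or anyone in this thread; the names digit_sum and flip_and_add are made up for illustration. It does the flip-and-add digit by digit so the carries can be counted, including the final carry that lengthens the number.

```python
def digit_sum(n):
    """Sum of the decimal digits of n (the "checksum" in the quote)."""
    return sum(int(d) for d in str(n))


def flip_and_add(n):
    """Return (n + reverse(n), number of carries made by that addition)."""
    digits = [int(d) for d in str(n)]  # most significant digit first
    carry = 0
    carries = 0
    out = []  # result digits, least significant first
    # reversed(digits) walks n from the right; digits walks reverse(n) from the right
    for a, b in zip(reversed(digits), digits):
        s = a + b + carry
        carry = s // 10
        carries += carry  # carry is 0 or 1 at every position
        out.append(s % 10)
    if carry:
        out.append(carry)  # final carry becomes a new leading digit
    result = int("".join(str(d) for d in reversed(out)))
    return result, carries
```

For 196 this gives 887 with one carry, and the checksums are 16 and 23 = 2*16 - 9*1, as the rule says.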

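; this supports up to 64 bits for the iteration count and the starting value
mov ecx, 6
mov eax, iteration.dwHigh
xor edx, edx
div ecx ; EDX = high dword MOD 6
mov eax, iteration.dwLow
div ecx ; EDX = x MOD 6
xor eax, eax
mov ecx, edx ; CL = shift count, 0..5
mov edx, StartingValue.dwHigh ; StartingValue = C0 (MOD 9), so it stands in for C0
mov ebx, StartingValue.dwLow
shld eax, edx, cl
shld edx, ebx, cl
shl ebx, cl ; EAX:EDX:EBX = StartingValue * 2^(x MOD 6)
mov ecx, 9
push ebx ; save low dword
push edx ; save middle dword
xor edx, edx
div ecx ; EDX = high dword MOD 9
pop eax ; middle dword
div ecx
pop eax ; low dword
div ecx ; EDX = (C0*(2^(x MOD 6))) MOD 9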
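
For anyone who wants to cross-check the formula without an assembler, here is a rough Python equivalent. It is only a sketch under my own naming (predicted_mod9 and actual_mod9 are hypothetical helpers, not part of any program posted here). It relies on the fact that a number and its digit sum leave the same remainder MOD 9, so n % 9 can be used in place of the checksum.

```python
def predicted_mod9(start, iteration):
    """Cx = (C0 * 2^(x MOD 6)) MOD 9, with C0 taken as start MOD 9."""
    c0 = start % 9
    return (c0 * 2 ** (iteration % 6)) % 9


def actual_mod9(start, iteration):
    """Brute force: run flip-and-add `iteration` times and reduce MOD 9."""
    n = start
    for _ in range(iteration):
        n += int(str(n)[::-1])  # add the decimal reversal of n
    return n % 9
```

Since 413280 is a multiple of 6, predicted_mod9(196, 413280) is just 196 MOD 9 = 7, so the digits at that iteration should sum to 7 MOD 9. The brute-force version is only practical for small iteration counts.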
Posted on 2003-06-12 00:10:17 by bitRAKE
Update.

My fastest (or nearly fastest) DOS unreal mode program took 19 hours to finish 1 million digits on the 550 MHz AMD K6-2. My 166 MHz Pentium MMX should have taken 50% longer, but I never let it finish.

I have just tested both machines with my latest and fastest Win32 routine. It uses no MMX instructions, so it should run on a Pentium, and possibly on a 486 or 386.

It took 7 1/2 hours on the K6-2, and 10 1/4 hours on the Pentium MMX.

The K6-2 uses shared memory and I had the screen on 1280x1024. I noticed that the program ran maybe 10% faster with the screen at 800x600.

The Pentium MMX finished 4 times faster than Jason's first program running on a 266 MHz PII. This is 3 times faster than my unreal mode program. I think I'll let the unreal mode program run again to time it to 1 million digits...
Posted on 2003-07-10 00:16:22 by V Coder
Hi Everyone,

I am running the program on an Athlon 1.4GHz, 1GB 133 SDRAM, with nine other programs running and 60+ processes (I am playing a CD!). The OS is Windows XP Professional SP1.

First checkpoint 171104 digits after 413280 additions at 100 seconds. It is doing 200 digits per second now (260,000 digits).

Charles
Posted on 2003-07-10 21:35:06 by cdquarles
Update,

I let my first Win32 program go to 1 million digits. It took 15 hours 38.6 minutes on the 166 MHz Pentium MMX.

Way to go, Charles.

My first Win32 program took around 158 seconds to do 171104 digits (on my Pentium III notebook) with no other apps running (well, ZoneAlarm) and about 8 processes. My first complete program took 100 seconds. My fastest program takes 67 ? seconds. bitRAKE has timed one of my programs at 38 seconds to 171104 (on his much faster Athlon). His programs are even faster, I think.

Tell us how fast your program is without interruption.

If you have a stable product, you can send it to Wade at www.p196.org

Unfortunately, Wade uses a Pentium 4 (2.4 GHz), so many optimizations that work on Athlon don't help (and others that work on P4 won't appear on your testbed).
Posted on 2003-07-11 06:30:35 by V Coder
Dear V Coder,

I repeated the test using palsubm2.exe with the minimum load possible for my WinXP SP1 in normal mode. There were 4 programs (including the test program) and 45 processes running.

Checkpoint one results: Checkpoint time 73, Initial value: 196, Iteration: 413280, Number of digits: 171104.
Checkpoint two results: Checkpoint time 10926 (3 hrs 2 min 6 sec), Initial value: 196, Iteration: 2415836, Number of digits: 1000000.

I have not visited your site yet. I will do so now.

Thanks,

Charles
Posted on 2003-07-12 13:40:03 by cdquarles
Okay, cdquarles, I now realize that was one I submitted...

Yeah, it works at around 100 seconds on my machine (and 38 seconds I think on bitRAKE's).

It handles memory pretty badly, so it slows down badly after it starts accessing data out of the cache... I'll need bitRAKE to help me fix that problem... I'll get back to working on it soon.

You can join the fray and code your own too...
Posted on 2003-07-12 23:03:14 by V Coder
It took 65 seconds (63 CPU time) to get to checkpoint one. I'm on an AMD Athlon 1600+, Soltek nForce2, DDR 3200. When I get my new 3000+ it'll cut that down to 30 seconds.
Posted on 2003-07-13 17:41:37 by Qages