Anyone who knows a good Life algo? I've tried two types of algos but I don't think they are optimal. I'm working on a Life app for a monochrome device (my TI-89). Unfortunatly, I must code in C or 68k assembler to do stuff on the TI.
One of the algos keeps track of how many pixels are set in the three columns separatly and adds and subtracts when it moves one pixel forward. The other algo loads in 96 pixels in three DWORDS (upper, middle and lower row) and masks out the three interesting pixels in each DWORD and uses a table to know what to do. Maybe there is an ultimate way. Any ideas?
Posted on 2002-06-06 00:56:43 by gliptic
Maybe, two pass: sum then rotate - color display 4+ bits per pixel?
Posted on 2002-06-06 01:04:40 by bitRAKE
On the Amiga the blitter would be used, and it was extremely fast. Ergo, you may probably++ use MMX et simila.
Posted on 2002-06-06 04:04:08 by Maverick
I too have fond memories of the Amiga and its graphic coprocessor - hardly blisteringly fast by today's standards though... however the 68k chip itself went places with the SunOs SPARC workstations :)
Posted on 2002-06-06 09:12:15 by Homer
Michael Abrash explains in detail a couple implementations in Zen of Code Optimization, Chapters 17-18. One method consists of little more than two table look-ups and a dirty pixel stack -- this was second place(!) in a contest by David Stafford. First place was by Peter Klerings who programmed for Turck GmbH in Munich, Germany. Maybe, you can find something on web. :) Abrash didn't explain first place in the book!! Now I have to go find the code...
Article: 10498
Subject: Re: [++] Fast Life code (Was:Re: FPGA-based CPUs (was Re: Minimal ALU
From: Terje Mathisen <Terje.Mathisen@hda.hydro.com>
Date: Mon, 25 May 1998 10:30:53 +0200

Thomas Womack wrote:
>
> In article <35686FEA.6266@hda.hydro.com> you wrote:
>
> >I wrote
>
> : > I was more impressed by 60fps interactive 800x600 Life - which is
> : > non-trivial, and probably not even possible, on Standard Commercial
> : > Hardware (OK, you can use cell lists rather than an array, but the
> : > FPGA approach ran at constant speed independent of population). I
> : > think the 64k of SSRAM attached to the FPGA helped.
>
> : Mike Abrash used Conway's Game of Life as the target for his second
> : 'Annual Code Optimization Challenge', a few years ago.
>
> : The target machine was either a 486 or very early Pentium, i.e. current
> : hardware is quite a bit faster:
>
> : The two joint winners both achieved twice the performance of my entry,
> : which is still capable of 400 fps in 320x200 resolution on a Pentium
> : MMX.
>
> That seems very interesting; would it be worth putting you in touch with
> the appropriate bit of the Hardware Compilation Group?

Sure, no problem.

> Have you a better reference for this than chapters 17 and 18 of the
> Big Black Book from Abrash? He discusses Stafford's solution in detail,
> ignoring Peter Klerings' one altogether, which seems a pity since Klerings'
> one sounds much closer suited to the FPGA.

Klering's code was actually fairly straightforward, except for a set of
flags used to detect static areas.

Skipping that part would still (most probably) let it run at the
required 60 fps, the code is 'just' a parallel implementation of the
counting logic:

Alive next iteration = (alive now AND (count == 2 OR count == 3)) OR
(not alive AND count == 3),

which simplifies to just:

Alive next iteration = (count == 3) OR (alive AND count == 2).

By including the cell itself in the count, then it becomes easier to
reuse the counting logic for multiple rows:

alive = (iCount == 3) OR (alive AND iCount == 4)

You need 4 bits to count to 8 (or 9), so 4 registers for counting plus
one for the center cells leaves one or two registers for array
addressing on an x86.

Klering did a lot of work to simplify the logic as much as possible,
i.e. he didn't actually implement the full 'count-to-9' bitwise logic,
since it is possible to early-out many of the branches.

Implementing the same logic with MMX-style wide registers should make it
approximately twice as fast.

Terje
The target machine of the contest was a 20Mhz 386!
This algo should be fast enough for most anything. ;)

Get Abrash's book.
Posted on 2002-06-12 21:39:14 by bitRAKE