Here is a program to increment a 10 byte counter using aaa. Sorry it is in HLA, but I hope it can be easily converted.

The heart of the routine is the loop at a_id. Can anyone make an MMX optimization for that loop?

The program counts to 100 million. Because I am using aaa and stepping backwards with ecx until the addition produces no carry, the value of ecx after the loop says how far the carry propagated:

Initial value of ecx is 9.
If ecx=8, then no carry to the 2nd least significant digit
If ecx=7, then carry to the 2nd least significant digit
If ecx=6, then carry to the 3rd least significant digit
If ecx=2, then carry to the 7th least significant digit, i.e. to the millions digit.
This is used to print the entire number only once per million counts.
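
For anyone who does not read HLA, here is a rough C sketch of what the a_id loop does. It is only an illustration: the names bcd_increment and NUM_DIGITS are made up here, and the if/else stands in for the effect of aaa on a single digit rather than reproducing the instruction exactly. The return value corresponds to ecx after the loop.

```c
#include <stdint.h>

#define NUM_DIGITS 10  /* hypothetical name; matches the 10-byte counter */

/*
 * Add 1 to an unpacked BCD counter (one decimal digit per byte,
 * digits[NUM_DIGITS - 1] is the least significant digit).
 * Returns the index one below the last digit written, which is what
 * ecx holds after the a_id loop; -1 means every digit was touched.
 */
int bcd_increment(uint8_t digits[NUM_DIGITS])
{
    int i = NUM_DIGITS - 1;
    int carry = 1;                                  /* plays the role of stc */

    while (i >= 0 && carry) {
        uint8_t d = (uint8_t)(digits[i] + carry);   /* adc (0, al) */
        if (d > 9) {                                /* effect of aaa on one digit */
            d = 0;
            carry = 1;
        } else {
            carry = 0;
        }
        digits[i] = d;
        i--;                                        /* dec (ecx) */
    }
    return i;
}
```

With the counter at 0000999999, one call writes six zeros, puts a 1 in the millions digit and returns 2, which is the case the program tests for with cmp (ecx,2).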




// Increment Check Program
// Increments a 10 byte unpacked BCD counter using aaa


program IncCheck;
#include( "stdlib.hhf" );

static (4)
numdig: tbyte;
t:time.timerec;


begin IncCheck;

console.cls();
console.gotoxy(4, 15);
stdout.put ( "Increment Check program.", nl nl);

// increment numdig (BCD)

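incr_dig:
mov (&numdig,ebx);      // ebx -> start of the 10-byte counter
mov (9, ecx);           // index 9 holds the least significant digit
stc;                    // carry set = add 1 to that digit
a_id:
mov ([ebx+ecx], al);
adc (0, al);            // add the incoming carry
aaa;                    // decimal adjust: 10 becomes 0 with carry set (also changes ah)
mov (al, [ebx+ecx]);
dec (ecx);              // dec leaves the carry flag untouched
js check;               // ran past the most significant digit
jc a_id;                // carry out: continue with the next digit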

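check:
// cmp (ecx,0);
//use -1 to count 10 times as much
// No cmp is needed: the zero flag is still set from dec (ecx), so this
// exits when ecx = 0, i.e. when the carry reached the 100 millions digit.
je finisht;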

cmp (ecx,2);
ja incr_dig;

stdout.put ( "Done ");
xor (ecx,ecx);

llll:
mov (&numdig,ebx);
mov ([ebx+ecx], al);
stdout.putu8(al);
inc (ecx);
cmp (ecx,9);
jbe llll;

stdout.put ( " increments so far", nl );

jmp incr_dig;

finisht:
stdout.put ( nl, "Finished performing ");
xor (ecx,ecx);

llm:
mov (&numdig,ebx);
mov ([ebx+ecx], al);
stdout.putu8(al);
inc (ecx);
cmp (ecx,9);
jbe llm;

stdout.put ( " increments.", nl );

end IncCheck;
Posted on 2003-04-30 21:26:07 by V Coder
Here is another version (sorry, HLA again) which simply increments a 32 bit value at incr_dig.

Question: How to check if you have reached x million? Divide the 32-bit number by 1000000 and check the remainder. If it is zero, print the counter...
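
The same remainder test, sketched in C for comparison. The names is_milestone, count_with_division and MILESTONE are invented for this example, and instead of printing, the loop just counts how many times it would have printed, so the result can be checked.

```c
#include <stdint.h>

#define MILESTONE 1000000u  /* hypothetical name for the divisor in the post */

/* Returns 1 when count is an exact multiple of one million. */
int is_milestone(uint32_t count)
{
    return count % MILESTONE == 0;   /* the remainder, like edx after div */
}

/* Counts up to limit and returns how many milestones would be reported. */
uint32_t count_with_division(uint32_t limit)
{
    uint32_t count = 0, reports = 0;

    while (count != limit) {
        count++;                      /* inc (numdig) */
        if (count == limit)           /* je finisht comes before the div */
            break;
        if (is_milestone(count))
            reports++;                /* stands in for stdout.put */
    }
    return reports;
}
```

As in the HLA listing, the final value is handled by the termination check and is not counted as a milestone, so a limit of 3000000 gives 2 reports.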



// Increment Check Program
//


program IncCheck;
#include( "stdlib.hhf" );

static (4)
numdig: int32;
t:time.timerec;


begin IncCheck;

console.cls();
console.gotoxy(4, 15);
stdout.put ( "Increment Check program.", nl nl);

incr_dig:
inc (numdig);

check:
mov (numdig, eax);
cmp (eax, 100_000_000);
je finisht;

xor (edx, edx);
div (1000000, edx:eax);
cmp (edx, 0);
jne incr_dig;

stdout.put ( "Done ", numdig, " increments so far", nl );

jmp incr_dig;

finisht:
stdout.put ( nl, "Finished performing ", numdig, " increments.", nl );

end IncCheck;
Posted on 2003-04-30 21:31:13 by V Coder
Well, since dividing the 64-bit value in EDX:EAX is so slow, is there another way to check whether you have reached x million?

Yes. Check for the first million. When you reach 1 million, print it and add 1 million to the check figure, so that the next comparison is against 2 million, and so on. Termination, as in the previous example, is at 100 million.

An unsigned 32-bit number can hold just over 4 billion (a signed int32, as declared here, just over 2 billion). That's plenty of counting!
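
A rough C sketch of the moving check figure, for comparison with the division version. The names count_with_threshold and STEP are made up for this example, and the print is replaced by a counter so the behaviour can be checked.

```c
#include <stdint.h>

#define STEP 1000000u  /* hypothetical name; the role of addt in the post */

/* Counts up to limit and returns how many milestones would be reported. */
uint32_t count_with_threshold(uint32_t limit)
{
    uint32_t count = 0, reports = 0;
    uint32_t next = STEP;             /* nextn: the next value to report at */

    while (count != limit) {
        count++;                      /* inc (numdig) */
        if (count == limit)           /* je finisht */
            break;
        if (count == next) {          /* cmp (nextn, eax) */
            next += STEP;             /* add (addt, eax); mov (eax, nextn) */
            reports++;                /* stands in for stdout.put */
        }
    }
    return reports;
}
```

The inner loop now costs one compare and a branch per increment instead of a division, which is the whole point of the change.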



// Increment Check Program
//


program IncCheck;
#include( "stdlib.hhf" );

static (4)
numdig: int32;
nextn: int32:=1000000;
addt: int32:=1000000;
t:time.timerec;


begin IncCheck;

console.cls();
console.gotoxy(4, 15);
stdout.put ( "Increment Check program.", nl nl);

// increment numdig (32-bit integer)

incr_dig:
inc (numdig);

check:
mov (numdig, eax);
cmp (eax, 100_000_000);
je finisht;

cmp (nextn, eax);
jne incr_dig;
add (addt, eax);
mov (eax, nextn);
stdout.put ( "Done ", numdig, " increments so far", nl );

jmp incr_dig;

finisht:
stdout.put ( nl, "Finished performing ", numdig, " increments.", nl );

end IncCheck;
Posted on 2003-04-30 21:36:22 by V Coder
Here is a zip with the source and executable for all three routines.

Execution times on a 166 MHz Pentium MMX are:

countt {using aaa}, 11 secs
countt2 {using inc and div}, 36 secs
countt3 {using inc and add/cmp}, 6 secs


I'd love to hear your times and comments.
Posted on 2003-04-30 21:43:08 by V Coder