I can't seem to find a way to do this fast...

I need to increment an 8-bit value in memory with saturation. The fastest method I found so far is:

movzx eax, byte ptr [...]
inc eax
cmp eax, 0xFF
cmovg eax, [constFFh]
mov [...], al

That's five slow instructions! This code is quite critical to me, so I wonder if there is a faster way to do it.

Thanks for any ideas!
Posted on 2004-02-28 11:37:42 by C0D1F1ED
Try this:
add byte ptr [...], 1
sbb byte ptr [...], 0

Or this:
cmp byte ptr [...], 255
adc byte ptr [...], 0
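
For anyone who wants to convince themselves of the carry logic without an assembler, here is a minimal C sketch that models what the two sequences do to an 8-bit value. The function names are made up for illustration and the carry flag is just modelled as an integer, so this says nothing about timing.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: straightforward saturating increment. */
static uint8_t sat_inc_ref(uint8_t x) {
    return x == 0xFF ? 0xFF : (uint8_t)(x + 1);
}

/* Models: add x, 1 / sbb x, 0 */
static uint8_t sat_inc_add_sbb(uint8_t x) {
    unsigned sum = (unsigned)x + 1u;  /* add x, 1 */
    unsigned cf  = sum >> 8;          /* CF = carry out of bit 7 (only for x == 0xFF) */
    x = (uint8_t)sum;                 /* 8-bit result, wraps to 0 on overflow */
    return (uint8_t)(x - cf);         /* sbb x, 0: x - 0 - CF, so 0 goes back to 0xFF */
}

/* Models: cmp x, 255 / adc x, 0 */
static uint8_t sat_inc_cmp_adc(uint8_t x) {
    unsigned cf = x < 255u;           /* cmp x, 255: CF set when x is below 255 (unsigned) */
    return (uint8_t)(x + cf);         /* adc x, 0: x + 0 + CF */
}

/* Exhaustive check over all 256 byte values. */
static void check_sat_inc(void) {
    for (unsigned v = 0; v <= 0xFF; ++v) {
        uint8_t x = (uint8_t)v;
        assert(sat_inc_add_sbb(x) == sat_inc_ref(x));
        assert(sat_inc_cmp_adc(x) == sat_inc_ref(x));
    }
}
```

Calling check_sat_inc() walks every possible byte, so both variants agree with the reference for the whole input range.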
Posted on 2004-02-28 11:43:48 by Sephiroth3
I always find solutions right after I post something... ;)

This should be faster:

movsx eax, byte ptr [...]
inc eax
cmovz eax, [constFFh]
mov [...], al

I'm already much more satisfied now, since every instruction has a clear purpose: load, increment, saturate and store.
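
The trick depends on the sign extension: 0xFF is loaded as -1, so the increment produces zero and sets ZF, while every other byte gives a non-zero result. A rough C model of the four instructions, with a made-up function name and the sign extension written out by hand, assuming constFFh holds 0xFF:

```c
#include <assert.h>
#include <stdint.h>

/* Models: movsx eax, byte ptr [...] / inc eax / cmovz eax, [constFFh] / mov [...], al */
static uint8_t sat_inc_movsx(uint8_t x) {
    /* movsx: sign-extend the byte to 32 bits, so 0xFF becomes -1 */
    int32_t eax = (x & 0x80) ? (int32_t)x - 0x100 : (int32_t)x;
    eax += 1;                 /* inc eax: the result is 0 only when the byte was 0xFF */
    if (eax == 0)             /* cmovz eax, [constFFh] (assumed to hold 0xFF) */
        eax = 0xFF;
    return (uint8_t)eax;      /* mov [...], al: store the low 8 bits */
}

/* Exhaustive check against a plain saturating increment. */
static void check_sat_inc_movsx(void) {
    for (unsigned v = 0; v <= 0xFF; ++v) {
        uint8_t expected = (v == 0xFF) ? 0xFF : (uint8_t)(v + 1);
        assert(sat_inc_movsx((uint8_t)v) == expected);
    }
}
```

With movzx instead of movsx the increment of 0xFF would give 0x100, which is not zero, so the cmovz would never fire.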
Posted on 2004-02-28 11:45:10 by C0D1F1ED

Try this:
add byte ptr [...], 1
sbb byte ptr [...], 0

Or this:
cmp byte ptr [...], 255
adc byte ptr [...], 0

Wow, that's short! Thanks! I hope the read-modify-write operations aren't slower. I'll try it out...
Posted on 2004-02-28 11:47:13 by C0D1F1ED
Sorry, my method was nearly twice as fast on my Pentium M.

Edit: I was too hasty. Your second method with the cmp is about equally fast (the cmp is not a read-modify-write operation). I'll make some more accurate measurements...

Edit: Congratulations! Your second method is 10% faster than mine. Thanks!
Posted on 2004-02-28 11:55:37 by C0D1F1ED
I found the equivalent for decrement with saturation:

cmp byte ptr [...], 1
adc byte ptr [...], -1
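
Here the cmp sets the carry flag only when the byte is 0, and adding -1 plus that carry leaves 0 at 0 while decrementing everything else. A small C sketch of the flag arithmetic (the function name is hypothetical, and CF is modelled as an integer):

```c
#include <assert.h>
#include <stdint.h>

/* Models: cmp x, 1 / adc x, -1 */
static uint8_t sat_dec_cmp_adc(uint8_t x) {
    unsigned cf = x < 1u;                /* cmp x, 1: CF set only when x == 0 */
    return (uint8_t)(x + 0xFFu + cf);    /* adc x, -1: x + 0xFF + CF, truncated to 8 bits */
}

/* Exhaustive check against a plain saturating decrement. */
static void check_sat_dec(void) {
    for (unsigned v = 0; v <= 0xFF; ++v) {
        uint8_t expected = (v == 0) ? 0 : (uint8_t)(v - 1);
        assert(sat_dec_cmp_adc((uint8_t)v) == expected);
    }
}
```

For x = 0 the sum is 0 + 0xFF + 1 = 0x100, which truncates back to 0; for any other x the carry is clear and the result is x - 1.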
Posted on 2004-02-28 12:37:13 by C0D1F1ED
Also try doing the operation in a register instead of using RMW instructions, and see if it makes a difference.
Posted on 2004-02-28 12:48:29 by Sephiroth3
For saturated inc-by-1, what about:

add <whatever>, 1
sbb <whatever>, 0

Same logic for dec-by-1:

sub <whatever>, 1
adc <whatever>, 0

For larger-than-1 increments, scali suggested the following:

add <whatever>, <amount>
sbb <reg>, <reg>
or <whatever>, <reg>
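
The sbb of a register with itself computes reg - reg - CF, i.e. 0 without a carry and all ones with one, so the or forces the result to 0xFF exactly when the add overflowed. A minimal C sketch of that sequence for 8-bit operands, with illustrative names that are not from the thread:

```c
#include <assert.h>
#include <stdint.h>

/* Models: add x, amount / sbb reg, reg / or x, reg (all 8-bit) */
static uint8_t sat_add(uint8_t x, uint8_t amount) {
    unsigned sum  = (unsigned)x + amount;   /* add x, amount */
    unsigned cf   = sum >> 8;               /* CF = carry out of bit 7 */
    uint8_t  mask = (uint8_t)(0u - cf);     /* sbb reg, reg: 0x00 if CF clear, 0xFF if CF set */
    return (uint8_t)((uint8_t)sum | mask);  /* or x, reg: saturate on overflow */
}

/* Exhaustive check over all 65536 operand pairs. */
static void check_sat_add(void) {
    for (unsigned a = 0; a <= 0xFF; ++a) {
        for (unsigned b = 0; b <= 0xFF; ++b) {
            unsigned expected = (a + b > 0xFF) ? 0xFF : a + b;
            assert(sat_add((uint8_t)a, (uint8_t)b) == expected);
        }
    }
}
```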

It's your responsibility to time this stuff...
Posted on 2004-02-28 18:46:59 by f0dder