Hey everyone.

So I came up with this little technique while optimizing some special effects here at work. In this particular example, I'm checking to ensure that the X and Y values are within a certain range.

ECX - X value
EDX - Y value



// clear eax
xor eax, eax

// is x > X_MIN?
cmp ecx, X_MIN
seta al
rol eax, 8

// is x < X_MAX?
cmp ecx, X_MAX
setb al
rol eax, 8

// is y > Y_MIN?
cmp edx, Y_MIN
seta al
rol eax, 8

// is y < Y_MAX?
cmp edx, Y_MAX
setb al

// so now if EAX is equal to 0x01010101 then all 4 conditions were true
cmp eax, 0x01010101
jne Skip

....

Skip:



Each time a condition passes, a 1 is stored in AL. I then rotate EAX left by 8 bits to make room for the next result and check the next condition. I try to avoid the partial register penalty by first clearing EAX.
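
For anyone who wants to check the packing logic without an assembler, here is a rough C model of the idea. The bounds are made-up values standing in for X_MIN/X_MAX/Y_MIN/Y_MAX, and pack_flags and in_range are names invented for this sketch. It is only an illustration of the byte packing, not the original routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds, chosen only for illustration. */
#define X_MIN 10u
#define X_MAX 100u
#define Y_MIN 20u
#define Y_MAX 200u

/* rol eax, 8 */
static uint32_t rol8(uint32_t v) { return (v << 8) | (v >> 24); }

/* setcc al: overwrite the low byte with 0 or 1, keep the upper 24 bits */
static uint32_t set_al(uint32_t eax, int flag) { return (eax & ~0xFFu) | (uint32_t)(flag != 0); }

/* Packs the four test results into the four bytes of the accumulator. */
static uint32_t pack_flags(uint32_t x, uint32_t y)
{
    assert(X_MIN < X_MAX && Y_MIN < Y_MAX);   /* sanity check on the made-up bounds */
    uint32_t eax = 0;                          /* xor eax, eax   */
    eax = rol8(set_al(eax, x > X_MIN));        /* seta al / rol  */
    eax = rol8(set_al(eax, x < X_MAX));        /* setb al / rol  */
    eax = rol8(set_al(eax, y > Y_MIN));        /* seta al / rol  */
    eax = set_al(eax, y < Y_MAX);              /* setb al        */
    return eax;
}

/* All four conditions hold exactly when every byte is 1. */
static int in_range(uint32_t x, uint32_t y)
{
    return pack_flags(x, y) == 0x01010101u;    /* cmp eax, 0x01010101 */
}
```

A handy side effect of the byte layout is that a failing value tells you which test failed: with these bounds, pack_flags(50, 5) gives 0x01010001, so the third test (y > Y_MIN) is the one that did not pass.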

Have fun.

Comments welcome!

Phred
Posted on 2005-02-08 15:23:09 by Phred
How about this? It should be faster, since it's fewer instructions and requires fewer memory reads.
sub ecx,X_MIN
sub edx,Y_MIN
cmp ecx,X_MAX-X_MIN
sbb eax,eax
cmp edx,Y_MAX-Y_MIN
rcl eax,1
inc eax
jne Skip
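
To see why the sub/cmp pair covers both ends of the range, here is a rough C model of the sequence above. The bounds are made-up values and in_rect is a name invented for this sketch. The key point is that the subtraction is unsigned: a value below the minimum wraps around to a huge number and fails the single compare. Note that this version accepts x == X_MIN and y == Y_MIN, so the range is X_MIN <= x < X_MAX rather than strictly greater than the minimum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds, chosen only for illustration. */
#define X_MIN 10u
#define X_MAX 100u
#define Y_MIN 20u
#define Y_MAX 200u

/* Returns 1 when the "jne Skip" branch would NOT be taken. */
static int in_rect(uint32_t x, uint32_t y)
{
    assert(X_MIN < X_MAX && Y_MIN < Y_MAX);   /* sanity check on the made-up bounds */
    x -= X_MIN;                                /* sub ecx, X_MIN        */
    y -= Y_MIN;                                /* sub edx, Y_MIN        */
    uint32_t cf  = x < (X_MAX - X_MIN);        /* cmp ecx, X_MAX-X_MIN  */
    uint32_t eax = cf ? 0xFFFFFFFFu : 0u;      /* sbb eax, eax          */
    cf  = y < (Y_MAX - Y_MIN);                 /* cmp edx, Y_MAX-Y_MIN  */
    eax = (eax << 1) | cf;                     /* rcl eax, 1            */
    eax += 1u;                                 /* inc eax               */
    return eax == 0;                           /* zero flag set: fall through */
}
```

Only the all-ones value wraps to zero on the inc, so the branch falls through only when both carries were set.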
Posted on 2005-02-08 16:55:54 by Sephiroth3
Hey, that's a pretty interesting way to approach the problem, but wouldn't there be just as many memory reads if the vars are not defines?

Computing X_MAX-X_MIN would still require some in-between instructions to do the subtraction, but I'm curious now. I'll put that code in my code tomorrow and do some profiling!

Nice solution!
Posted on 2005-02-08 21:01:04 by Phred
Phred, your method will result in nasty partial register stalls for Athlons, or PII or above processors (although VIA C3 processors wouldn't see this).

Try replacing each set* / rol pair with ADC eax, eax, which shifts the carry flag straight into the low bit of EAX.
You can't adjust the polarity of the test, so the final cmp should be "cmp eax, 5" (or possibly "0Ah"). This avoids the partial register stalls, so it should be faster.
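
As a rough illustration of where the 5 comes from, here is a C model of the carry chain. It assumes the four cmp instructions stay in the original order, that each set*/rol pair is replaced by a single adc eax, eax, and that the bounds are the made-up values below. carry_chain is a name invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds, chosen only for illustration. */
#define X_MIN 10u
#define X_MAX 100u
#define Y_MIN 20u
#define Y_MAX 200u

/* After "cmp a, b" the carry flag is set when a < b (unsigned).
   "adc eax, eax" computes eax + eax + CF, i.e. shifts CF into bit 0. */
static uint32_t carry_chain(uint32_t x, uint32_t y)
{
    assert(X_MIN < X_MAX && Y_MIN < Y_MAX);   /* sanity check on the made-up bounds */
    uint32_t eax = 0;                          /* xor eax, eax                  */
    eax = eax + eax + (x < X_MIN);             /* in range: carry clear, bit 0  */
    eax = eax + eax + (x < X_MAX);             /* in range: carry set,   bit 1  */
    eax = eax + eax + (y < Y_MIN);             /* in range: carry clear, bit 0  */
    eax = eax + eax + (y < Y_MAX);             /* in range: carry set,   bit 1  */
    return eax;                                /* 0101b = 5 when all tests pass */
}
```

One difference from the seta version: the carry is also clear when the value equals the minimum, so in this model x == X_MIN and y == Y_MIN count as in range.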

Mirno
Posted on 2005-02-09 04:53:51 by Mirno
Mirno, doesn't the XOR EAX,EAX at the very beginning of the code eliminate the partial register stall?

Phred
Posted on 2005-02-09 06:41:48 by Phred
The xor eax, eax will eliminate any previous stalls, but you're mixing al & eax in close proximity. The processor won't have had enough time to fully retire the al instruction and re-unify eax & al.

So every set* al / rol eax, 8 will cause a stall, apart from the first.

Mirno
Posted on 2005-02-09 07:07:04 by Mirno
Ah crap, I was hoping that wasn't an issue.

Thanks for the info!

Phred
Posted on 2005-02-09 08:33:34 by Phred

sub ecx,X_MIN
sub edx,Y_MIN
cmp ecx,X_MAX-X_MIN
sbb eax,eax
cmp edx,Y_MAX-Y_MIN
rcl eax,1
inc eax
jne Skip


Just to mention that you can replace the last three instructions as follows:


adc eax,1
jne Skip
Posted on 2005-02-10 10:06:10 by MCoder
That won't do, but this would work:

sub ecx,X_MIN
sub edx,Y_MIN
cmp ecx,X_MAX-X_MIN
sbb al,al
cmp edx,Y_MAX-Y_MIN
adc al,0
jae Skip
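
Here is a rough C model comparing the two endings, for anyone who wants to step through the four cases. in_x and in_y stand for the carry flag after each cmp (1 means the coordinate is in range), and both function names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* sbb eax, eax / cmp / adc eax, 1 / jne Skip
   Returns 1 when the jump to Skip is taken. */
static int skip_adc_eax_1(int in_x, int in_y)
{
    assert((in_x == 0 || in_x == 1) && (in_y == 0 || in_y == 1));
    uint32_t eax = in_x ? 0xFFFFFFFFu : 0u;    /* sbb eax, eax        */
    eax = eax + 1u + (uint32_t)in_y;           /* adc eax, 1          */
    return eax != 0;                           /* jne Skip            */
}

/* sbb al, al / cmp / adc al, 0 / jae Skip
   Returns 1 when the jump to Skip is taken. */
static int skip_adc_al_0(int in_x, int in_y)
{
    assert((in_x == 0 || in_x == 1) && (in_y == 0 || in_y == 1));
    unsigned al  = in_x ? 0xFFu : 0u;          /* sbb al, al          */
    unsigned sum = al + 0u + (unsigned)in_y;   /* adc al, 0           */
    int carry = sum > 0xFFu;                   /* carry out of 8 bits */
    return !carry;                             /* jae Skip (CF clear) */
}
```

In this model the adc eax,1 ending skips when both coordinates are in range and falls through when only x is, which is the wrong way round. The adc al,0 ending carries out of AL only for 0FFh + 1, so it falls through only when both tests passed.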
Posted on 2005-02-10 11:46:03 by Sephiroth3