I'm trying to find the maximum of the absolute value of an array of numbers.
Assuming: eax = new sample (next number in array)
ecx = current maximum
edx = scrap register
I currently use:

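add eax, edx ; edx must already hold the sign mask of eax (0 or -1), e.g. from cdq
xor eax, edx ; eax = abs(eax)
cmp eax, ecx
jle (next sample) ; signed compare: skip if abs(eax) <= current maximum
mov ecx, eax ; new maximum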

I'd like to do this without branching, though. I came up with:
add eax, edx
xor edx, eax
mov eax, ecx
sub eax, edx
and eax, edx
sub ecx, eax

Can anyone come up with something faster? (I only have the one available scrap register and it must run on all processors, so no conditional moves, etc.)

Posted on 2002-07-31 06:21:37 by kmart9200
This should work for you. Assume eax holds the current sample and ecx holds the running maximum. There is no compare or branch to be done; just run this for each sample in the loop. I don't know if you can get much faster, because the trouble is that nearly every instruction depends on the result of the one before it. I'll think about it some more, but until then...

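cdq ;edx = sign mask of eax (also mov edx,eax/sar edx,31)
add eax,edx
xor eax,edx ;eax should now equal abs(eax)
sub ecx,eax ;assume we hold the current max in ecx; carry set if max < abs(eax) (unsigned)
sbb edx,edx ;edx = -1 if carry, else 0
not edx ;edx = 0 if abs(eax) is the new max, else -1
and ecx,edx ;ecx = 0 or (old max - abs(eax))
add ecx,eax ;ecx contains max (old max,eax)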
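
For anyone who wants to check the sequence above without single-stepping it in a debugger, here is a rough C model of it. This is only a sketch: the function names (abs_u32, max_step, max_abs) are made up for illustration, the registers are modelled as uint32_t variables, and the carry flag from sub is modelled with an explicit unsigned compare.

```c
#include <stdint.h>

/* Models: cdq / add eax,edx / xor eax,edx (hypothetical helper name). */
static uint32_t abs_u32(uint32_t eax)
{
    uint32_t edx = 0u - (eax >> 31); /* cdq: all ones if the sign bit is set, else 0 */
    eax += edx;                      /* add eax,edx */
    eax ^= edx;                      /* xor eax,edx */
    return eax;
}

/* Models: sub ecx,eax / sbb edx,edx / not edx / and ecx,edx / add ecx,eax.
   Returns the new running maximum. */
static uint32_t max_step(uint32_t ecx, uint32_t eax)
{
    uint32_t cf = ecx < eax;  /* borrow that sub ecx,eax would leave in the carry flag */
    uint32_t edx;

    ecx -= eax;               /* sub ecx,eax */
    edx = 0u - cf;            /* sbb edx,edx: all ones if borrow, else 0 */
    edx = ~edx;               /* not edx */
    ecx &= edx;               /* and ecx,edx: 0 if the sample wins, else max - sample */
    ecx += eax;               /* add ecx,eax */
    return ecx;
}

/* Runs the per-sample step over an array; the maximum starts at 0. */
uint32_t max_abs(const int32_t *samples, int n)
{
    uint32_t max = 0;
    for (int i = 0; i < n; i++)
        max = max_step(max, abs_u32((uint32_t)samples[i]));
    return max;
}
```

Calling max_abs on the samples 3, -7, 5, -2 gives 7. One thing the model makes visible: because the mask comes from the carry flag, the comparison is unsigned, so a sample of 80000000h (whose absolute value does not fit in a signed dword) is treated as the largest value rather than as a negative number.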

Posted on 2002-07-31 13:28:41 by chorus
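You can also use this code for absolute values:

mov eax, -10 ; example input
mov edx, eax
sar edx, 31 ; edx = -1 if eax is negative, else 0
xor eax, edx ; flip all bits if negative
sub eax, edx ; add 1 if negative; eax = abs(eax)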
Just my 2 cents. :)
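
A quick way to convince yourself that the xor/sub form and the add/xor form give the same result is to model both in C. This is just a sketch with made-up names (sign_mask, abs_xor_sub, abs_add_xor); the mask is built from an unsigned shift so the C code does not rely on how the compiler shifts negative numbers.

```c
#include <stdint.h>

/* Models mov edx,eax / sar edx,31: all ones if the sign bit is set, else 0. */
static uint32_t sign_mask(uint32_t x)
{
    return 0u - (x >> 31);
}

/* xor eax,edx / sub eax,edx */
uint32_t abs_xor_sub(uint32_t eax)
{
    uint32_t edx = sign_mask(eax);
    eax ^= edx;   /* one's complement if negative */
    eax -= edx;   /* subtracting -1 adds the missing 1 */
    return eax;
}

/* add eax,edx / xor eax,edx */
uint32_t abs_add_xor(uint32_t eax)
{
    uint32_t edx = sign_mask(eax);
    eax += edx;   /* subtract 1 if negative */
    eax ^= edx;   /* then one's complement */
    return eax;
}
```

Both functions turn -10 into 10 and leave 10 alone. Both also return 80000000h unchanged, since its absolute value does not fit in a signed 32-bit register.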
Posted on 2002-07-31 13:43:19 by stryker
Stryker, do you happen to know if the mov edx,eax/sar edx,31 is any faster than the cdq? I mentioned it briefly above that you could use the method you posted, but I'm not sure which is actually quicker.

Posted on 2002-07-31 13:55:31 by chorus
I don't know, but I've heard cdq is slow on older processors and that sar register32, 31 is faster. Since I don't have any older CPUs, I can't test which one is actually faster. :)
Posted on 2002-07-31 13:57:47 by stryker
It can be faster on newer processors, too. But you have to be able to space the instructions out to remove dependencies. And you have the side benefit of being able to use other registers besides EDX.
Posted on 2002-07-31 14:01:39 by bitRAKE
Posted on 2002-08-01 21:59:45 by kmart9200