Signum function appeared in http://www.asmcommunity.net/board/showthread.php?threadid=1307
The thread itself is not about signum function, so I decided to create one.
Quoting Eóin,
Also, does anyone know the best way to code the sgn function in assembly? According to my VB help it works as follows:
If number is          Sgn returns
Greater than zero     1
Equal to zero         0
Less than zero        -1
This could be annoyingly slow if it had to be coded with conditional jumps. Perhaps there's a neat asm trick out there to do this efficiently.
OK, Eóin's requirement is "without Jcc". For single-precision numbers, I came up with the following for i686:
; int signum(float x)
; return value in eax
signum proc
mov eax,[esp+4]
mov edx,eax
sar eax,31 ; propagate sign bit
lea eax,[2*eax+1]
add edx,edx ; check 0 including -0
cmove eax,edx
ret
signum endp
(Yes, it's kinda cheating that I use CMOVcc to avoid Jcc, :) but all I could come up with was a routine with one Jcc.)
The double-precision version is straightforward: you only need to modify the zero-check part (and, of course, the argument size).
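For anyone who wants to check the bit manipulation without an assembler, the routine above can be modelled in C. This is only a sketch: the names `signum_f32` and `signum_f64` are made up for illustration, IEEE 754 single and double formats are assumed, and the 64-bit variant is just one reading of "modify the zero-check part", not the poster's code.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative model of the CMOV routine (hypothetical name). */
static int signum_f32(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* mov eax,[esp+4] */

    /* sar eax,31 then lea eax,[2*eax+1]: -1 if the sign bit is set, else +1 */
    int32_t sign_fill = -(int32_t)(bits >> 31);
    int32_t result = 2 * sign_fill + 1;

    /* add edx,edx drops the sign bit, so +0.0 and -0.0 both give 0 */
    uint32_t without_sign = bits << 1;

    /* cmove eax,edx: a zero value replaces the result with 0 */
    return without_sign == 0 ? 0 : result;
}

/* Assumed double-precision variant: the same idea on a 64-bit pattern. */
static int signum_f64(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    int32_t result = 2 * -(int32_t)(bits >> 63) + 1;
    return (bits << 1) == 0 ? 0 : result;
}
```

Whether a compiler turns the final conditional into CMOVcc or a jump is up to the compiler; the model is only meant to show what each instruction contributes.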
I thought about it a little more, and I came up with this for CPUs without partial register stall and without CMOVcc (e.g. i586).
; int signum(float x)
signum proc
mov eax,[esp+4]
mov edx,eax
xor ecx,ecx
sar eax,31
add edx,edx
setne cl
lea eax,[2*eax+1]
neg ecx
and eax,ecx
ret
signum endp
586 optimization gurus will find a better way to layout instructions. ;)
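The SETcc/NEG/AND variant above builds an all-ones or all-zeros mask instead of using a conditional move. A rough C model of that idea, assuming IEEE 754 single precision (the name `signum_f32_mask` is hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* Illustrative model of the mask-based routine (hypothetical name). */
static int signum_f32_mask(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);                   /* mov eax,[esp+4] */

    /* sar eax,31 / lea eax,[2*eax+1]: -1 or +1 from the sign bit */
    int32_t sign_part = 2 * -(int32_t)(bits >> 31) + 1;

    /* add edx,edx / setne cl: 1 unless the value is +0.0 or -0.0 */
    int32_t nonzero = (uint32_t)(bits << 1) != 0;

    /* neg ecx: turn 1 into an all-ones mask, leave 0 as 0 */
    int32_t mask = -nonzero;

    /* and eax,ecx: keep -1/+1 for nonzero input, force 0 otherwise */
    return sign_part & mask;
}
```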
Check this thread. An elegant solution was given by bitrake.
cdq
cmp eax, 1
sbb edx, 1
adc edx, 1
edx = sgn( eax );
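The four instructions work by passing the carry flag from CMP through SBB into ADC. A C sketch that tracks the carry by hand may make that easier to follow; the function name `sgn_i32` and the explicit `cf`/`borrow` variables are illustrative, and the final conversion back to `int32_t` assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Illustrative step-by-step model of the integer sgn (hypothetical name). */
static int32_t sgn_i32(int32_t x)
{
    uint32_t eax = (uint32_t)x;
    uint32_t edx = (x < 0) ? 0xFFFFFFFFu : 0u;   /* cdq: edx = sign fill */

    uint32_t cf = (eax < 1u);                    /* cmp eax,1: CF set only for 0 */

    /* sbb edx,1: edx = edx - 1 - CF, and CF becomes the borrow */
    uint32_t subtrahend = 1u + cf;
    uint32_t borrow = (edx < subtrahend);
    edx -= subtrahend;

    /* adc edx,1: edx = edx + 1 + CF */
    edx += 1u + borrow;

    return (int32_t)edx;                         /* -1, 0 or +1 */
}
```

Tracing the three cases: zero gives 0 - 2 + 2 = 0, a positive value gives 0 - 1 + 2 = 1, and a negative value gives -1 - 1 + 1 = -1.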
Wow, that's great. One difference between bitrake's code and mine is that his code is for int sgn(int x) and mine is for int sgn(float x). bitrake's code cannot be used for floating point numbers, and mine cannot be used for integers. (Maybe C++ coders can use both for polymorphism. :grin: )
Then why not just do:
fld dword ptr [float]
fist dword ptr [mem]
mov eax, dword ptr [mem]
cdq
cmp eax, 1
sbb edx, 1
adc edx, 1
Because fist may give an unexpected result depending on the RC (rounding control) setting. E.g., 0.5 may become 0 or 1 after fist, so the integer version of sgn() can return two different results for the same input. And we know that the correct answer should be 1.
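The effect is easy to reproduce in C for one particular rounding behaviour. In the sketch below the float-to-int cast always truncates toward zero, which corresponds to only one of the possible RC settings, so it stands in for "fist with an unlucky RC" rather than for fist in general. Both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Convert to integer first, then take the integer sign.
   The C cast truncates toward zero (an assumption standing in for fist). */
static int sgn_via_int(float x)
{
    int n = (int)x;
    return (n > 0) - (n < 0);
}

/* Look at the float's bit pattern directly instead of converting. */
static int sgn_via_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    if ((bits << 1) == 0)           /* +0.0 or -0.0 */
        return 0;
    return (bits >> 31) ? -1 : 1;   /* sign bit decides */
}
```

With truncation, 0.5 is converted to 0 before the sign is taken, so `sgn_via_int(0.5f)` yields 0 while `sgn_via_bits(0.5f)` yields 1.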
Well, you could do this:
cdq
add eax,eax
cmp eax,1
sbb edx,1
adc edx,1
You don't even need to do that. Now that I'm looking at the IEEE format, you don't need to change a thing.
BitRake's algo will work with floats as well as ints, untouched.
Right. After some more thinking, I found that, too, except for -0.0.
The problem is the existence of -0.0. Without modification, bitrake's code will return -1 for -0.0.
So, if anyone can come up with a small and/or fast way to convert -0.0 to +0.0, then bitrake's code can be used.
I think Sephiroth3's idea may work. The current code by Sephiroth3 will return -2 for -0.0. Now I'm thinking about making the idea work. I'll come back when I have a solution.
BTW, does anyone know how to handle NaN in this case? All the code posted in this thread will treat NaN as a normal number, which, I think, is not quite right per IEEE spec. (Then, again, returning 0 for NaN is not right either.)
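To make the two problem cases concrete, here is a small C sketch that feeds raw IEEE 754 single-precision bit patterns through an integer sign test. The helper names are hypothetical, and `is_nan_bits` is only one possible way to detect NaN up front; it does not settle what sgn should return for it.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes IEEE 754 single precision). */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Integer sgn applied to the pattern as if it were a signed 32-bit int. */
static int sgn_of_bits(uint32_t bits)
{
    if (bits >> 31)
        return -1;          /* top bit set: treated as a negative integer */
    return bits != 0;       /* 0 for zero, 1 for anything else */
}

/* NaN: exponent all ones and a nonzero mantissa. */
static int is_nan_bits(uint32_t bits)
{
    return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}
```

-0.0 has the pattern 0x80000000, so the integer test reports -1 for it, and a quiet NaN such as 0x7FC00000 is reported as 1 like any positive number.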
Wait, what I had was wrong. This is what I meant: ;)
cdq
adc eax,eax
sbb eax,0
adc edx,0
inc eax
cmp eax,2
sbb edx,1
adc edx,1
add eax,eax
setne cl
sbb eax,eax
or al,cl
OK, what about this?
; int signum(float x)
; return value in eax
signum proc
mov eax,[esp+4]
; bitRAKE's integer sgn
cdq
cmp eax,1
sbb edx,1
adc edx,1
; handle -0.0
add eax,eax
neg eax
sbb eax,eax
and eax,edx
ret
signum endp
It is a minor modification to bitRAKE's code to return 0 when -0.0 comes in.
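In C terms, the extra four instructions build a mask that is zero exactly when the input is +0.0 or -0.0 and AND it with the integer result. A sketch under the usual IEEE 754 single-precision assumption (the name `signum_f32_fixed` is hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* Illustrative model of the -0.0 fix (hypothetical name). */
static int signum_f32_fixed(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* mov eax,[esp+4] */

    /* Integer sgn of the raw pattern: -1 if top bit set, else 0 or 1.
       For -0.0 (0x80000000) this is still -1 at this point. */
    int32_t int_sgn = (bits >> 31) ? -1 : (int32_t)(bits != 0);

    /* add eax,eax: discard the sign bit */
    uint32_t doubled = bits << 1;

    /* neg eax / sbb eax,eax: all ones if doubled != 0, else 0 */
    int32_t mask = -(int32_t)(doubled != 0);

    /* and eax,edx: -0.0 is forced to 0, everything else passes through */
    return int_sgn & mask;
}
```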
sgn(x)=x/abs(x)
fsgn MACRO
fld st
fabs
fdivp st(1),st
endm
EDIT: Oops, nevermind. ;)
This will account for -0.0
Only trashes eax
add eax, eax
setnz al
sbb ah, ah
add al, ah
adc ah, 0
sbb al, ah
al = sgn( eax );
To All,
Have you all seen this link? Look at "Absolute Value of" within the link. Ratch
http://www.df.lth.se/~john_e/fr_gems.html
sgn(x)=x/abs(x)
code:--------------------------------------------------------------------------------
fsgn MACRO
fld st
fabs
fdivp st(1),st
endm
--------------------------------------------------------------------------------
If the value in st(0) is 0 (whether it's +/-0), the "fdivp" instruction will generate an Invalid operation exception when dividing 0 by 0, and load st(0) with the INDEFINITE NaN value as a result.
Raymond
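The same behaviour shows up in C with ordinary IEEE 754 float arithmetic: dividing a zero by its own absolute value yields NaN. The sketch below uses hypothetical helper names and adds a guarded variant as one possible workaround; the guard is an assumption of this example, not something proposed in the thread.

```c
/* Absolute value without libm (illustrative helper). */
static float abs_f32(float x)
{
    return x < 0.0f ? -x : x;
}

/* sgn(x) = x / abs(x), as in the macro: breaks for +0.0 and -0.0. */
static float fsgn_div(float x)
{
    return x / abs_f32(x);
}

/* One possible guard: return 0 for either zero before dividing. */
static float fsgn_div_guarded(float x)
{
    if (x == 0.0f)          /* true for both +0.0 and -0.0 */
        return 0.0f;
    return x / abs_f32(x);
}

/* NaN is the only value that compares unequal to itself. */
static int is_nan_f32(float v)
{
    return v != v;
}
```

The guard costs a comparison, which is exactly what the division trick was meant to avoid, so this mostly illustrates why the approach was withdrawn.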
The mods are slackin'. ;)
This thread should have been moved to the Algos forum days ago.
This will account for -0.0
Only trashes eax
add eax, eax
setnz al
sbb ah, ah
add al, ah
adc ah, 0
sbb al, ah
al = sgn( eax );
IMHO, too many instructions
to place it in al:
add eax,eax
setne ah
sbb al,al
or al,ah
al = sgn( eax )
Svin,
That doesn't work with -0.0
Mine does.
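The difference between the two byte-sized versions can be checked with a C model that emulates the 8-bit registers and the carry flag. This is a sketch only: the function names are made up, the input is the raw IEEE 754 single-precision bit pattern, and the carry handling follows the usual x86 rules for ADD, ADC and SBB.

```c
#include <stdint.h>

/* Convert an 8-bit register value to a signed result (0xFF becomes -1). */
static int as_signed8(uint8_t v)
{
    return (v & 0x80u) ? (int)v - 256 : (int)v;
}

/* Four-instruction version: add eax,eax / setne ah / sbb al,al / or al,ah */
static int sgn_al_short(uint32_t bits)
{
    unsigned cf = bits >> 31;                    /* carry out of add eax,eax */
    uint8_t ah = (uint32_t)(bits << 1) != 0;     /* setne ah */
    uint8_t al = cf ? 0xFFu : 0x00u;             /* sbb al,al */
    al |= ah;                                    /* or al,ah */
    return as_signed8(al);
}

/* Six-instruction version that also handles -0.0. */
static int sgn_al_long(uint32_t bits)
{
    unsigned cf = bits >> 31;                    /* add eax,eax */
    unsigned al = (uint32_t)(bits << 1) != 0;    /* setnz al */
    unsigned ah = cf ? 0xFFu : 0x00u;            /* sbb ah,ah */

    unsigned sum = al + ah;                      /* add al,ah */
    cf = sum > 0xFFu;
    al = sum & 0xFFu;

    sum = ah + cf;                               /* adc ah,0 */
    cf = sum > 0xFFu;
    ah = sum & 0xFFu;

    al = (al - ah - cf) & 0xFFu;                 /* sbb al,ah */
    return as_signed8((uint8_t)al);
}
```

For the pattern 0x80000000 (-0.0) the short version ends with al = 0xFF, i.e. -1, while the long version ends with al = 0; for every other input tried in the checks the two agree.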