Hi all,

To compute the fraction of a float with SSE, I'm trying to use the following:


// HALF = 0.5
// MAGIC = 12582912.0 (2^23 + 2^22, forcing fraction bits out of the mantissa)

movss xmm0, x
subss xmm0, HALF
addss xmm0, MAGIC
subss xmm0, MAGIC
movss xmm1, x
subss xmm1, xmm0
movss y, xmm1


This corresponds almost completely to y = x - floor(x). Unfortunately, it doesn't work correctly for x = 1.0. The output should be 0.0 but it's 1.0.
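
For what it's worth, the x = 1.0 case can be reproduced with a small scalar C model of the sequence above. This is only a sketch: frac_magic is a made-up name, and it assumes the default round-to-nearest-even mode. With x = 1.0, x - HALF is 0.5, and 0.5 + MAGIC lies exactly halfway between two representable floats, so it rounds to the even one (MAGIC itself) and the rounded value comes back as 0 instead of 1.

```c
/* Scalar C model of the MAGIC sequence: y = x - round(x - 0.5).
   frac_magic is a hypothetical name used for illustration only. */
static float frac_magic(float x)
{
    const float HALF  = 0.5f;
    const float MAGIC = 12582912.0f;   /* 2^23 + 2^22 */

    /* volatile forces every intermediate to be rounded to single precision */
    volatile float t = x - HALF;       /* subss xmm0, HALF  */
    t = t + MAGIC;                     /* addss xmm0, MAGIC */
    t = t - MAGIC;                     /* subss xmm0, MAGIC */

    return x - t;                      /* subss xmm1, xmm0  */
}
```

In this model frac_magic(1.0f) gives 1.0f while frac_magic(2.0f) gives 0.0f, and the same tie shows up for other odd integers such as 3.0f.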

So I was wondering whether anyone knew a way to make it behave exactly like y = x - floor(x).

Thanks!
Posted on 2006-06-01 04:12:18 by C0D1F1ED
SSE

movss xmm0,x
movss xmm1,xmm0
subss xmm0,HALF
cvtss2si eax,xmm0
cvtsi2ss xmm0,eax
subss xmm1,xmm0
movss y,xmm1


A bit optimized, I guess:

movss xmm0,x
subss xmm0,HALF
cvtss2si eax,xmm0
movss xmm1,x
cvtsi2ss xmm0,eax
subss xmm1,xmm0
movss y,xmm1


FPU

fld x
fld ST
fsub HALF
frndint
fsub
fstp y
Posted on 2006-06-01 05:23:11 by Ultrano
I know the float->int and int->float conversion instructions, but unfortunately they are quite slow. For vectors it also requires extra movhlps instructions. The approach with the 'MAGIC' number keeps everything in the floating-point pipeline and is very compact. So I was hoping that maybe someone knew how to correct it for x = 1.0, preferably without extra overhead. Maybe I'm asking for the impossible but it would be really nice for the applications I'm working on to have a really fast way to compute a float's fraction.
Posted on 2006-06-01 10:22:31 by C0D1F1ED
You were almost there; you just had to add two extra checks:
1- If the result is 1, change it to 0
2- Mask out those pesky negative values

Here's a packed single FP vector version

.data
align 16
MAGIC dd 12582912.0, 12582912.0, 12582912.0, 12582912.0
HALF  dd 0.5, 0.5, 0.5, 0.5
ONES dd 1.0, 1.0, 1.0, 1.0
MASKSIGN dd 7FFFFFFFh, 7FFFFFFFh, 7FFFFFFFh, 7FFFFFFFh 
.code
;;in XMM0 out XMM0
Fraction:
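  ANDPS XMM0, DQWORD [MASKSIGN] ;clear the sign bits, x = |x|
  MOVAPS XMM2, DQWORD [HALF]
  MOVAPS XMM5, DQWORD [ONES]
  MOVAPS XMM3, XMM0
  MOVAPS XMM1, DQWORD [MAGIC]
  SUBPS XMM3, XMM2 ;x - 0.5
  ADDPS XMM3, XMM1 ;adding MAGIC rounds away the fraction bits
  SUBPS XMM3, XMM1 ;round(x - 0.5)
  SUBPS XMM0, XMM3 ;x - round(x - 0.5)
  MOVAPS XMM4, XMM0
  CMPPS XMM4, XMM5, 100b ;!= 1
  PAND XMM0, XMM4 ;zero the lanes where the result was 1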


It should work for all positive and negative numbers.
x-Floor(x)
Posted on 2006-06-02 20:36:20 by r22
That's awesome, thanks!

I had to remove the sign masking to make it work with negative arguments though. I'm trying to get the behaviour of the C floor function. I also don't fully understand why you load HALF and ONES into registers. They're only used once so you can use them directly as source operands and save a few instructions (and registers). When executing it in a loop it's obviously best to load them from memory into registers first...
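
To compare the two variants discussed here, the routine can be modeled one lane at a time in C. This is a sketch under assumptions: frac_fixed and its mask_sign flag are hypothetical names, the default round-to-nearest-even mode is assumed, and the final ternary only stands in for the CMPPS/PAND pair.

```c
/* Scalar C model of one lane of the Fraction routine.
   frac_fixed and mask_sign are hypothetical, for illustration only. */
static float frac_fixed(float x, int mask_sign)
{
    const float HALF  = 0.5f;
    const float MAGIC = 12582912.0f;   /* 2^23 + 2^22 */

    if (mask_sign && x < 0.0f)
        x = -x;                        /* approximates ANDPS with MASKSIGN */

    /* volatile forces every intermediate to be rounded to single precision */
    volatile float r = x - HALF;       /* SUBPS */
    r = r + MAGIC;                     /* ADDPS */
    r = r - MAGIC;                     /* SUBPS, r = round(x - 0.5) */

    float y = x - r;                   /* SUBPS */
    return (y != 1.0f) ? y : 0.0f;     /* stands in for CMPPS (!= 1) + PAND */
}
```

With mask_sign set, frac_fixed(-0.25f, 1) returns 0.25f, the fraction of |x|. With it cleared, frac_fixed(-0.25f, 0) returns 0.75f, which is x - floor(x), and both 1.0f and -1.0f give 0.0f.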
Posted on 2006-06-03 14:08:27 by C0D1F1ED