Hi all,

To compute the fraction of a float with SSE, I'm trying to use the following:

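// HALF = 0.5
// MAGIC = 12582912.0 (2^23 + 2^22, forcing fraction bits out of the mantissa)
movss xmm0, x
subss xmm0, HALF    // x - 0.5
addss xmm0, MAGIC   // fraction bits are rounded away here
subss xmm0, MAGIC   // back to x - 0.5, rounded to an integer
movss xmm1, x
subss xmm1, xmm0    // x - rounded value
movss y, xmm1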

This corresponds almost completely to y = x - floor(x). Unfortunately, it doesn't work correctly for x = 1.0. The output should be 0.0 but it's 1.0.

So I was wondering whether anyone knew a way to make it behave exactly like y = x - floor(x).

Thanks!
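For anyone who wants to try this outside an assembler, here is a minimal C sketch of the same sequence. It assumes IEEE-754 single precision and the default round-to-nearest-even mode; `frac_magic` is a name made up for this sketch, and the `volatile` temporary is only there to keep the compiler from folding the add/subtract pair.

```c
/* MAGIC = 2^23 + 2^22: at this magnitude a float has no fraction bits left. */
static const float MAGIC = 12582912.0f;
static const float HALF = 0.5f;

/* Hypothetical helper mirroring the subss/addss/subss/subss sequence. */
static float frac_magic(float x)
{
    volatile float t = x - HALF;  /* subss xmm0, HALF  */
    t = t + MAGIC;                /* addss xmm0, MAGIC */
    t = t - MAGIC;                /* subss xmm0, MAGIC */
    return x - t;                 /* subss xmm1, xmm0  */
}
```

Under those assumptions `frac_magic(1.25f)` gives 0.25f, but `frac_magic(1.0f)` gives 1.0f: x - 0.5 lands exactly halfway between two integers and the tie goes to the even one. The same happens for 3.0f, so it looks like odd integer inputs in general are affected, not just 1.0.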

SSE

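movss xmm0,x
movss xmm1,xmm0 ; keep a copy of x
subss xmm0,HALF
cvtss2si eax,xmm0 ; rounds according to MXCSR (round-to-nearest by default)
cvtsi2ss xmm0,eax
subss xmm1,xmm0 ; x - rounded(x - 0.5)
movss y,xmm1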

A bit optimized, I guess:

movss xmm0,x
subss xmm0,HALF
cvtss2si eax,xmm0
movss xmm1,x
cvtsi2ss xmm0,eax
subss xmm1,xmm0
movss y,xmm1

FPU

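fld x
fld ST ; duplicate x
fsub HALF
frndint ; rounds according to the FPU control word
fsub ; no operands: ST(1) = ST(1) - ST(0), then pop
fstp y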

I know the float->int and int->float conversion instructions, but unfortunately they are quite slow. For vectors they also require extra movhlps instructions. The approach with the 'MAGIC' number keeps everything in the floating-point pipeline and is very compact. So I was hoping that someone knew how to correct it for x = 1.0, preferably without extra overhead. Maybe I'm asking for the impossible, but a really fast way to compute a float's fraction would be very useful for the applications I'm working on.

You were almost there; you just had to do two extra checks:

1- Make sure that if the result is 1, it gets changed to 0

2- Mask out those pesky negative values

Here's a packed single FP vector version:

.data
align 16
MAGIC dd 12582912.0, 12582912.0, 12582912.0, 12582912.0
HALF dd 0.5, 0.5, 0.5, 0.5
ONES dd 1.0, 1.0, 1.0, 1.0
MASKSIGN dd 7FFFFFFFh, 7FFFFFFFh, 7FFFFFFFh, 7FFFFFFFh
.code
;;in XMM0 out XMM0
Fraction:
ANDPS XMM0, DQWORD [MASKSIGN] ; clear the sign bits
MOVAPS XMM2, DQWORD [HALF]
MOVAPS XMM5, DQWORD [ONES]
MOVAPS XMM3, XMM0
MOVAPS XMM1, DQWORD [MAGIC]
SUBPS XMM3, XMM2 ; x - 0.5
ADDPS XMM3, XMM1 ; + MAGIC rounds away the fraction
SUBPS XMM3, XMM1 ; - MAGIC
SUBPS XMM0, XMM3 ; x - rounded value
MOVAPS XMM4, XMM0
CMPPS XMM4, XMM5, 100b ;!= 1
PAND XMM0, XMM4 ; zero the lanes that came out as 1

It should work for all positive and negative numbers.

x-Floor(x)
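A per-lane C sketch may help to see what each step of the packed routine does. It assumes IEEE-754 single precision and round-to-nearest-even; `fraction_lane` and `fraction4` are hypothetical names, and the loop only stands in for the four SSE lanes.

```c
#include <stdint.h>
#include <string.h>

#define MAGIC 12582912.0f   /* 2^23 + 2^22 */
#define HALF  0.5f
#define ONE   1.0f

/* Hypothetical helper: one lane of the packed routine. */
static float fraction_lane(float x)
{
    uint32_t bits, mask;
    volatile float t;

    /* ANDPS with MASKSIGN: clear the sign bit, i.e. take |x|. */
    memcpy(&bits, &x, sizeof bits);
    bits &= 0x7FFFFFFFu;
    memcpy(&x, &bits, sizeof x);

    /* SUBPS/ADDPS/SUBPS: round x - 0.5 to an integer via MAGIC. */
    t = x - HALF;
    t = t + MAGIC;
    t = t - MAGIC;
    x = x - t;

    /* CMPPS with predicate 100b: all-ones where x != 1.0, else all-zeros. */
    mask = (x != ONE) ? 0xFFFFFFFFu : 0u;

    /* PAND: a lane that came out as 1.0 becomes +0.0. */
    memcpy(&bits, &x, sizeof bits);
    bits &= mask;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Applies the lane helper to four floats in place. */
static void fraction4(float v[4])
{
    for (int i = 0; i < 4; ++i)
        v[i] = fraction_lane(v[i]);
}
```

With the sign mask in place the sketch returns 0.25 for -0.25, so it computes the fraction of |x| rather than x - floor(x) for negative inputs.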

That's awesome, thanks!

I had to remove the sign masking to make it work with negative arguments though. I'm trying to get the behaviour of the C floor function. I also don't fully understand why you load HALF and ONES into registers. They're only used once so you can use them directly as source operands and save a few instructions (and registers). When executing it in a loop it's obviously best to load them from memory into registers first...
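Dropping the sign mask can be sketched in C like this, again assuming IEEE-754 floats and round-to-nearest-even. `frac_no_signmask` is a made-up name, and the conditional is a scalar stand-in for the CMPPS/PAND pair, not the SSE code itself.

```c
/* 2^23 + 2^22: adding this pushes the fraction bits out of a float. */
static const float MAGIC = 12582912.0f;

/* Hypothetical helper: the packed routine's steps without the ANDPS. */
static float frac_no_signmask(float x)
{
    volatile float t = x - 0.5f;
    t = t + MAGIC;
    t = t - MAGIC;                  /* x - 0.5 rounded to an integer */
    float f = x - t;                /* 1.0 shows up for odd integer x */
    return (f != 1.0f) ? f : 0.0f;  /* stand-in for CMPPS + PAND */
}
```

For the handful of values I tried this matches x - floor(x): -0.25 gives 0.75, -1.5 gives 0.5, and -1.0, -3.0 and 1.0 all give 0. Inputs that are tiny compared with 0.5 or large compared with 2^22 are outside what this sketch checks.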