Hi all,

I was wondering if anyone has or knows where to find SSE code for inverting a 3x3 matrix. It's not that hard to write my own, but my first attempt requires a lot of shufps operations, which makes it almost no faster than using scalar code... So any help is welcome.

Thanks,

Nicolas

I tried to find patterns and didn't find much, but I'll share my notes in case they help.

Note that this only gives you the multiplied pairs; the subtractions might have to be done in scalar code.

X1 = a11   X2 = a12   X3 = a13

X4 = a21   X5 = a22   X6 = a23

X7 = a31   X8 = a32   X9 = a33

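The 2x2 determinants needed for the inverse (each pair of lines holds three of them, top row over bottom row):

X5 X6   X3 X2   X2 X3
X8 X9   X9 X8   X5 X6

X6 X4   X1 X3   X3 X1
X9 X7   X7 X9   X6 X4

X4 X5   X2 X1   X1 X2
X7 X8   X8 X7   X4 X5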
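
For reference, here is a plain scalar sketch of what those nine 2x2 determinants add up to. It assumes a row-major float[9] with m[0..8] = X1..X9; the function name and the return convention are made up for illustration. It is only meant as a baseline to check any SSE version against, not as a tuned implementation.

```c
/* Scalar reference: inverse of a row-major 3x3 matrix, m[0..8] = X1..X9.
   Returns 1 on success, 0 if the determinant is exactly zero.
   (Hypothetical helper, named here only for illustration.) */
int invert3x3_scalar(const float m[9], float inv[9])
{
    const float X1 = m[0], X2 = m[1], X3 = m[2];
    const float X4 = m[3], X5 = m[4], X6 = m[5];
    const float X7 = m[6], X8 = m[7], X9 = m[8];
    float adj[9];
    float det, rdet;
    int i;

    /* One 2x2 determinant per entry, in the same order as the layout. */
    adj[0] = X5 * X9 - X6 * X8;
    adj[1] = X3 * X8 - X2 * X9;
    adj[2] = X2 * X6 - X3 * X5;

    adj[3] = X6 * X7 - X4 * X9;
    adj[4] = X1 * X9 - X3 * X7;
    adj[5] = X3 * X4 - X1 * X6;

    adj[6] = X4 * X8 - X5 * X7;
    adj[7] = X2 * X7 - X1 * X8;
    adj[8] = X1 * X5 - X2 * X4;

    /* Determinant: expand along the first row of the input matrix. */
    det = X1 * adj[0] + X2 * adj[3] + X3 * adj[6];
    if (det == 0.0f)
        return 0;

    /* Scale the adjugate by 1/det to get the inverse. */
    rdet = 1.0f / det;
    for (i = 0; i < 9; ++i)
        inv[i] = adj[i] * rdet;
    return 1;
}
```

Each of the 18 products shows up exactly once in the nine subtractions, which is why the products can be computed in bulk first and subtracted afterwards.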

Multiplications needed (all 18 are unique):

(notice the second factors in each row, we'll pack those next)

X1*X5  X1*X6  X1*X8  X1*X9

X2*X4  X2*X6  X2*X7  X2*X9

X3*X4  X3*X5  X3*X7  X3*X8

X4*X8  X4*X9

X5*X7  X5*X9

X6*X7  X6*X8

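xmm regs 1, 2, 3: 1234 5678 9***

xmm regs 4, 5: 2345 6789 (read these starting 4 bytes, i.e. one float, past the start of the matrix)

5689 = 5678 shuffled with 6789

4679 = 4567 shuffled with 6789 (4567 is not one of the registers above, so it needs one more unaligned load)

4578 = 2345 shuffled with 5678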
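
In case it helps, the three shuffles above could look roughly like this with SSE intrinsics. This is only a sketch under a few assumptions: the matrix is a row-major float[9] with m[0..8] = X1..X9, everything is loaded with unaligned loads (so no alignment is required), 4567 comes from an extra load at offset 3, and the function name and output layout are invented for the example. It produces the 12 products that have X1, X2 or X3 as first factor; the subtractions are not done here.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: writes 12 products to out[]:
     out[0..3]  = X1 * {X5, X6, X8, X9}
     out[4..7]  = X2 * {X4, X6, X7, X9}
     out[8..11] = X3 * {X4, X5, X7, X8}
   m is row-major, m[0..8] = X1..X9. No load reads past m[8]. */
void products_x1_x2_x3(const float m[9], float out[12])
{
    /* Overlapping unaligned loads of four consecutive elements each. */
    __m128 r2345 = _mm_loadu_ps(m + 1);
    __m128 r4567 = _mm_loadu_ps(m + 3);
    __m128 r5678 = _mm_loadu_ps(m + 4);
    __m128 r6789 = _mm_loadu_ps(m + 5);

    /* shufps takes the two low result lanes from the first operand
       and the two high result lanes from the second operand. */
    __m128 r5689 = _mm_shuffle_ps(r5678, r6789, _MM_SHUFFLE(3, 2, 1, 0));
    __m128 r4679 = _mm_shuffle_ps(r4567, r6789, _MM_SHUFFLE(3, 1, 2, 0));
    __m128 r4578 = _mm_shuffle_ps(r2345, r5678, _MM_SHUFFLE(3, 2, 3, 2));

    /* Broadcast X1, X2, X3 and multiply by the packed second factors. */
    _mm_storeu_ps(out + 0, _mm_mul_ps(_mm_set1_ps(m[0]), r5689));
    _mm_storeu_ps(out + 4, _mm_mul_ps(_mm_set1_ps(m[1]), r4679));
    _mm_storeu_ps(out + 8, _mm_mul_ps(_mm_set1_ps(m[2]), r4578));
}
```

That is three shufps and three mulps for 12 of the 18 products. The remaining six (the ones starting with X4, X5, X6) are not covered by this sketch.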

Next, choose which two of X4, X5 and X6 to pair in one xmm register.
