Hi Guys,
I want to speed up my application a bit, so I want to use SSE...
Of course my problem is that I can't be sure everything is properly aligned (this can happen when I only aggregate parts of a vector).
I think it should work if I check whether the address is on a 16-byte boundary, but I don't know how. How can I do this?
Suppose I have the base address in EAX, so I do a simple:
AND EAX, 16
JNZ Unaligned
...
Aligned SSE-II code...
Unaligned:
Unaligned SSE-II code here
Can I do it this way? Or will the jump slow down things horribly?
Sure, that is commonly done, but it should be "AND EAX,-16". The other option would be to enforce alignment through the application, but that can always be done later - there is always more work to do. :)
I think you should force the data to be aligned to 16. SSE is meant to work with aligned data.
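One way to force alignment from the application side is to over-allocate by 15 bytes and round the start address up to the next 16-byte boundary. The sketch below is only an illustration in C; `align_up_16` is a made-up helper name, not something from this thread. The mask in it is the same one that `AND EAX,-16` applies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round an address up to the next 16-byte boundary.
   Adding 15 and then clearing the low four bits (an AND with -16) yields
   the first aligned address at or after p. */
static void *align_up_16(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + 15) & ~(uintptr_t)15);
}
```

Typical use would be something like `raw = malloc(bytes + 15); data = align_up_16(raw);`, keeping `raw` around so it can be passed to `free` later.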
Don't you mean TEST AL,15?
Yeah, TEST AL,15 is the way to check for alignment. I really just wanted him to align the address. :P :oops:
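For anyone doing this from C rather than assembly, the same check can be sketched as below. The function names are invented for illustration. The second function reproduces the test from the original question to show why it fails: `AND EAX,16` looks only at bit 4, while alignment needs bits 0 to 3 to be zero.

```c
#include <stdint.h>

/* Nonzero when p lies on a 16-byte boundary, i.e. the low four address
   bits are all zero. This mirrors TEST AL,15 followed by JZ/JNZ. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* The test from the original question, kept only for comparison: it
   inspects bit 4 alone, so it misclassifies many addresses. */
static int bit4_clear(const void *p)
{
    return ((uintptr_t)p & 16) == 0;
}
```

For example, address 0x1004 is not 16-byte aligned, yet its bit 4 is clear, so the `AND EAX,16` version would take the aligned path. Address 0x1010 is aligned, but its bit 4 is set, so that version would take the unaligned path.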
Sure, the data is aligned. The problem occurs when I have to add two vectors: sometimes I have to sum up only parts of a vector, so I can't be sure the data is aligned in that case.
I also have to access an old product to get data from... the data on that side is mostly misaligned.
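For the partial-vector case, a rough sketch of the check-then-branch idea using C intrinsics might look like this. It assumes an x86 compiler that provides `<xmmintrin.h>`, and `add_floats` is a hypothetical name. The alignment test runs once per call, outside the loop, so the cost of the jump should be small compared with the work done inside the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics (assumes an x86 target) */

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats.
   The pointers may point into the middle of larger vectors. */
static void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    /* OR the addresses together: any low bit set in any pointer survives. */
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)a | (uintptr_t)b;

    if ((bits & 15) == 0) {
        /* All three pointers are 16-byte aligned: aligned loads and stores. */
        for (; i + 4 <= n; i += 4)
            _mm_store_ps(dst + i,
                         _mm_add_ps(_mm_load_ps(a + i), _mm_load_ps(b + i)));
    } else {
        /* At least one pointer is misaligned: unaligned loads and stores. */
        for (; i + 4 <= n; i += 4)
            _mm_storeu_ps(dst + i,
                          _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    }

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Calling it on a sub-range, e.g. `add_floats(out, x + 1, y + 1, 7)`, would normally take the unaligned path, while whole vectors from an aligned allocation take the aligned one.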
What about reading everything into memory and then making use of the packing and unpacking instructions to extract what you need?