Several years ago Intel issued application note AP-527, "Using MMX Instructions to Get Bits From a Data Stream". Unfortunately, google.com and archive.org don't turn up any useful references anymore. I only found a post here by Dr. Manhattan from the middle of 2001, so I wonder if someone has a copy of this document. Or does anybody have experience with getting more performance when parsing bit streams?

Right now the core looks like this:

mov edx, pxBand
mov eax,

mov ecx,
shr ecx, 3
and ecx,

mov eax,
bswap eax

mov ecx,
neg ecx
and ecx, 7
shl eax, cl

mov ecx, 32
sub ecx,
shr eax, cl

mov ecx, pulValue
mov , eax

Getting single bits out of a DWORD this way consumes a lot of time, and maybe MMX could provide better performance.
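For comparison, one common non-MMX way to cut the per-call cost is to keep unread bits in a wide accumulator and refill it only occasionally, instead of reloading and byte-swapping a DWORD on every read. The following is only a minimal C sketch of that idea, assuming an MSB-first bit stream; the `BitReader` type and the `br_*` names are hypothetical, are not taken from AP-527 or from the routine above, and I have not benchmarked it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader; names and layout are illustrative only. */
typedef struct {
    const uint8_t *data;  /* input bytes */
    size_t size;          /* number of input bytes */
    size_t pos;           /* index of the next byte to load */
    uint64_t acc;         /* unread bits, left-aligned (MSB first) */
    unsigned count;       /* number of valid bits in acc */
} BitReader;

static void br_init(BitReader *br, const uint8_t *data, size_t size)
{
    br->data = data;
    br->size = size;
    br->pos = 0;
    br->acc = 0;
    br->count = 0;
}

/* Top up the accumulator one byte at a time while a whole byte still fits. */
static void br_refill(BitReader *br)
{
    while (br->count <= 56 && br->pos < br->size) {
        br->acc |= (uint64_t)br->data[br->pos++] << (56 - br->count);
        br->count += 8;
    }
}

/* Read n bits (1..32), MSB first. Bits past the end of the input read as 0. */
static uint32_t br_get_bits(BitReader *br, unsigned n)
{
    assert(n >= 1 && n <= 32);
    if (br->count < n)
        br_refill(br);
    uint32_t value = (uint32_t)(br->acc >> (64 - n)); /* take the top n bits */
    br->acc <<= n;                                    /* drop the consumed bits */
    br->count = (br->count > n) ? br->count - n : 0;
    return value;
}
```

With the input bytes `A5 F0 12 34`, calling `br_get_bits` with widths 3, 5, 4, 12 and 8 in turn returns 5, 5, 15, 0x012 and 0x34. The refill loop runs only when the accumulator holds fewer bits than requested, so most reads are just a shift and a subtract.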

Thanks and take care,


Posted on 2005-11-21 09:30:46 by miracle
Hello miracle. This is the application note. There is also a file that I used in my jpeg decoder, if you want an example. I haven't benchmarked it so I don't know if it's faster than non-MMX instructions. Maybe it can be updated to use SSE2.
Posted on 2005-11-21 12:22:48 by Dr. Manhattan

Thanks a lot for the AppNote. :D I will have a closer look at it now and do some benchmarking.

Posted on 2005-11-25 10:02:19 by miracle