I am looking for the fastest routine to move a bitfield/bitstring from memory to memory in x86 assembler.
The number of bits to move can be as large as 8192 (not just 16 or 32).

The source bit position at which the move starts and the destination bit position are not aligned to any byte/word/dword boundary. In particular, the two generally have different alignments!

For example (here with only 3 and 4 bits; in practice it is anywhere from 30 to 8000 bits):
I have to move 3 bits from source byte address 40000000h, bit 4 (that is, bits 4, 5 and 6 of 40000000h are to be moved) to destination address 20000000h, bit 6 (that is, bits 6 and 7 of 20000000h and bit 0 of 20000001h are overwritten).
Then I have to "append" to the destination 4 bits from address 40000004h, bit 2 (that is, bits 2, 3, 4 and 5 of 40000004h are moved to bits 1, 2, 3 and 4 of 20000001h).

At the moment I use BT to test the source bit and BTR/BTS to reset/set the destination bit according to the carry flag left by BT (it would have been nice to have a BM "bit move" instruction...).
Unfortunately these instructions are relatively slow, because they break down into 6 micro-ops.
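
For reference, the bit-at-a-time approach described above corresponds roughly to the following sketch, written in C rather than assembler for readability. The name copy_bits_slow and its parameters are made up for illustration. It assumes bit offsets are counted from the base address with bit 0 as the least significant bit of each byte, which matches the example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy nbits bits one at a time.
 * Bit offset k addresses bit (k % 8) of byte (k / 8), bit 0 = LSB.
 * Hypothetical helper, not taken from any library. */
static void copy_bits_slow(uint8_t *dst, size_t dst_bit,
                           const uint8_t *src, size_t src_bit, size_t nbits)
{
    for (size_t i = 0; i < nbits; i++) {
        size_t s = src_bit + i;
        size_t d = dst_bit + i;
        unsigned bit = (src[s >> 3] >> (s & 7)) & 1u;  /* test source bit, like BT */
        uint8_t mask = (uint8_t)(1u << (d & 7));

        if (bit)
            dst[d >> 3] |= mask;                       /* set dest bit, like BTS */
        else
            dst[d >> 3] &= (uint8_t)~mask;             /* clear dest bit, like BTR */
    }
}
```

With the example above, the first move would be copy_bits_slow(dst, 6, src, 4, 3) and the append would be copy_bits_slow(dst, 9, src, 4 * 8 + 2, 4), where src and dst point at 40000000h and 20000000h.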

I had a look at the Fortran MVBITS intrinsic, but it seems to be limited to 32 bits on a byte boundary.

Has anyone had to solve a similar problem?

Best Regards
Posted on 2003-03-16 05:18:07 by JuergenM
You'll want to use the MMX/SSE2 shift instructions.
Look at the GMP source for examples.
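
To make the shift idea concrete, here is a minimal C sketch that moves up to 56 bits per step using 64-bit shifts and masks. It is not GMP's code and it uses plain 64-bit integers rather than MMX/SSE2 registers; the name copy_bits_chunked is made up for illustration. It assumes a little-endian host (as on x86), bit 0 as the least significant bit of each byte, and non-overlapping source and destination.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy nbits bits in chunks of up to 56 bits using shifts and masks.
 * Assumptions: little-endian host, bit 0 = LSB, regions do not overlap.
 * Hypothetical helper, not taken from GMP or any other library. */
static void copy_bits_chunked(uint8_t *dst, size_t dst_bit,
                              const uint8_t *src, size_t src_bit, size_t nbits)
{
    while (nbits > 0) {
        unsigned n  = nbits > 56 ? 56u : (unsigned)nbits;
        unsigned ss = (unsigned)(src_bit & 7);     /* bit shift inside source byte */
        unsigned ds = (unsigned)(dst_bit & 7);     /* bit shift inside dest byte   */
        size_t sbytes = (ss + n + 7) / 8;          /* bytes holding the chunk, <= 8 */
        size_t dbytes = (ds + n + 7) / 8;          /* likewise for the destination  */
        uint64_t mask = (UINT64_C(1) << n) - 1;    /* n <= 56, so the shift is defined */
        uint64_t s = 0, d = 0;

        /* Load only the bytes that contain the chunk, so nothing past it is read. */
        memcpy(&s, src + (src_bit >> 3), sbytes);
        memcpy(&d, dst + (dst_bit >> 3), dbytes);

        s = (s >> ss) & mask;                      /* align source bits to bit 0  */
        d = (d & ~(mask << ds)) | (s << ds);       /* merge into destination word */

        memcpy(dst + (dst_bit >> 3), &d, dbytes);

        src_bit += n;
        dst_bit += n;
        nbits   -= n;
    }
}
```

The 56-bit chunk size is chosen so that a chunk plus a shift of up to 7 bits always fits in one 64-bit word. An assembler version would replace the variable-length memcpy calls with fixed-width loads and stores and handle the first and last partial words separately.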

I like the BT/BTS/BTR instructions, too.
Very easy to whip up an algorithm. :)
Posted on 2003-03-16 11:34:12 by bitRAKE