If you want to think about the problem visually, just imagine a pixel moving around the screen and each itteration it goes north,south,east, or west. We have a 2D array of bits pointed to by EDI. EBX contains the current index into that array. (handy for: BT , EBX). EBP contains random bits. BPL is the number of bits per line. Each itteration two bits from EBP are used to determine the direction of the bit pointed to by EBX.

What is the optimal adjustment to EBX (by 1, -1, BPL, or -BPL) based on the random bits of EBP?

Here is obvious code:
	inc	ebx

add ebp, ebp
jc _BPL
add ebp, ebp
jc _x
sub ebx, 2
jmp _x
_BPL: dec ebx
add ebx, BPL
add ebp, ebp
jc _x
sub ebx, BPL
sub ebx, BPL
_x:
This code is not so good as I would like to find a branchless method of few instructions. Maybe someone here would like a little challenge? I have reduced it down to one branch:
	mov	eax, ebp

rol ebp, 2
sar eax, 30
cdq
sub eax, edx
jne @F
add edx, edx
mov eax, BPL
imul edx, eax
add eax, edx
@@:
add ebx, eax
I cannot help but think this is not even close to optimal - I know there is somthing rather simple that I forget.
Posted on 2004-01-05 00:25:24 by bitRAKE
Guess a table works best:
	mov	eax, ebp

rol ebp, 2
and eax, 11y
add ebx, [Table][eax*4]

Table DWORD 1,-1,BPL,-BPL
Posted on 2004-01-05 00:40:15 by bitRAKE
Can you can arrange the bits so that the first tells you if the result is positive and the second give you the absolute value of the result. The code would be :

0 0 -> +1
0 1 -> -1
1 0 -> +BPL
1 1 -> -BPL

That code should work :



shr ebp, 1 ;// ecx = result < 0 ? -1 : 0
sbb ecx, ecx
shr ebp, 1
sbb edx, edx ;// edx = abs(result) == BPL ? -1 : 0
and edx, BPL-1 ;// edx = abs(result) == BPL ? BPL-1 : 0
inc edx ;// edx = abs(result) == BPL ? BPL : 1
xor edx, ecx ;// set sign
sub edx, ecx
Posted on 2004-01-05 14:50:08 by Dr. Manhattan
Thank you, that works well and I can always spare the space in the instruction cache.
Posted on 2004-01-05 21:14:11 by bitRAKE