Here is a quick code fragment I wrote to calculate the hamming distance between two bytes in registers AL and AH. The Hamming Distance is stored in DX. Can anybody find if I made any errors? Is the logic right in this code? Thanks.
calculate:
cmp AL, 1
je cal1
cal0:
cmp AH, 0
jne no_match
jmp match
cal1:
cmp AH, 1
jne no_match
jmp match
no_match:
inc DX
inc CX
cmp CX, 8
je done
shl AL, 1
shl AH, 1
jmp calculate
match:
inc CX
cmp CX, 8
je done
shl AH, 1
shl AL, 1
jmp calculate
Well, "cal1" is a lousy name for a label - looks too much like the "call" instruction. But it'll work...
As for your logic... try a test case. Imagine al=2, ah=2. My memory is fuzzy on "Hamming distance", but I think it should be zero. First compare fails, so we "fall through". Second compare fails, so we go to "no_match" and bump the counter in dx. No, your logic isn't right. :(
You might want to take a look at the "test" instruction. It'll check to see if just one bit is one or zero.
Better yet, look at the "xor" instruction. If you were to "xor al, ah", bits that were the same in each would be zero, bits that differ would be one. Now (if I remember what "Hamming distance" is) the number of set bits in al would be the Hamming distance. Couning them would be a "population count" - Google for "efficient" ways to do a "population count"... I think there's even a "popcnt" instruction on some CPUs (?).
A naive way to do it would involve the fact that a bit "shifted out" of a register ends up in the carry flag. The "adc" instruction includes the carry flag in its add. "adc dx, 0" after a shift, looped through 8 times, would count 'em for ya.
I suspect that something like that is what you're "supposed" to be doing for this. Your method could probably be made to work, too, but not with "cmp al, 1".
Best,
Frank
As for your logic... try a test case. Imagine al=2, ah=2. My memory is fuzzy on "Hamming distance", but I think it should be zero. First compare fails, so we "fall through". Second compare fails, so we go to "no_match" and bump the counter in dx. No, your logic isn't right. :(
You might want to take a look at the "test" instruction. It'll check to see if just one bit is one or zero.
Better yet, look at the "xor" instruction. If you were to "xor al, ah", bits that were the same in each would be zero, bits that differ would be one. Now (if I remember what "Hamming distance" is) the number of set bits in al would be the Hamming distance. Couning them would be a "population count" - Google for "efficient" ways to do a "population count"... I think there's even a "popcnt" instruction on some CPUs (?).
A naive way to do it would involve the fact that a bit "shifted out" of a register ends up in the carry flag. The "adc" instruction includes the carry flag in its add. "adc dx, 0" after a shift, looped through 8 times, would count 'em for ya.
I suspect that something like that is what you're "supposed" to be doing for this. Your method could probably be made to work, too, but not with "cmp al, 1".
Best,
Frank
Does this look any better?
check:
and AL, 1
jz bit_zero
jnz bit_one
bit_zero:
and AH, 1
jz match
jnz no_match
bit_one:
and AH, 1
jz no_match
jnz match
no_match:
inc CX
inc DX
cmp CX, 8
je done
shl AL, 1
shl AH, 1
jmp check
match:
inc CX
cmp CX, 8
je done
shl AL, 1
shl AH, 1
jmp check
"and" alters the first operand. Is this what you want? Look into the difference between "and" and "test". You're on the right track.
Best,
Frank
Best,
Frank
Isn't it simpler to calculate population (i.e. number of bits set) of al XOR ah? There are many tricks to do that (keywords: bit twiddling hacks), SSE4.2 even include opcode for that.
;;Hamming Distance AL <-> AH
;;USES/DESTROYES AL CX DX
XOR AL, AH ;;AL now has set bits for all differences with AH
MOV CX, 8 ;;set loop counter to 8
MOV DX, 0 ;;ZERO DX
CountAndAccum: ;;loop label for jump
SHL AL, 1 ;;Shift AL 1 bit to the left
ADC DX, 0 ;;Add AL's shifted bit to DX
DEC CX ;;subtract 1 from CX
JNZ CountAndAccum ;;WHILE CX != 0 LOOP
Writing in ASM doesn't given you "open season" to throw out good programming practices.
- COMMENT YOUR CODE SO IT CAN BE UNDERSTOOD
- USE MEANINGFUL LABEL NAMES
r22,
XOR DX, DX is Intel®'s approved method of setting DX to 0.
Obvious comments should be banned, period.
No need for loop counter (how much bits in 00h are 1? ;-).
It shifts ah one more time than needed, but that greatly simplifies loop exit condition.
XOR DX, DX is Intel®'s approved method of setting DX to 0.
Obvious comments should be banned, period.
No need for loop counter (how much bits in 00h are 1? ;-).
;;; Hamming distance
;;; Expects: codes in ah and al
;;; Returns: distance in ax
;;; Uses/Destroys: flags ;-)
xor ah, al ; ah contains 1 in different bit positions
xor al, al ; al will count them
next: shr ah, 1 ; CF contains next bit from ah
ja next ; if neither CF nor ZF set, shift again
jnc done ; if CF not set then ZF set from shr => ah==0 => done
inc al ; got 1 bit, count it
jmp next
done: ; ah == 0, al (and ax therefore) == Hamming distance
It shifts ah one more time than needed, but that greatly simplifies loop exit condition.
Thanks for everyone's help.
An unneeded addendum to the thread follows.
If simplicity/optimization was the purpose then I would of used a LUT (Look Up table) and a single MOV instruction.
I don't see how adding three jumps where there was only one can simplify anything. But removing the loop counter does make sense. The most efficient/simplest algorithmic method would then be...
SHL / LABEL / ADC / SHL / JNZ LABEL ?
Empirically speaking, MOV Reg, 0 vs XOR Reg, Reg, is one of the more useless optimization techniques for GPRs (general purpose registers). But with XMMx registers it can actually make a difference in timing. Some people also try to argue that using XOR obfuscates the code and can introduce errors like XOR CX, DX looking similar to XOR DX, DX etc etc but I'm not one of those people.
Also, the "obviousness" of comments is rather subjective. How asinine it would be if you could trace an ASM routine simply by reading the comments; taking all the joy and "magic" out of the experience.
If simplicity/optimization was the purpose then I would of used a LUT (Look Up table) and a single MOV instruction.
I don't see how adding three jumps where there was only one can simplify anything. But removing the loop counter does make sense. The most efficient/simplest algorithmic method would then be...
SHL / LABEL / ADC / SHL / JNZ LABEL ?
Empirically speaking, MOV Reg, 0 vs XOR Reg, Reg, is one of the more useless optimization techniques for GPRs (general purpose registers). But with XMMx registers it can actually make a difference in timing. Some people also try to argue that using XOR obfuscates the code and can introduce errors like XOR CX, DX looking similar to XOR DX, DX etc etc but I'm not one of those people.
Also, the "obviousness" of comments is rather subjective. How asinine it would be if you could trace an ASM routine simply by reading the comments; taking all the joy and "magic" out of the experience.