Hi all,
I have a problem when I want to keep only the last 4 bits of a register.

Let's say,
al=4Dh ; "M", 01001101b
Now how do I clear the most significant part to leave al=00001101b?

And one more question:

how do I swap them (the upper and lower 4 bits)?

yanda
Posted on 2003-06-29 02:39:25 by Yanda
al=4d ; "M", 01001101b

start:

and al,00001111b ;al==00001101
shl al,4 ;al==11010000 <-- or maybe that was a different question :tongue:

end start

;swap the two 4-bit halves
mov al, 01001101b
shl eax,4
or al,ah ;al==11010100 (assumes ah was 0 before the shift)
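
For anyone who wants to check the shift-and-OR swap on paper, here is a small C model of it. It is only a sketch: the function names are made up for this illustration, and it models the register values, not the timing. It also shows why the trick depends on what was in AH beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "shl eax,4 / or al,ah" and returns the new AL.
   eax_in is the full 32-bit EAX before the shift. */
uint8_t swap_via_shift_or(uint32_t eax_in)
{
    uint32_t eax = eax_in << 4;                  /* shl eax,4        */
    uint8_t al = (uint8_t)(eax & 0xFF);          /* bits 0..7 = AL   */
    uint8_t ah = (uint8_t)((eax >> 8) & 0xFF);   /* bits 8..15 = AH  */
    return (uint8_t)(al | ah);                   /* or al,ah         */
}

/* Usage check with the thread's value 4Dh ('M'). */
void check_swap_via_shift_or(void)
{
    /* AH clear before the shift: the nibbles are swapped. */
    assert(swap_via_shift_or(0x0000004Du) == 0xD4);
    /* Stale AH = 03h before the shift: its bits leak into the result. */
    assert(swap_via_shift_or(0x0000034Du) == 0xF4);
}
```

So if EAX is not known to be clean above AL, it would need to be zero-extended first (for example with movzx) before this sequence gives the swapped value.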
Posted on 2003-06-29 02:47:09 by AceEmbler
and al,0fh

to swap, try:
ror al, 4

Anyway Ace,
When you shift al left, the data in the upper half of al is lost.
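
A rough C model of the three byte operations mentioned so far, in case it helps to see the values side by side. The function names are invented for this sketch; each one mirrors a single instruction acting on AL.

```c
#include <assert.h>
#include <stdint.h>

/* and al,0Fh : keep the low nibble, clear the high one */
uint8_t low_nibble(uint8_t al) { return (uint8_t)(al & 0x0F); }

/* ror al,4 : rotating an 8-bit value by 4 swaps its two nibbles */
uint8_t swap_nibbles(uint8_t al) { return (uint8_t)((al >> 4) | (al << 4)); }

/* shl al,4 : the old high nibble is shifted out of AL and lost */
uint8_t shift_left_4(uint8_t al) { return (uint8_t)(al << 4); }

/* Usage check with 4Dh ('M', 01001101b). */
void check_nibble_ops(void)
{
    assert(low_nibble(0x4D) == 0x0D);                  /* 00001101b */
    assert(swap_nibbles(0x4D) == 0xD4);                /* 11010100b */
    assert(shift_left_4(0x4D) == 0xD0);                /* 11010000b, the 4 is gone */
    assert(swap_nibbles(swap_nibbles(0x4D)) == 0x4D);  /* rotating twice restores it */
}
```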
Posted on 2003-06-29 02:54:38 by roticv

Anyway Ace,
When you shift al left, the data in the upper half of al is lost.

I know, but I don't see a problem.

BTW, which is faster?

{
mov al, 01001101b
shl eax,4
or al,ah
}

or

ror al,4
Posted on 2003-06-29 02:59:18 by AceEmbler
Thanks guys,

Why use AND, and not OR or XOR?
Posted on 2003-06-29 04:39:53 by Yanda
Thanks guys,

Why use AND, and not OR or XOR?

This shows that your knowledge of the binary bitwise operators is very shallow.
Remember that
1 and 0 = 0
0 and 1 = 0
0 and 0 = 0
1 and 1 = 1

Thus, with the correct hex value as a mask, AND can be used to clear particular bits.

OR sometimes acts like addition. Its use is to set particular bits (the opposite of AND).

For XOR, if you XOR a value with something and then XOR the result with the same thing again, you get back the original value (thus it is quite useful in encryption). Furthermore, if you XOR a value with itself you get 0.
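
The properties above can be checked with a few lines of C. This is only an illustrative sketch: `xor_with_key` and the key value 5Ah are made up for the example, and a single-byte XOR is a toy, not real encryption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy cipher: XOR with the same key twice restores the input. */
uint8_t xor_with_key(uint8_t v, uint8_t key) { return (uint8_t)(v ^ key); }

void check_bitwise_properties(void)
{
    uint8_t m = 0x4D;                       /* 'M', 01001101b */

    assert((m & 0x0F) == 0x0D);             /* AND clears bits where the mask is 0 */
    assert((m | 0x80) == 0xCD);             /* OR sets bits where the mask is 1    */

    assert((0xD0 | 0x04) == 0xD0 + 0x04);   /* no shared bits: OR equals ADD       */
    assert((0x0C | 0x04) != 0x0C + 0x04);   /* shared bit: OR and ADD differ       */

    assert(xor_with_key(xor_with_key(m, 0x5A), 0x5A) == m); /* round trip */
    assert((m ^ m) == 0);                   /* a value XORed with itself is 0      */
}
```

The OR/ADD lines show why OR only "sometimes" acts like addition: it matches ADD only when the two operands have no set bits in common.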
Posted on 2003-06-29 06:32:26 by roticv
ror is much better in both size and clock cycles.
Posted on 2003-06-29 10:09:51 by realvampire
Yanda, you might want to play around w/ this a bit

(page includes a lousy bmp image of tool )
will fix to jpg or something later
Posted on 2003-06-29 10:35:18 by Brad

ror is much better in both size and clock cycles.

I can't believe it. I'm using only 2 simple instructions

shl eax,4
or al,ah

just cant be slower than

ror al,4
Posted on 2003-06-29 13:06:56 by AceEmbler
"shl eax, 4/ or al, ah"

This will be VERY slow on a PPro, P2, P3, P4, Athlon, or Duron. It'll cause a partial register stall, and so force the pipeline to flush while it re-executes the instructions.

Mirno
Posted on 2003-06-29 15:03:28 by Mirno
I agree with Mirno,

I'm not very well versed in optimizations, but my rule of thumb is to stay in 32 bit as long as you can, simply because the pipeline is designed to operate primarily on 32 bits.

Perhaps something like (Q1):

mov edx, 'M'
shl edx, 4
movzx eax, dl

EDX = 00000000 00000000 00000000 01001101
EDX = 00000000 00000000 00000100 11010000
EAX = 00000000 00000000 00000000 11010000

(Q2):

mov edx, 'M'
shl edx, 4
movzx eax, dl
shr edx, 8
or eax, edx

EDX = 00000000 00000000 00000000 01001101
EDX = 00000000 00000000 00000100 11010000
EAX = 00000000 00000000 00000000 11010000
EDX = 00000000 00000000 00000000 00000100
EAX = 00000000 00000000 00000000 11010100

Probably not the most optimal, and I'm not trying to show that, but it is all 32 bit (with the possible exception of movzx; this could be replaced with an AND though).
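
As a sanity check of the (Q2) register trace, here is a C model of the same five steps, with the movzx written as an AND as suggested. The function name is made up for this sketch, and it only models the values, not how fast the real instructions run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: all-32-bit nibble swap of one byte, following (Q2). */
uint32_t swap_nibbles_32(uint32_t ch)
{
    uint32_t edx = ch & 0xFF;   /* mov edx, 'M' (input kept to one byte)   */
    uint32_t eax;

    edx <<= 4;                  /* shl edx, 4                              */
    eax = edx & 0xFF;           /* movzx eax, dl, expressed as an AND mask */
    edx >>= 8;                  /* shr edx, 8   (old high nibble, now low) */
    eax |= edx;                 /* or eax, edx                             */
    return eax;
}

void check_swap_nibbles_32(void)
{
    assert(swap_nibbles_32(0x4D) == 0xD4);  /* 01001101b -> 11010100b */
    assert(swap_nibbles_32(0xD4) == 0x4D);  /* swapping again restores it */
}
```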

Anywho, just my thoughts.
:NaN:
Posted on 2003-07-01 13:14:25 by NaN

For XOR, if you XOR a value with something and then XOR the result with the same thing again, you get back the original value (thus it is quite useful in encryption). Furthermore, if you XOR a value with itself you get 0.
XOR is useful for complementing bits. (Bit level NOT.)
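
A short C sketch of that point, with an invented function name: XOR with an all-ones mask gives the same result as NOT, and XOR with a partial mask complements only the selected bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: complement exactly the bits that are set in mask. */
uint8_t complement_bits(uint8_t v, uint8_t mask) { return (uint8_t)(v ^ mask); }

void check_complement_bits(void)
{
    uint8_t m = 0x4D;                                  /* 01001101b */

    assert(complement_bits(m, 0xFF) == (uint8_t)~m);   /* same result as NOT */
    assert(complement_bits(m, 0xFF) == 0xB2);          /* 10110010b */
    assert(complement_bits(m, 0x0F) == 0x42);          /* only the low nibble flips */
}
```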
Posted on 2003-07-01 14:57:28 by tenkey
Ya the basic rule of thumb properties are:

AND to filter/mask bits
AND to clear bits
OR to set bits
XOR to toggle bits
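
The four rules can be written as single-bit helpers in C. This is just one possible sketch; the function names and the bit-index parameter `n` (0 = least significant bit) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* AND with a one-bit mask to filter/test a bit (nonzero result means set). */
int     test_bit(uint8_t v, unsigned n)   { return (v & (1u << n)) != 0; }
/* AND with the inverted mask to clear a bit. */
uint8_t clear_bit(uint8_t v, unsigned n)  { return (uint8_t)(v & ~(1u << n)); }
/* OR with the mask to set a bit. */
uint8_t set_bit(uint8_t v, unsigned n)    { return (uint8_t)(v | (1u << n)); }
/* XOR with the mask to toggle a bit. */
uint8_t toggle_bit(uint8_t v, unsigned n) { return (uint8_t)(v ^ (1u << n)); }

void check_bit_helpers(void)
{
    uint8_t m = 0x4D;                       /* 01001101b */

    assert(test_bit(m, 0) == 1);
    assert(test_bit(m, 1) == 0);
    assert(clear_bit(m, 6) == 0x0D);        /* 00001101b */
    assert(set_bit(m, 1) == 0x4F);          /* 01001111b */
    assert(toggle_bit(m, 0) == 0x4C);       /* 01001100b */
    assert(toggle_bit(toggle_bit(m, 0), 0) == m);
}
```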

:NaN:
Posted on 2003-07-02 17:00:16 by NaN

"shl eax, 4/ or al, ah"

This will be VERY slow on a PPro, P2, P3, P4, Athlon, or Duron. It'll cause a partial register stall, and so force the pipeline to flush while it re-executes the instructions.

Mirno

Let me add a little bit.
One needs to read Mirno's post with the following at the end:
"when one adds something reading eax or ax right after or al,ah."

When one reads al or ah, no partial register stall happens. So, as an artificial example, if I do
``````
shl eax,4
or al,ah
cmp al,cl
mov eax,0
je wheeeee
...
``````

then there is no partial register stall in this code.

BTW, the "flushed pipeline" is not the decoding pipeline but the u-op retirement. (Probably all of you already know this. But let me clarify. :) ) The CPU waits for all u-ops to retire so that it knows what eax is. So it is cheaper than something that causes the decoding pipeline to flush, e.g. a branch misprediction.
Posted on 2003-07-02 17:35:58 by Starless