hi...
I need the fastest way to check whether a byte is a letter, i.e. whether it is in A to Z or a to z.
I'm sure there is a faster way than just doing a cmp for both cases.
thanks
bye
eko
EDIT:
; two ways based on Maverick's algo
lea ebx,[ecx-'A']
cmp ebx,'Z'-'A'
jc label
lea ebx,[ecx-'a']
cmp ebx,'z'-'a'
jc label
lea ebx,[ecx-'A']
cmp ebx,'z'-'A'
jc label
lea ebx,[ecx-'a'-1]
cmp ebx,'a'-1-'Z'+1
jnc label
If you don't care about the case of the result (and don't mind trashing non-letter characters), you can avoid the second comparison:
Before the comparison, do "or ecx, 32" or "and ecx, NOT(32)".
Mirno
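A rough C model of the case-folding idea, in case it helps to see it checked against the plain two-range test over all 256 byte values. The function names are made up for illustration, and the sketch assumes ecx holds a zero-extended byte.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: two explicit range checks. */
static int is_alpha_ref(uint32_t c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Models "or ecx, 32" followed by a single unsigned range check.
   Assumes the value is a zero-extended byte (0..255). */
static int is_alpha_fold(uint32_t c) {
    c |= 32u;                                   /* 'A'..'Z' -> 'a'..'z'; other bytes get trashed */
    return (c - 'a') <= (uint32_t)('z' - 'a');  /* unsigned wrap rejects c < 'a' */
}

/* Exhaustive comparison against the reference. */
static void verify_fold(void) {
    for (uint32_t c = 0; c < 256; ++c)
        assert(is_alpha_fold(c) == is_alpha_ref(c));
}
```

The fold never creates a false positive here because every value in 'a'..'z' has bit 5 set, so the only bytes that can land in that range are the letters themselves.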
.data
letmask dd 00000000h, 00000000h, 07FFFFFEh, 07FFFFFEh, 00000000h, 00000000h, 00000000h, 00000000h
.code
bt letmask,ecx
jnc not_letter
btw: on this board, the algo you mentioned above was first submitted by Nexo.
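For anyone who wants to see what the bit-string lookup does, here is a small C sketch using the same eight dwords as letmask. It only models the case where ecx is 0..255; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same eight dwords as letmask: bit n is set when byte value n is a letter. */
static const uint32_t letmask[8] = {
    0x00000000u, 0x00000000u, 0x07FFFFFEu, 0x07FFFFFEu,
    0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u
};

/* Models "bt letmask, ecx" for ecx in 0..255: pick the dword, then the bit. */
static int is_alpha_bitmap(uint8_t c) {
    return (int)((letmask[c >> 5] >> (c & 31)) & 1u);
}

/* Checks the table against two explicit range tests for every byte value. */
static void verify_bitmap(void) {
    for (int c = 0; c < 256; ++c) {
        int expected = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        assert(is_alpha_bitmap((uint8_t)c) == expected);
    }
}
```

'A' is 65, so it lives in dword 2 at bit 1, and 'Z' (90) at bit 26, which is why each of the two non-zero dwords is 07FFFFFEh.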
VERY VERY VERY VERY NICE !
:alright:
thanks
and I'm sorry, I didn't know. So Nexo, you should have the credit.
bye
eko
Check out the BitStrings utility.
It creates control blocks for you for any of the 2^256 possible condition cases.
It is designed to work easily with symbols (both OEM console and ANSI) and numbers.
I just put your conditions into it and copy-pasted the result into the post.
Alex,
The code you posted looks very efficient, but the technical data says that BT is a very slow instruction. Have you ever bothered to benchmark it against an algo that uses comparisons?
Regards,
hutch@movsd.com
but the technical data says that BT is a very slow instruction
My technical data doesn't use the words "fast" or "slow" for the speed of instructions, it uses numbers :)
And it says bt takes 4 to 8 clocks on <= PMMX
(depending on whether a memory or register operand is used)
and 1 uop to 1+6+1 uops on > PMMX.
To compare the two methods we then need to analyze what data will be passed to them. Depending on
the case, bt might be anywhere from 7 clocks per iteration slower to 37 clocks per iteration faster.
Note: I'm just playing around. By no means do I recommend this, nor would I ever use this. ;)
; in - cl = byte to check
; out - eax = 1 (is alpha)
; eax = 0 (is !alpha)
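isAlpha proc
push offset label2          ; return address for the ret bytes in the table
movzx edx, cl
add edx, offset label1      ; edx -> table entry for this byte
xor eax, eax                ; eax = 0, also clears CF
jmp edx
label1: db 65 dup(0C3h)     ; 0C3h = ret (not a letter, CF stays 0)
db 26 dup(0F9h)             ; 0F9h = stc ('A'..'Z'), falls through to the next ret
db 6 dup(0C3h)
db 26 dup(0F9h)             ; 'a'..'z'
db 133 dup(0C3h)
label2: jnc label3
inc eax
label3: ret
isAlpha endp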
Alex,
These are Agner Fog's comments on BT in his optimisation manual.
26.4 Bit test (all processors)
BT, BTC, BTR, and BTS instructions should preferably be replaced by instructions like TEST, AND, OR, XOR, or shifts on PPlain and PMMX. On PPro, PII and PIII, bit tests with a memory operand should be avoided.
Regards,
hutch@movsd.com
I was looking at the algo posted at the top of the thread (which doesn't seem to work for the letters Z or z btw), and an idea came to me. I got rid of 1 jump and it works... with Z even. ;)
Tell me what you think.
;; ecx = byte to check if it is alpha
and cl, 0DFh
lea eax, [ecx-'A']
cmp eax, 'Z' - 'A' + 1
jc _isAlpha
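To make the Z/z remark concrete, here is a C sketch of the and/lea/cmp/jc sequence with the compare limit as a parameter. The names are invented for the example, and it assumes the upper 24 bits of ecx are zero.

```c
#include <assert.h>
#include <stdint.h>

/* Models: and cl,0DFh / lea eax,[ecx-'A'] / cmp eax,LIMIT / jc taken.
   jc after cmp is taken when eax < LIMIT as unsigned numbers.
   Assumes the upper 24 bits of ecx are zero. */
static int range_check(uint32_t ecx, uint32_t limit) {
    ecx &= 0xDFu;               /* clear bit 5: fold lower case onto upper case */
    uint32_t eax = ecx - 'A';   /* wraps to a huge value when ecx < 'A' */
    return eax < limit;
}

/* Limit 'Z'-'A' as in the code at the top of the thread: misses the last letter. */
static int is_alpha_off_by_one(uint32_t c) { return range_check(c, 'Z' - 'A'); }

/* Limit 'Z'-'A'+1 as in the snippet above. */
static int is_alpha_fixed(uint32_t c) { return range_check(c, 'Z' - 'A' + 1); }

/* Exhaustive check of the fixed version. */
static void verify_fixed(void) {
    for (uint32_t c = 0; c < 256; ++c) {
        int expected = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        assert(is_alpha_fixed(c) == expected);
    }
}
```

With jc the comparison is a strict "below", so the limit has to be one past the last offset (26, not 25).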
Oh yeah, and would a sub be faster than a cmp? I have no idea.
Also, maybe it would be more efficient to use ecx instead of eax since the first op destroys ecx anyway.
Maybe something like this:
and cl, 0DFh
sub cl, 41h
sub cl, 1Ah
jc _isAlpha
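A quick C model of the and/sub/sub version, just to confirm the carry from the second sub is set for exactly the 52 letters. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models: and cl,0DFh / sub cl,41h / sub cl,1Ah / jc _isAlpha.
   Returns the carry flag left by the final sub. */
static int is_alpha_subsub(uint8_t cl) {
    cl &= 0xDF;                  /* fold case */
    cl = (uint8_t)(cl - 0x41);   /* letters land in 0..25; this sub's carry is ignored */
    return cl < 0x1A;            /* "sub cl,1Ah" borrows (CF=1) exactly when cl < 1Ah */
}

/* Checks every byte value and returns how many were accepted. */
static int count_letters(void) {
    int n = 0;
    for (int c = 0; c < 256; ++c) {
        int expected = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        assert(is_alpha_subsub((uint8_t)c) == expected);
        n += expected;
    }
    return n;
}
```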
BT, BTC, BTR, and BTS instructions should preferably be replaced by instructions like TEST, AND, OR, XOR, or shifts on PPlain and PMMX. On PPro, PII and PIII, bit tests with a memory operand should be avoided.
I doubt that TEST, AND, OR, XOR or shifts will be faster than bt on a 256-bit string,
but I'll check it anyway.
iblis: NICE! According to my tests your way is faster.
Can anyone else compare The Svin's way with iblis's way (the last one)?
bye
eko
Eko,
I ran a test using QueryPerformanceCounter, with process and thread priorities set to REAL_TIME, and the speeds varied depending on which char was in ecx...
For non-alphas, Svin's algo performed differently than for alphas: non-alphas took on average twice as long to process for some reason. Anybody know why? The other algo wasn't affected.
Regarding alignment, Svin's algo ran at greatly different speeds depending on whether or not the code and the letter mask were aligned. This didn't happen with the other algo. At optimal alignment, Svin's algo performed slightly better than the other one, but only for some letters. The algo I posted was the most stable across the board.
This was on a 1.4 GHz AMD; I'm sure timings will be different on other processors. Both algos seem to be pretty damn fast though.
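Since the timings depend on what data is fed in, a harness that times one input class at a time can help when repeating this. Below is a minimal C sketch, not the test described above: it swaps in the portable clock() for QueryPerformanceCounter, and time_check and its parameters are made up for illustration. Call overhead through the function pointer will dominate, so treat the numbers as relative at best.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* One of the checks from this thread, used here as the code under test. */
static int is_alpha_subsub(uint8_t cl) {
    cl &= 0xDF;
    cl = (uint8_t)(cl - 0x41);
    return cl < 0x1A;
}

/* Runs fn over buf `reps` times, stores the elapsed CPU seconds in *secs,
   and returns the number of hits so the work cannot be optimised away. */
static unsigned long time_check(int (*fn)(uint8_t), const uint8_t *buf,
                                size_t n, int reps, double *secs) {
    unsigned long hits = 0;
    clock_t t0 = clock();
    for (int r = 0; r < reps; ++r)
        for (size_t i = 0; i < n; ++i)
            hits += (unsigned long)fn(buf[i]);
    *secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    return hits;
}
```

Filling one buffer with only letters and another with only non-letters, then timing each separately, is one way to look for the alpha versus non-alpha difference.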
May I point out the design, or layout, of the encoding of the various characters
to their bit patterns.
Please note that space, our first non-control character, is 32 dec, 20 hex. Why?
This is so that all non-control characters would have bit 7 turned on.
The hardware designers made it that way for speed.
Therefore, if the real question is whether this byte is a "printable character", bit 7 must be turned on.
Bit 8 must also be turned off.
Now, if this is the case, can someone find a fast test for the logic of bit 8 off, bit 7 on?
Note: originally 127 was DEL.
Now, if this is the case can some find a fast test for the logic of bit 8 off bit 7 on?
sar al,7 ; OR shr al,7
jna Printable
I tried to find a way to use shl al,1 but could not in a couple minutes.
bitRAKE, I think sar al,7/jna doesn't quite work. Consider if both bits are 0. Then ZF=1 and the jna is satisfied (ZF=1 or CF=1) when it shouldn't be.
I think you need at least three instructions:
sar al,6
dec al ;al will only be 1 if top bit was 0 and next bit was 1
jz Printable
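Here is a C sketch of both attempts, modelling the flags by hand, with "bit 8" taken as the 80h bit and "bit 7" as the 40h bit (counting from 1, as in the question). The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sar al,7 / jna Printable". jna is taken when CF=1 or ZF=1.
   After sar al,7: CF = original bit 6 (the last bit shifted out),
   ZF = 1 when the sign bit was clear (the result is 0). */
static int jna_after_sar7(uint8_t al) {
    int cf = (al >> 6) & 1;
    int zf = (al & 0x80) == 0;
    return cf || zf;
}

/* Models "sar al,6 / dec al / jz Printable".
   The arithmetic shift is spelled out by hand because >> on a negative
   signed value is implementation-defined in C. */
static int sar6_dec_jz(uint8_t al) {
    uint8_t top2 = (uint8_t)(al >> 6);                           /* top two bits as 0..3 */
    uint8_t sar6 = (al & 0x80) ? (uint8_t)(0xFC | top2) : top2;  /* sign-extend */
    return (uint8_t)(sar6 - 1) == 0;                             /* ZF only when al was 1 */
}

/* The three-instruction version should accept exactly 80h clear, 40h set. */
static void verify_sar6(void) {
    for (int c = 0; c < 256; ++c)
        assert(sar6_dec_jz((uint8_t)c) == ((c & 0xC0) == 0x40));
}
```

With al = 0 both bits are clear, yet the two-instruction version still takes the jump, which is the case chorus describes.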
chorus, thanks, I was too quick to reply.
Just use the same principles that we used in IsAlpha()....
sub cl, 40h
sub cl, 40h
jc _isPrintable
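A small C model of the double sub, to show which byte values it accepts. Taken literally, "40h bit set, 80h bit clear" covers 40h..7Fh, so space through '?' (20h..3Fh) do not pass this particular test. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sub cl,40h / sub cl,40h / jc _isPrintable".
   Only the carry of the second sub is tested. */
static int bit7_on_bit8_off(uint8_t cl) {
    cl = (uint8_t)(cl - 0x40);   /* 40h..7Fh lands in 00h..3Fh */
    return cl < 0x40;            /* the second sub borrows exactly for 00h..3Fh */
}

/* Counts accepted byte values and checks they are exactly 40h..7Fh. */
static int count_accepted(void) {
    int n = 0;
    for (int c = 0; c < 256; ++c) {
        int hit = bit7_on_bit8_off((uint8_t)c);
        assert(hit == (c >= 0x40 && c <= 0x7F));
        n += hit;
    }
    return n;
}
```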
How about
bt eax, 7
jnc __bit8isoff
;bit 8 is on
*Might* be fast on newer processors. :)
/me, making myself look like an idiot. :grin:
One could write a whole book about checking bits from this discussion :)
Here is one silly test for checking the speed of the code.
REPEAT 256
sub cl, 40h
sub cl, 40h
ENDM
and
REPEAT 256
and cl, 0C0h
cmp cl, 40h
ENDM
CPU: AthlonXP. The first code took 502 clk, the second 256 clk ;)
Could somebody else check this? :)
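The two loop bodies test the same condition through different flags (carry for sub/sub, zero for and/cmp). A C sketch that checks this for every byte value follows; the function names are made up. One guess at the 2:1 timing, not something measured here: both subs write cl, so the 512 instructions form one dependency chain, while cmp does not write cl, which leaves a chain of only 256 ands.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sub cl,40h / sub cl,40h", returning the final carry flag. */
static int via_sub_sub(uint8_t cl) {
    cl = (uint8_t)(cl - 0x40);
    return cl < 0x40;
}

/* Models "and cl,0C0h / cmp cl,40h", returning the zero flag. */
static int via_and_cmp(uint8_t cl) {
    return (cl & 0xC0) == 0x40;
}

/* Both versions should agree on every byte value. */
static void verify_equivalent(void) {
    for (int c = 0; c < 256; ++c)
        assert(via_sub_sub((uint8_t)c) == via_and_cmp((uint8_t)c));
}
```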