Hi guys,
  I need a little help (AGAIN :P).

  Is there a fast method for forcing a byte to lower case, regardless of whether it already is?
ie. a --> a
    A --> a

What I'm trying to do is write a case-insensitive string compare routine, and the biggest bottleneck as far as I can see is the two comparisons (.if al<=Z && al>=A). This also has to be performed on both sources!
  I'd appreciate any pointers!  :D
Posted on 2006-07-15 00:04:32 by asmrixstar
Unless you want to use an API call (like _stricmp), you've pretty much got it down (for ASCII at least).


.if al<=Z || al>=A
  add al, 20h ;convert to ASCII lower-case equivalent...
.endif
Posted on 2006-07-15 01:54:49 by SpooK

thx spook.
Posted on 2006-07-15 11:38:12 by asmrixstar
To SpooK,
shouldn't that be
.if al >= 'A' && al <= 'Z'
  add al, 20h  ;"or al,20h" will also do :) (I think "or" faster than "add" ???)
.endif
Posted on 2006-07-15 12:18:44 by shantanu_gadgil

It could be made more efficient; I just wanted to throw out a quick example.
Posted on 2006-07-15 19:16:42 by SpooK
If you care to examine the binary for an ASCII character value, you will find that bit 5 determines upper/lower case.

a = 01100001
A = 01000001

What you want to do is mask out bit 5 before performing your comparison of the byte values.
If you are clever, you can mask and compare 4 bytes at a time.
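
For illustration, here is a minimal C sketch of the four-bytes-at-a-time idea. The helper name eq4_ignore_bit5 is made up for this example and does not come from anyone's library. It XORs two 32-bit words and ignores bit 5 of every byte. It assumes both buffers hold at least four readable bytes, and it treats any two bytes that differ only in bit 5 as equal, whether they are letters or not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns non-zero when the first four bytes of a and b
   are equal once bit 5 (20h) of every byte is ignored. */
static int eq4_ignore_bit5(const char *a, const char *b)
{
    uint32_t x, y;

    /* memcpy sidesteps alignment and aliasing problems of a pointer cast. */
    memcpy(&x, a, 4);
    memcpy(&y, b, 4);

    /* XOR keeps only the bits that differ; the mask clears bit 5 in each
       of the four byte lanes, so a pure case difference vanishes. */
    return ((x ^ y) & 0xDFDFDFDFu) == 0;
}
```

With this, eq4_ignore_bit5("AbCd", "aBcD") is non-zero, while "abcd" against "abce" is zero because the last bytes differ in bit 0.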

Posted on 2006-07-16 03:54:44 by Homer
SpooK wrote: "It could be made more efficient; I just wanted to throw out a quick example."

Umm... just want to verify the comparison you have done:
.if al<=Z || al>=A

Isn't this just wrong!!! (Assuming that when you say Z you mean 'Z'; I'll give you that.)

But a value like 20h (space) will get through this OR condition, won't it? (20h is less than 'Z', so the first part is TRUE and the other part of the OR doesn't matter: 1 OR 0 is 1, isn't it?)

Or even a value like ASCII 96 (one less than 'a'), as the entire OR condition would succeed!!!
(96 is NOT less than 'Z', but 96 is greater than 'A', so again 0 OR 1 is 1.)
:shock: :shock: :shock:
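
To make the point concrete, here is a small C sketch of both conditions (C only for illustration; the function names are made up). Since 'A' is below 'Z', every byte is either <= 'Z' or >= 'A', so the OR form is true for all values, while the AND form accepts only 'A'..'Z'.

```c
/* The condition as posted, with ||: true for every possible byte value. */
static int is_upper_or(unsigned char c)
{
    return c <= 'Z' || c >= 'A';
}

/* The corrected range check, with &&: true only for 'A'..'Z'. */
static int is_upper_and(unsigned char c)
{
    return c >= 'A' && c <= 'Z';
}

/* Lower-casing with the broken test adds 20h to everything. */
static unsigned char to_lower_or(unsigned char c)
{
    return is_upper_or(c) ? (unsigned char)(c + 0x20) : c;
}

/* Lower-casing with the corrected test only touches upper-case letters. */
static unsigned char to_lower_and(unsigned char c)
{
    return is_upper_and(c) ? (unsigned char)(c + 0x20) : c;
}
```

With the OR version a space (20h) comes out as 40h ('@'); with the AND version it stays a space.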
Posted on 2006-07-16 04:39:17 by shantanu_gadgil
Sorry, cut n' pasted asmrixstar's example and filled in the blank for him. Probably should have checked it first.

At any rate, listen to Homer, his answer was more thorough.
Posted on 2006-07-16 04:59:00 by SpooK
copied directly from my library :)
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

StrCmpi proc pStr1:DWORD,pStr2:DWORD
push edi
push esi
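or al,-1 ; al != 0 so the first "test al,al" does not exit
mov edi,[8+esp+1*4] ; str1 (skip the 8 bytes of pushed registers and the return address)
mov esi,[8+esp+2*4] ; str2
@@:
test al,al ; did the last matching pair hit the terminating zero?
jz @F ; yes: strings are equal, al = 0
mov al,[esi] ; byte from str2 (operand was missing in the post; [esi] is assumed)
mov dl,[edi] ; byte from str1 (operand was missing in the post; [edi] is assumed)
inc esi
inc edi
cmp dl,al
je @B ; identical bytes, no folding needed
sub al,'A' ; fold al: 'A'..'Z' becomes 0..25
cmp al,'Z'-'A'+1 ; carry set only for an upper-case letter
sbb cl,cl ; cl = 0FFh if upper case, else 0
and cl,'a'-'A' ; cl = 20h if upper case, else 0
add al,cl ; upper case becomes lower case
add al,'A' ; undo the subtraction
sub dl,'A' ; fold dl the same way
cmp dl,'Z'-'A'+1
sbb cl,cl
and cl,'a'-'A'
add dl,cl
add dl,'A'
cmp dl,al
je @B ; equal after folding
sbb al,al ; carry from the cmp: al = -1 if dl is below al, else 0
sbb al,-1 ; al = -1 if below, +1 if above
@@:
movsx eax,al ; sign-extend the result into eax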
pop esi
pop edi
ret 2*4
StrCmpi endp
OPTION PROLOGUE:PROLOGUEDEF
OPTION EPILOGUE:EPILOGUEDEF
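
For readers who don't follow the sbb trick, here is a rough C rendering of the same idea. It is a sketch and not a line-by-line translation; the names fold_lower and str_cmp_i are invented for this example. Unlike the assembly, it folds every byte instead of only the ones that differ.

```c
/* Branch-free lower-casing, mirroring the sub / cmp / sbb / and / add run. */
static unsigned char fold_lower(unsigned char c)
{
    unsigned char off  = (unsigned char)(c - 'A');      /* sub al,'A'                 */
    unsigned char mask = (unsigned char)-(off < 26);    /* cmp + sbb: 0xFF for 'A'..'Z' */
    return (unsigned char)(c + (mask & ('a' - 'A')));   /* and cl,20h / add al,cl     */
}

/* Case-insensitive compare for ASCII strings.
   Returns -1, 0 or 1, like the routine above leaves in eax. */
static int str_cmp_i(const char *s1, const char *s2)
{
    for (;;) {
        unsigned char a = fold_lower((unsigned char)*s1++);
        unsigned char b = fold_lower((unsigned char)*s2++);

        if (a != b)
            return a < b ? -1 : 1;   /* first difference decides the order */
        if (a == 0)
            return 0;                /* both strings ended together */
    }
}
```

The range check is a single unsigned comparison: after subtracting 'A', only the 26 upper-case letters land in 0..25, so non-letters such as a space or '[' pass through untouched.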
Posted on 2006-07-16 07:04:41 by drizz
If you consider speed, I recommend you take a look at the Boyer-Moore string searching algorithm:
http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
Posted on 2006-07-16 07:32:54 by Dite

Dite wrote: "If you consider speed, I recommend you take a look at the Boyer-Moore string searching algorithm"

It depends on string lengths whether BM will be faster, but if you have a lot of data to search through... sure. This thread was about string comparing though, not string searching.
Posted on 2006-07-16 08:48:19 by f0dder
Wow, big response.

And the winner for the most helpful .... HOMER :P

Yeah, that's exactly what I was looking for, thx.

Thx to all... :)
Posted on 2006-07-17 18:10:52 by asmrixstar
Just remember that Homer's method will only work for English text.
Posted on 2006-07-18 06:22:56 by f0dder
Actually it won't work. It will work only with (1) English (2) letters. Both of these conditions must be met. Such a method wouldn't find "f0dder" if you wrote "F0dder". So you must first check whether the character in question is indeed a letter. Otherwise you would wipe out the spaces (0x20 AND 0xDF = 0) and as a result find nothing. English string comparisons can be neatly written using MMX (because the comparison instructions zero out non-matching bytes). Without MMX you must use this "if al<=Z || al>=A" thing (and it still only allows you to find English strings).
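
A short C sketch of what blind masking does to non-letters, next to a version that checks the range first (function names are made up for the example, and C is used only for illustration):

```c
/* Blind fold: clears bit 5 no matter what the byte is. */
static unsigned char blind_upper(unsigned char c)
{
    return (unsigned char)(c & 0xDF);
}

/* Checked fold: clears bit 5 only for 'a'..'z'. */
static unsigned char checked_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c & 0xDF) : c;
}
```

With the blind version a space (0x20) turns into 0, which looks like a string terminator, and '{' (0x7B) becomes '[' (0x5B), so those two would compare equal. The checked version leaves both alone.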


Dite wrote: "If you consider speed, I recommend you take a look at Boyer-Moore"

If he considers speed, then I'd rather recommend "Turbo Boyer-Moore" (~2/3 of the original execution time), but that would require him to 'prepare' both strings, as this algo is case-sensitive, wouldn't it?
Posted on 2006-07-18 06:46:03 by ti_mo_n
Umm... what _are_ you saying? I was pointing out the fact that the "OR condition" (||) is not correct and should be an "AND condition" (&&), that's all!!! :)

SpooK has even acknowledged the same...so what are you saying? :confused: :(

Regards,
Shantanu
Posted on 2006-07-18 07:17:02 by shantanu_gadgil
I was referring to Homer's method of simply 'anding' the bytes and comparing them. Your method is correct, from what I see, and it's exactly what I was suggesting ^^'
Posted on 2006-07-18 07:29:56 by ti_mo_n