Hi,

We needed a case-insensitive string search algo.
We couldn't find a fast one, so this is what we came up with.

It will recognize 'bAZiK' as well as 'BAZIK' or 'Bazik', but unfortunately not 'Penguin'.


Comments and optimizations are very welcome.

;

;Destroys EDX
;Case-NON-sensitive searcher
;On entry: ECX=length to search; ESI=source to search (ptr)
;EDI=word to search for (db 'word'), EDX=length of this word (e.g. 4)
;The word at EDI must be lowercase: its upper-case twin is derived with sub 20h
;

_start:
mov bl,byte ptr[edi] ;BL holds the lowercase letter, e.g. 'i'
mov bh,bl
sub bh,20h ;BH holds the uppercase letter, e.g. 'I'
_start_2:
lodsb
;AL holds the source byte just read
cmp al,bl
jz _match
cmp al,bh
jz _match
loop _start_2
jmp _match_ok ;The caller has to check whether cx=0
_match:
push esi ;just in case
push ebx ;same
push edi
push edx
_match_2:
dec edx ;one letter fewer
jz _match_ok
inc edi
mov bl,byte ptr[edi] ;DX carries the number of letters, 6 in the invoke
mov bh,bl
sub bh,20h
lodsb
cmp al,bl
jz _match_2
cmp al,bh
jz _match_2
pop edx
pop edi
pop ebx
pop esi
jmp _start
_match_ok:
pop edx
sub ecx,edx ;decrease the counter
pop edi
pop ebx
pop eax
;edi has advanced
ret
;On exit ESI keeps its value so the search can continue from there;
;EDX and EDI have to be set again (in case you want to search for another word)


To test how fast it is (and maybe include your own):
Posted on 2002-10-31 12:37:55 by slop
:grin:
Posted on 2002-10-31 12:46:41 by bazik
Make an upper-case copy of the pattern just once, at the start.
Then, while comparing, upper-case only the source byte CURRENTLY being compared.
That logic will make the algo several times faster.
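
The suggestion above can be sketched in C to make the logic easy to check before hand-coding it in assembly. This is only a minimal model under stated assumptions: `find_nocase` and `up` are hypothetical names, the pattern copy is heap-allocated for simplicity, and the scan is a plain byte-by-byte one rather than anything tuned. Note that the source text is only read; each source byte is upper-cased in a local value, never in memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Upper-case one byte; only 'a'..'z' are changed (ASCII assumed). */
static unsigned char up(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 0x20) : c;
}

/* Hypothetical helper: returns a pointer to the first case-insensitive
   match of pat inside src, or NULL if there is none.
   src is never written to. */
const char *find_nocase(const char *src, size_t src_len,
                        const char *pat, size_t pat_len)
{
    if (pat_len == 0 || pat_len > src_len)
        return NULL;

    /* Step 1: upper-case copy of the pattern, made exactly once. */
    unsigned char *upat = malloc(pat_len);
    if (upat == NULL)
        return NULL;
    for (size_t i = 0; i < pat_len; i++)
        upat[i] = up((unsigned char)pat[i]);

    /* Step 2: upper-case only the source byte being compared. */
    const char *hit = NULL;
    for (size_t s = 0; s + pat_len <= src_len && hit == NULL; s++) {
        size_t i = 0;
        while (i < pat_len && up((unsigned char)src[s + i]) == upat[i])
            i++;
        if (i == pat_len)
            hit = src + s;
    }

    free(upat);
    return hit;
}
```

With this model, the pattern may be given in any case, and each comparison costs one conversion plus one compare instead of two compares against separately derived bytes.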
Posted on 2002-10-31 19:04:27 by The Svin
Sloppy, be careful when you're mapping uppercase/lowercase. You might experience a bug when using punctuation. Ex. a "{" (ascii 123) will change to a "[" (ascii 91) if you blindly sub 32 from the ascii code. Maybe this isn't of concern, depending on your needs, but the possibility is there.
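
A small C sketch of the difference, assuming plain ASCII; `blind_upper` and `safe_upper` are hypothetical names used only for illustration. The blind version subtracts 32 from whatever it is given, while the checked version leaves everything outside 'a'..'z' alone.

```c
/* Blind mapping: subtracts 32 from any byte, letter or not. */
unsigned char blind_upper(unsigned char c)
{
    return (unsigned char)(c - 32);
}

/* Range-checked mapping: only 'a'..'z' (97..122) are converted. */
unsigned char safe_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 32) : c;
}
```

Here `blind_upper('{')` gives `'['` (123 - 32 = 91), while `safe_upper('{')` returns `'{'` unchanged. The cost of the check is two extra compares per byte.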

--Chorus
Posted on 2002-10-31 20:13:14 by chorus
chorus,

you're right. This is used on a list of chars and colons, so it works. To handle punctuation, more checks would be needed.

The Svin,

but then I'd have to make the whole 'target' UPPERCASE, and in this case I cannot change it; I only have read access to the text.
Thanx for the comments: if you look at the test example, you'll notice I am using your MACROS to profile it.

Tnx both.
Posted on 2002-11-01 06:56:44 by slop
Sorry, there's a tiny bug.
I couldn't get back here sooner. Replace:
	sub bh,20h  	    ;BH holds the uppercase letter, e.g. 'I'

_start_2:
lodsb
;AL holds the source byte just read
cmp al,bl
jz _match
cmp al,bh
jz _match
loop _start_2
jmp _match_ok <----------replace this instruction with a ret


Now I think it's Ok.

If you test it, don't forget to replace the source as well. The bug doesn't show up there because it ends somewhere else, but in real life it would have shown up quickly.

TIA,
sloppy
Posted on 2002-11-05 07:14:25 by slop