Assuming that I only use UPPERCASE strings. What would be the fastest algoritm to search a list of lets say 5000 upercased strings for an exact match??
any links would be appreciated.
any links would be appreciated.
boyer moore??? what about a simple cmpsb...
.DATA
str1 db "tEST1",0
str2 db "TEST1",0
cap db "example",0
msg db "not equal",0
.CODE
start: mov eax, offset msg
mov edi, offset str1
mov esi, offset str2
mov ecx, 5
repz cmpsb
jnz @F
add eax, 4
@@: invoke MessageBox,0,eax,addr cap,0
If your goal is speed then no cmpsb - use BM.
First let me say that BM is flat out fast fast fast. But it's a tool that's best in a niche. That niche is for an exact string searchs with a single pattern on a large-ish file.
Problem here is that if you're looking for many patterns against big text you can sometimes do a bit better using a FSM variant of Knuth-Morris-Pratt rather than passing over the file many times. Yes KMP is slower if your a doing single pattern search but in general the benefits of the extra BM skips will diminish as you add more patterns to the mix. A properly designed KMP-FSM will only have to pass over the file once & you can make it so that you only look at any text char in the file being searched no more than once.
Damn! someone at work walked off with the library book on string searching algos! I'll have to get back to you on the details when I can look at my own books. I believe(?) that you can do this with a BM-FSM but that there are no benefits for the added overhead. But I'd better look it up & be sure here.
Problem here is that if you're looking for many patterns against big text you can sometimes do a bit better using a FSM variant of Knuth-Morris-Pratt rather than passing over the file many times. Yes KMP is slower if your a doing single pattern search but in general the benefits of the extra BM skips will diminish as you add more patterns to the mix. A properly designed KMP-FSM will only have to pass over the file once & you can make it so that you only look at any text char in the file being searched no more than once.
Damn! someone at work walked off with the library book on string searching algos! I'll have to get back to you on the details when I can look at my own books. I believe(?) that you can do this with a BM-FSM but that there are no benefits for the added overhead. But I'd better look it up & be sure here.
hm i never used one of those algoritms... i looked
at the source it seemed very big to me so i can't
anylyse or understand it in 5 minutes but ~why~ is
it faster? i don't know if you mean true pattern matching
or simple fixed string matching? whats the technique
behind this method?
at the source it seemed very big to me so i can't
anylyse or understand it in 5 minutes but ~why~ is
it faster? i don't know if you mean true pattern matching
or simple fixed string matching? whats the technique
behind this method?
oops sorry i did'nt saw hutch's explanation... will
read it now and hope i'm able to understand it :)
read it now and hope i'm able to understand it :)
use bm :)
-but only if the list of strings is linked togehter in
something like a buffer or file... if not you know
that the base to search will always be 0 so a
simple loop would be better (btw yeah cmpsb
is slower than cmp...)
-but only if the list of strings is linked togehter in
something like a buffer or file... if not you know
that the base to search will always be 0 so a
simple loop would be better (btw yeah cmpsb
is slower than cmp...)