I had an idea for a way to search strings that *might* be faster. I don't know for sure. I would like to know what you all think of this checksum method. I have tested the algo and it works as expected, but I haven't profiled it. I want to hear more ideas first.

Thanks. ;)

[size=12]InString2 proc uses ebx edi esi, lpSource:DWORD, lpPattern:DWORD[/size]

[size=9]
;-------------------------------------------------------------
; by Iblis
;
; InString2 searches for the first occurance of the string
; pointed to by lpPattern in the string pointed to by lpSource
; and returns the (1) based index into the character array.
;
; This version of InString does not use a starting position,
; so calculating the position is the responsibility of the
; user, although adding this functionality wouldn't be too
; difficult. The main purpose of this is to play with an idea
; I had about string searching. Instead of doing a straight
; byte scan, the routine creates a simple checksum for the
; pattern string and compares it with the source string. If
; the checksums match, one last check is made to see if the
; strings match byte for byte. The idea is that perhaps this
; might speed the search up a bit.
;
; Return value:
; On success, the (1) based index of the search string is
; returned. If there is no match, the routine returns (0).
; Returns -1 on error. Possible causes for error include:
; -If the pattern string is longer than the source.
; -If the pattern string has zero length.
;-------------------------------------------------------------[/size]
[size=12]
mov esi, lpSource
mov edi, lpPattern
xor eax, eax
xor edx, edx
@chksm: movzx ebx, byte ptr [esi]
inc esi
movzx ecx, byte ptr [edi]
inc edi
lea eax, [eax + ecx]
lea edx, [edx + ebx]
dec bl
sub bl, 0FFh
not ecx
inc ecx
ja @error
jnz @chksm
sub edx, ebx
lea ecx, [edi - 1]
sub ecx, lpPattern
jz @error
sub esi, ecx
dec esi
@findm: cmp eax, edx
je @match
@findn: movzx ebx, byte ptr [esi]
sub edx, ebx
movzx ebx, byte ptr [esi + ecx]
lea edx, [edx + ebx]
inc esi
or ebx, ebx
jnz @findm
xor eax, eax
jmp @done
@match: mov edi, lpPattern
mov ebx, ecx
repe cmpsb
sub esi, ebx
add esi, ecx
or ecx, ecx
mov ecx, ebx
jnz @findn
lea eax, [esi + 1]
sub eax, lpSource
jmp @done
@error: mov eax, -1
@done: ret

InString2 endp[/size]
Posted on 2002-08-13 13:22:51 by iblis
Here is test.
But I used a little bit different InString then in m32lib.
Yet it still just modification of Steve's original proc.
Posted on 2002-08-13 15:13:26 by The Svin
Svin, thanks! :)

I see that my algo is horribly slow. :(
I am not much of an optimizer.

My point in posting that was to get a discussion going about the method used. It doesn't behave like normal byte scanners. Basically I want to know if the checksum method has the potential to be fast.


Ahh well, I try. ;)
Posted on 2002-08-13 15:34:25 by iblis
the svin .
can your version vs my version?

but notice it uses strlen .
so maybe add to parameters to both algo . and remove the strlen for the test ?


bye

thanks

eko

EDIT:
iblis . how does your method work ?
i mean you created checksum for the pattern. but what about the all string ?
Posted on 2002-08-13 17:33:11 by eko
eko,
Yours in most cases ~ 0,5% faster.
Good work!
iblis, it's very good that you got ideas and try it.
It's much harder to create ideas and check them then
critisize others.

Some people recently tought me that critisize others ideas is more profitable - you gave nothing exept for OTHER people code and almost always win, 'cause it's not so easy to improve some hardly inspected algos and creators mostly lost in their attempts (to win at the end if they are tough enogh).

So try your ideas, give them to other - they beat them most of the time :)
But at the end you may get lucky and find something better then exist to give it out for free :)
Good luck to you.

I stopped doing it.
Posted on 2002-08-13 18:42:28 by The Svin
I forgot testing prog :)
Posted on 2002-08-13 18:44:00 by The Svin
iblis your idea can be good . maybe we can optimize the code alittle



chksm: movzx ebx, byte ptr [esi]
inc esi
movzx ecx, byte ptr [edi]
inc edi
lea eax, [eax + ecx]
lea edx, [edx + ebx]

dec bl
sub bl, 0FFh
not ecx
inc ecx
ja @error
jnz @chksm

few questions
1.why do you do lea edx, and not add edx,ebx?
2. doesnt sub bl,-1 is the same as inc bl , so dec bl sub bl,0ffh cancel each other

i think the order of the command make a diffrent
like you can do
dec bl
not ecx
sub bl,-1
inc ecx ; should be faster than the orginal order because there is no dependantcy(spelling! ...how to write this word ) this way

bye

eko

EDIT:
the svin . - can you post your version of instring
and why did you "stop doing it "?
Posted on 2002-08-13 19:14:11 by eko
Svin,
Don't give up! Most people here appreciate what you have to offer to the community. I for one enjoy reading your informational posts. ;)


Eko, thanks for your interest. ;)


1.why do you do lea edx, and not add edx,ebx?


I don't know... like I said before, I'm not a very experienced optimizer. For some reason I thought that LEA is generally faster than ADD. Feel free to let me know the truth on this.


2. doesnt sub bl,-1 is the same as inc bl , so dec bl sub bl,0ffh cancel each other


Yes, but my reasons for this was to set the flags for my two branches. (Sorry if it seems convoluted) I will comment the code here so it makes more sense:

i think the order of the command make a diffrent [...] should be faster


Interesting... why is this?

dependantcy(spelling! ...how to write this word )


"Dependency"



Thanks!
-ib
Posted on 2002-08-13 19:53:41 by iblis
New idea for InString



chksm: movzx ebx, byte ptr [esi]
inc esi
movzx ecx, byte ptr [edi]
inc edi
lea eax, [eax + ecx]
lea edx, [edx + ebx]

dec bl
sub bl, 0FFh
not ecx
inc ecx
ja @error
jnz @chksm



if its the end of the pattern . ecx=0 . but it still add ebx to edx
shouldnt the check be before that?

what about this one



xor ebx,ebx
xor ecx,ecx
@chksm: mov bl, byte ptr [esi]
inc esi
cmp bl,0 ;ebx ; or test bl,bl
mov cl, byte ptr [edi]
jz @error ; we can let the user do this checking ( the pattern smaller than the source)
inc edi
add eax,ecx
add edx,ebx
or cl,cl ;ecx
jnz @chksm


maybe we can do the chksum instead of adding byte every time . do it with dwords.

i have few ideas for optimize the all code.. but i must have some sleep .
bye

eko

p.s

now . when i understand the algo completly . it looks like good idea
the algo will be fast if most of the source be diffrent bytes . (i think)
Posted on 2002-08-13 20:58:51 by eko
New idea for InString
If anyone has the time, the last version of InString that I posted in the MASM32 forum seems to have got past the odd problems it had while maintaining the very short instruction count in its loop code.

What I was after with the design was a 4 instruction length main loop and a fast exit for the subloop that does the matching. It contains the modification that EKO designed which performs the matching loop in reverse which in turn reduced the overhead to set up the matching loop by a number of instructions.

Alex has a modification on the prologue and epilogue code that reduced the instruction count to set up the original loop code that I have not had time to integrate into the design and there is a good chance that if it is done properly that it will produce a cleaner cache which will in turn make the loop code run faster.

Now not withstanding the value of getting a byte scanner of this type faster, its worth keeping in mind that in many instances where the search pattern is over about 6 characters and the data being searched is more than a few K in length, a Boyer Moore algo will literally rip the titz of a byte scanner because they have a superior logic that does far fewer tests for matches than a byte scanner.

Regards,

hutch@movsd.com
Posted on 2002-08-13 22:14:31 by hutch--
New idea for InString
Hi ;)

Hutch,
By no means do I want to compete with nor replace masm.lib's InString(). I was simply offering up this alternative method for discussion. All the InString stuff that was posted months ago is more than satisfactory.


Eko and whomever else cares,

Here's the gist of how the algorithm works:



The idea behind this method is to use a checksum value for string comparison. To help visualize, pretend we start with two strings:

lpSource = 'All your base are belong to us'
lpPattrn = 'belong'


The first task is calculating the checksum for lpPattern, which is just a simple additive method. Anything more complex might slow things down too much. We want to take each char in the pattern and add it up.

[b][size=12]  lpPattrn = |'b' + 'e' + 'l' + 'o' + 'n' + 'g'|

| |
chk_pat = \[color=blue]62h + 65h + 6ch + 6fh + 6eh + 67h[/color]/ = [color=green]277h[/color][/size][/b]



Now that we have the checksum, we save it for later to do comparison matches with.
Since there is no sense in wasting loop cycles, we can use this to also get the checksum for the first n characters in lpSource.

[b][size=12]  lpSource = |'A' + 'l' + 'l' + ' ' + 'y' + 'o'|+ 'u' + 'r' + ' ' + 'b' + ' '...

| |
chk_src = \[color=blue]41h + 6ch + 6ch + 20h + 79h + 6fh[/color]/ = [color=red]221h[/color][/size][/b]




Now for the main loop. In the loop we compare the checksum for each section of lpSource with constant chk_pat. If the values are equal, that means there's a fairly good chance that we've got a match. Afterwards we send it to a short repe cmpsb-type routine for the final verdict.

[b][size=12]  1. lpSource = A l l   y o [color=grey]u r   b a s e   a r e   b...[/color]

chk_src =| [color=red]221h[/color] |
chk_pat = [color=green]277h[/color]

2. No match. So now adjust lpSource + 1, and create the new checksum by
subtracting the first byte, and adding the new byte.

3. lpSource = [color=grey]A[/color] l l y o u [color=grey]r b a s e a r e b...[/color]
chk_src = | [color=red]255h[/color] | = ( 221h - 'A' + 'u' )
chk_pat = [color=green]277h[/color][/size][/b]



We keep going until either chk_src = chk_pat, or we reach the end of lpSource (get a null char). If a checksum match is made, one last attempt to verify the match is made by a small repe cmpsb loop.



This is basically what my algorithm does. If you want I can better comment the code, but I think it's pretty easy to figure out from here. What I am interested in is not exactly optimization, but rather to discuss applying such a method to string searching and other such algorithms.

Thanks! :)
Posted on 2002-08-14 03:33:16 by iblis
New idea for InString
i understood how the algo works; ]
..

lets compare this method to byte check string ( like instring)

after you have calculated checksum of the pattern
you compare every

add next char . remove the first and do compare

in byte check . you need to compare every byte in the pattern .

so your method . in my opinion can be alot faster for big patterns.

maybe better checksum , not add ,
maybe xor , and then when we have match on the chechsum , the probability that we have found the correct place gets bigger ( i think)

bye

eko
Posted on 2002-08-14 09:34:55 by eko
New idea for InString
Yes, I know using add is not the best way to get a string signature. It's possible for the algo to think "string" = "grints" which is why it needs to verify the matches.

I played with different ideas on this, and in the end I went with a simple add because doing more complex checksum generation might take too much time. I tried xor instead of add, but all the checksums generated are 8 bit... not unique enough, imo.
Posted on 2002-08-14 09:45:23 by iblis
New idea for InString
iblis,

No problems, I am more than glad to see you working on original ideas. I simply pointed out that the version of InString had been fixed and tweaked by both EKO and myself to try and get its speed up a bit.

Regards,

hutch@movsd.com
Posted on 2002-08-14 10:04:47 by hutch--
New idea for InString
Maybe doing add and sub together

add edx,1st byte
sub edx,2nd byte
add edx,3rd byte
.
.
.
.
and so on
Posted on 2002-08-14 10:08:33 by eko