umberg6007, Svin is correct the first bit of code in that article is just step one in the tutorial - it's not a partial KMP. Your algo and the first in the article are still brute-force algos. It's a good first step - nice conversion. Now, work your way through the rest of the tutorial.

My favorite link: http://www-igm.univ-mlv.fr/~lecroq/string/node8.html
Examples, easy to read C, Java applets, nice explainations, etc...
Posted on 2002-02-28 23:11:38 by bitRAKE
Just curious BitRack,

how do you do the search when calling your procedure?

I haven't look at it that long, but i was hoping for a quick calling:

ie.

invoke KPM, ..., ..., ..., ...

example
Posted on 2002-02-28 23:17:05 by Sliver
It's currently broken, but it will be:
invoke KMP, ADDR FindMe, LenFindMe, ADDR SearchMe, LenSearchMe
Your calling me bitRACK again. :)
How do you like that C code conversion?
Posted on 2002-02-28 23:22:22 by bitRAKE
force of habit :( sorry

Ok now I'll stop for good...

I swear

Sliver

ps. thanks for the quick respond and the link
Posted on 2002-02-28 23:25:13 by Sliver
Sliver, I was thinking of your need for the algo, when I did the conversion. The string length of the FindMe string can be easily calculated within the preprocessing stage, if we know it's a zero terminating string. I left it open to be a binary search and not string specific. Just thought I mention this area of optimization.
Posted on 2002-02-28 23:30:39 by bitRAKE
oops, my mistake :) he! he! he! It's really confusing between mine and the original KMP, there's only a small difference. Ok! Ok! I'll code up one and I'll compare it head to head, with my current algo. Sorry!. Hmm! maybe I'll call it UMB :grin:
Posted on 2002-03-01 00:07:41 by stryker
Since there is not much of a difference between the naive algorithm and the KMP algorithm I added another feature. I'm reading a lot of websites and this is how I understood the algorithm. If you have a string abbabbabaaaaaaababaaa and your looking for a pattern ababaa this is how this algorithm works:



1st:
abbabbabaaaaaaababaaa
aba

2nd:
abbabbabaaaaaaababaaa
aba

3rd:
abbabbabaaaaaaababaaa
abab

4th:
abbabbabaaaaaaababaaa
ab

5th:
abbabbabaaaaaaababaaa
ab

6th:
abbabbabaaaaaaababaaa
ab

7th:
abbabbabaaaaaaababaaa
ab

8th:
abbabbabaaaaaaababaaa
ab

9th:
abbabbabaaaaaaababaaa
ababaa



But hey, I'm not the expert here, I'm just implementing one, So if this is still not KMP, tell me! This is as far as I got.



UKPM PROC USES ebx esi edi srcStr:DWORD, fndStr:DWORD

emms
mov esi, srcStr
xor eax, eax
xor ecx, ecx
xor edx, edx

@@KPMRESETEBX:

xor ebx, ebx

@@KPMRESCAN:

mov edi, fndStr
pxor MM0, MM0

@@KPMSCAN:

inc ecx
mov dl, BYTE PTR [edi]
or edx, edx
jz @@KPMSTOPSCAN
mov al, BYTE PTR [esi+ecx-1]
or eax, eax
jz @@KPMSTOPSCAN
cmp al, dl
jne @@KPMRESETSCAN


mov dl, BYTE PTR [fndStr]
cmp al, dl
jne @@KPMCONTINSCAN
movd MM0, ecx
@@KPMCONTINSCAN:


inc edi
or ebx, ebx
jnz @@KPMSCAN
mov ebx, ecx
jmp @@KPMSCAN

@@KPMRESETSCAN:

or ebx, ebx
jz @@KPMRESCAN
xor eax, eax
movd eax, MM0
or eax, eax
jnz @@KPMCONTINWHEREWELEFTOFF
dec ecx
jmp @@KPMRESETEBX

@@KPMCONTINWHEREWELEFTOFF:

movd ecx, MM0
jmp @@KPMRESETEBX

@@KPMSTOPSCAN:

or edx, edx
jnz @@KPMEXIT
or eax, eax
jz @@KPMEXIT
mov eax, ebx
ret

@@KPMEXIT:

xor eax, eax
ret

UKPM ENDP


BTW, the magic register is ECX. If you look closely I added some code on the inner loop, this is where I know how many bytes to shift. Just copy this procedure to the program I attached on the previous post(kmp.zip)
Posted on 2002-03-01 11:57:19 by stryker
Nope, that isn't KMP. KMP uses a FSM (finite state machine to determine the skip distance). You need to calculate this array before the search. The array should be the size of the string your looking for, not 256 - like in my code above. The elements of the array can be bytes, words, or dwords - I doubt a skip value of more than 254 would be needed in most cases.
Posted on 2002-03-01 12:05:35 by bitRAKE
Ok I'll code up more, but this one is an improvement from the first one. It's much faster and doesn't reset to the first comparison. the MM0 determines our ECX. Thanks!!!
Posted on 2002-03-01 12:08:59 by stryker
Since there is not much of a difference between the naive algorithm and the KMP algorithm I added another feature

:)
umberg6007:
It is just a clinick case :)
Core of KMP is self-analyzing preprocessing routing,
making shift table wich in processing tells the algo about source string based on previous analyse of pattern (since if pattern fit some part of source and we know relation in part of pattern to
rest part pattern through preprocessing we can assume from wich part of pattern we can start further comparing)

if you haven't get it right you still know nothing about KMP.
It is heart on the algo.
umberg6007, don't take it too light, for many good programmers
understanding of KMP seemed sometimes just impossible.

And there is a BIG DIFFERENCE between your algo and KMP.
If it seems to you almost the same it's just sign that you did'n get it yet.
Posted on 2002-03-01 12:20:13 by The Svin
Originally posted by The Svin
Since I have all that Mr. D.Knuth wrote on my bookshelf.


Hi, Svin

do you have the russian translation of Knuth's "Programming Art" ? Is it any good? And does it have a real practical use or just academia oriented? I consider ordering those 3 volumes, but just wondering is it worth to spend almost 50 U$ for all of them, since I preordered some other expensive book about algos "algoritmy; postrojenija i analiz". What you would suggest from practical point of view, thanks.
Posted on 2002-03-01 12:39:19 by ramzez
Ok, I know I'm wrong and KMP uses FSM but before I continue I just want to know if I successfully did the inner skipping iterations. For me it did but I want everybody's opinon. Thanks!!!
Posted on 2002-03-01 13:45:52 by stryker
umberg6007, looks good, but I think you'll have to recode most of it for KMP. KMP is nice because it demostrates a technique that can be use on other problems - much harder problems. :)
Posted on 2002-03-01 13:49:58 by bitRAKE
I'm on my way, I'll be back with the code, I hope this time I'll get it right. :)
Posted on 2002-03-01 13:51:04 by stryker



Hi, Svin

do you have the russian translation of Knuth's "Programming Art" ? Is it any good? And does it have a real practical use or just academia oriented? I consider ordering those 3 volumes, but just wondering is it worth to spend almost 50 U$ for all of them, since I preordered some other expensive book about algos "algoritmy; postrojenija i analiz". What you would suggest from practical point of view, thanks.


Yes, I have both old and new AOP.
The book you metioned is good textbook for beginners.
Of course, for start it may be good read it in prior to AOP.
But they can not be directly compared.
And one thing more I'd recommend refresh your math knowledge
before reading AOP, from my personal point of view AOP is not good as math text book -
though it has a lot of math and very interesting ideas it's hard to
learn 'cause Donald thread readers as already good in math despite his words about simplisity of math topics involved,
IMHO he just underestimate avarage math level of now days programmers :)
The other his book wich can be highly recommended is "Concrete Mathematics" with R.L.Graham and Oern Patashnik.

BTW what is especially interesting for this board AOP is probably the only book of such level where algos done in ASM.
THough it is not x86 asm - Donald invented a new machine and new asm for his book :)
Posted on 2002-03-01 14:19:07 by The Svin
Concrete Mathematics by graham, knuth & patashnik? I tell you, it'll fry your brains :) like it did to mine but don't worry, practicing a lot of problems will help you. He uses MIX assembly language :) You can get it in bundle with his AOCP books at www.amazon.com Offtopic: which volume and page is the KMP algorithm, I tried looking for it but can't find it :(
Posted on 2002-03-01 14:30:16 by stryker
Must admit, my 3 volume set by Knuth is sitting on my bookshelf gathering dust as it rarely addresses what I am after.

I would not be worried if your algo design does not conform to the KPM design, just call it something else but DO keep up your own original ideas and code what you are after.

Its probably worth playing with different types of test data when you get it up to the position where you want to start benchmarking the algo. Be very careful about optimising it for unusual cases at the expense of the type of data you want it to work on.

Regards,

hutch@movsd.com
Posted on 2002-03-01 18:32:51 by hutch--
Since the above is a naive algorithm but has a skipping inner iteration feature, I'll just name it something else but I really want to know how KMP really works, after that, I'll then go to BM, then maybe quick search algorithm, I'm still learning the concept of FSM. The problem I have was I only saw the graphical example but failed to grasp the concept, I'm learning! I'm learning! Just you wait when I get it into working condition :) . Also there is already a C code that is available on this thread, but I don't want to just convert it to ASM but to fully understand how KMP works and implement it through my own understanding or ideas. :) Gimme a few days to finish it. He! He! He!
Posted on 2002-03-01 18:43:32 by stryker
but I really want to know how KMP really works


Now you are talking!
Good luck.
BM will be easier after KMP.
KMP selfanalyze is one of most difficult algo siquence for humane
brain, if you learn to reproduce it in your mind with full understanding many other tasks will be just a piece of cake.
Posted on 2002-03-01 20:23:28 by The Svin