I had a look today at InString from MASM32 7.
And thought that it might be used as example for those
beginners, who just start write their own procs, of how to do very basic
things right way.
The following is code of InString from MASM32 7, with enumurated lines
for easier refering in discussion, and a little bit changed by me version
of it. It will be changed further when I explain how and why it was changed.
The reason I post it is not aim to offer different proc - I didn't change algo
at all and even left register assingmentss, the only reason is to use it as
example to discuss basic and general things wich have place in many
parts of other code. How to implement things in asm when you already know algo.
All changes that I don't discuss in this post you can compare yourself
until I have it done. With next posts I'll cover them all in bytes and ticks
and post new changes.
Changes give advantages of size and speed and yet there are nothing
special, I used very basic things anybody could learn and use easily.

InString - original proc
InString2 - proc where some part of InString changed without changing
algo and logic.



InString proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD

; ------------------------------------------------------------------
; InString searches for a substring in a larger string and if it is
; found, it returns its position in eax.
;
; It uses a one (1) based character index (1st character is 1,
; 2nd is 2 etc...) for both the "StartPos" parameter and the returned
; character position.
;
; Return Values.
; If the function succeeds, it returns the 1 based index of the start
; of the substring.
; 0 = no match found
; -1 = substring same length or longer than main string
; -2 = "StartPos" parameter out of range (less than 1 or longer than
; main string)
; ------------------------------------------------------------------

LOCAL sLen:DWORD
LOCAL pLen:DWORD
1 push ebx
2 push esi
3 push edi
4 invoke StrLen,lpSource
5 mov sLen, eax ; source length
6 invoke StrLen,lpPattern
7 mov pLen, eax ; pattern length
8 cmp startpos, 1
9 jge @F
10 mov eax, -2
11 jmp isOut ; exit if startpos not 1 or greater
@@:
12 dec startpos ; correct from 1 to 0 based index
13 cmp eax, sLen
14 jl @F
15 mov eax, -1
16 jmp isOut ; exit if pattern longer than source
@@:
17 sub sLen, eax ; don't read past string end
18 inc sLen
19 mov ebx, sLen
20 cmp ebx, startpos
21 jg @F
22 mov eax, -2
23 jmp isOut ; exit if startpos is past end
@@:
24 mov esi, lpSource
25 mov edi, lpPattern
26 mov al, [edi] ; get 1st char in pattern
27 xor ecx, ecx
28 add ecx, startpos ; add starting offset
29 jmp Loop_Start
; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Pre_Loop:
30 pop ecx ; restore ECX
31 inc ecx ; start on next byte
Loop_Start:
32 cmp al, [esi+ecx]
33 je Pre_Sub
34 inc ecx
35 cmp ecx, ebx
36 jne Loop_Start
37 jmp No_Match
Pre_Sub:
38 push ecx ; preserve ECX
39 mov edx, pLen
40 mov edi, lpPattern
Sub_Loop:
41 mov ah, [esi+ecx]
42 cmp ah, [edi]
43 jne Pre_Loop ; jump back on mismatch
44 inc ecx
45 inc edi
46 dec edx
47 jnz Sub_Loop ; fall through if match
48 pop ecx ; restore ECX
49 mov eax, ecx ; match
50 inc eax
51 jmp isOut
; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
No_Match:
52 mov eax, 0
isOut:
53 pop edi
54 pop esi
55 pop ebx
56 ret
InString endp

;========================================
InString2 proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD
LOCAL pLen:DWORD

1 push ebx
2 push esi
3 push edi
4 mov esi, lpSource ;edi = lpSource
5 mov edi, lpPattern ;esi = lpPattern
6 invoke StrLen,esi
7 mov ebx,eax ;ebx = sLen
8 invoke StrLen,edi
9 mov pLen, eax
10 mov ecx, startpos
11 dec ecx
12 js @errm2
13 sub ebx,eax ;eax=pLen ebx=sLen if pLen > sLen then CF=1
14 inc ebx
15 jc @errm1
16 cmp ebx,ecx ;cmp startof last part +1 with start pos
17 jbe @errm2 ;lenth from start pos of source is less then pattern lenth
18 mov al,[edi]
19 jmp Loop_Start
;@@@@@@@@@@@@@
Pre_Loop:
20 pop ecx
21 inc ecx
Loop_Start:
22 cmp al,[esi+ecx]
23 je Pre_Sub
24 inc ecx
25 cmp ecx,ebx
26 jne Loop_Start
27 xor eax,eax
28 jmp isOut
Pre_Sub:
29 push ecx
30 mov edx,pLen
31 mov edi,lpPattern
Sub_Loop:
32 mov ah,[esi+ecx]
33 cmp ah,[edi]
34 jne Pre_Loop
35 inc ecx
36 inc edi
37 dec edx
38 jnz Sub_Loop

39 pop eax
40 inc eax
isOut:
41 pop edi
42 pop esi
43 pop ebx

44 ret
@errm2:
45 mov eax,-2
46 jmp isOut
@errm1:
47 sbb eax,eax
48 jmp isOut

InString2 endp


Note1:
About error codes and return value.
I didn't change it, but anyway want to say a couple things.
Error codes choosen in the proc are 0,-1,-2.
And if function succeded return value is based1(!) index.

Think of if all error codes were negative (-1,-2,-3)
And return value were zerobased.
Advantages:
1. The original proc wouldn't need the two lines:
12 dec startpos ; correct from 1 to 0 based index
50 inc eax

2. If prog needs just check : "if any error" it could be done at ones
test eax,eax
js error

3.ZeroBased value is much easier and shorter to use
for exaple if you passed pointer in edi to main string then after
return

is pointer to found string.
with 1based indexing you have to address to it as
wich is one byte longer for nothing.
the same with addition, loading etc. you either need longer
instruction with based1 or even additional instruction.

Note2: Strange code.

1.
52 mov eax, 0 ;5 bytes instruction
----------
can be replaced with wellknown
xor eax,eax ;2 bytes
----------
2.
27 xor ecx, ecx
28 add ecx, startpos ; add starting offset
------------
why not just
mov ecx,startpos ?!
3.
48 pop ecx ; restore ECX
49 mov eax, ecx ; match
50 inc eax
51 jmp isOut
-----------------
isOut - is exit of proc and ecx doesn't need anything anymore
for sure it can be replaced with
pop eax
inc eax
jmp isOut.
;-----------------------
to be continued...
Posted on 2002-04-02 16:49:04 by The Svin
Note3: Memory and register usage.
In prologue of the InString there are several places where
some values are being manipulated in memory and after that
are loaded in registers.
A very simple question is:
If we load those values in the registers anyway why don't
we first thing load it to registers and only after that use it.
Look for example at the code in original proc:



4 invoke StrLen,lpSource
5 mov sLen, eax ; source length
6 invoke StrLen,lpPattern
7 mov pLen, eax ; pattern length

.....

24 mov esi, lpSource
25 mov edi, lpPattern


Just put lines 24 25 before 4 5 6 7 and you get



mov esi, lpSource
mov edi, lpPattern
invoke StrLen,esi
mov sLen, eax ; source length
invoke StrLen,edi

push esi is 1 byte and 1 clock instruction
push dword ptr (push lpSource) is 3 bytes and 2 clocks
so with simple cut and paste we get 4 bytes shorter and 2 clocks faster code.

the same you can see here in the original code:



...
8 cmp startpos, 1
...
12 dec startpos ; correct from 1 to 0 based index
...
20 cmp ebx, startpos
... and finally
27 xor ecx, ecx
28 add ecx, startpos ; add starting offset

why don't just mov ecx,startpos and do everything with ecx
each instruction using ecx instead of memory will be faster and shorter.

the same with sLen (variable we have no needs at all):


5 mov sLen, eax ; source length
....
13 cmp eax, sLen
...
17 sub sLen, eax ; don't read past string end
18 inc sLen
... and finally :)
19 mov ebx, sLen

why don't just make line in line 5 mov ebx,eax
and all operations do with ebx
each such use of ebx instead of sLen will be shorter
and faster not saying that one insruction is removed
complitly (#19)

to be continued....
Posted on 2002-04-02 17:20:31 by The Svin
Note 4: Control blocks.
It is a little bit more complex though more interesting and more
usefull.
For a starter let's discuss a simple thing:
size and number of instruction
Have a look on code wich handles checking for errors:


8 cmp startpos, 1
9 jge @F
10 mov eax, -2
11 jmp isOut ; exit if startpos not 1 or greater
@@:
......
13 cmp eax, sLen
14 jl @F
15 mov eax, -1
16 jmp isOut ; exit if pattern longer than source
@@:
........
20 cmp ebx, startpos
21 jg @F
22 mov eax, -2
23 jmp isOut ; exit if startpos is past end
@@:

Logic is
if OK jump over "code for error"
in "code for error" - put error code in eax and jmp out
To do it (apart from comparing itself) code need
3 jcc for OK condition
3 movs for error code
3 jmps out.
Note, please, in to cases we put the same error code -2
What if we change logic to:
If NOT OK jmp to place where we put error code and go out
Then we need just 2 such places for -1 and - 2
thus we can remove 1 jmp and 1 mov,-2.
Which is 7 bytes and 2 clocks.
But there is more important advantage:
We expect more probably normal execution,
so all these jcc in original proc more probably would be
taken. But they all are "first time" (and also only one time)
jccs. That means that dispite of they will be more probably
taken none of them will be predicted and each time (!)
they are taken all pipes will be flashed.
In opposite - if jcc will be taken only in error conditions
with normal execution all of them will be right predicted
(as not taken) and will not cost anything.

the other example of wrong organized contol block here:


35 cmp ecx, ebx
36 jne Loop_Start
37 jmp No_Match
......
50 inc eax
51 jmp isOut
; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
No_Match:
52 mov eax, 0
isOut:

There is no need for No_Match lbl and jumping over it.
it's obvious that 0 error code could be placed after line 36 before
unconditional jmp, and cause it would in such scenario already in eax
jmp may just be done out:


jne Loop_Start
xor eax,eax ;put it here also change to xor - 2 bytes instead of 5
jmp isOut
.....
inc eax
;no need jmp over - we can leave out that jmp
isOut:


to be continued...
Posted on 2002-04-02 18:08:22 by The Svin
This is good analysis Alex, I confess to only being interested in the loop code where the speed matters but the proc can probably be made slightly shorter by improving the entry and exit code.

Regards,

hutch@movsd.com
Posted on 2002-04-03 01:36:03 by hutch--

2.
27 xor ecx, ecx
28 add ecx, startpos ; add starting offset


Please forgive me if I'm wrong, but if you just change it to


add ecx, startpos


Won't you get all the junk values in ecx plus the startpos? I didn't see where ecx was xor'd or set to zero anywhere else (so doesn't it have junk values?)

Sliver

ps. I'm very intrested in this... So thanks for posting it
Posted on 2002-04-03 03:28:59 by Sliver
Sliver,
why not just
mov ecx,startpos
without xor ecx,ecx ;)
look at InString2
Posted on 2002-04-03 03:46:23 by The Svin
Thanks, Steve,
I'm sure you understand that mainly I took your proc just
as an excuse to talk of some general questions of algo
imlemenatations in asm. I could have taken different proc,
but I saw some auxilary advantages to take one from MASM32.
- Many beginners have MASM32 and may after this thread
some of them will be encoraged to take their own analizis of
m32lib and even if don't improve it they at least would understand
it better and it for sure will lead to better results in their coding.
- Anyway finally (in addition to education perpose) we make this
very proc shorter in size and slitely faster so it has also parctical use
multiplyed (in size and speed improvement) by number of projects
wich use InString.

;========================
Note5:
Very offten we have ability to generate some values in register
without using mov reg,constant
mov reg,constant is 5 bytes instruction so
xor eax, eax
instead of
mov eax,0
will save 3 bytes each time it used.
You can very fast look at size of any instruction without any
reference - just type it in debugger and you'll see
(if debugger opened it it would take 1-2 secs)
Almost all debuggers when you start typing in CPU window
take it as if you want to insert instruction, and time to know
opcode of any instruction with any operands = time to type
the instruction and operands.

in InString2 exit with error code -1 condition changed to jc.
so it gives ability to generate -1 by
sbb eax,eax ;eax-eax - CF CF=1 so eax-eax-1= -1
replacing mov eax,-1 by sbb eax,eax gives us 3 bytes saving
on just one instruction.

Note6:
Using results of operations instead of cmp for
jcc.
In InString-2 2 of 3 jcc for error checking done without cmp
Actually there are very often case when you don't need cmp
to know something about value or to compare to values:


from InString:
mov ecx, startpos
dec ecx
js @errm2 ;if ecx was 0 then SF=1

sub ebx,eax ;eax=pLen ebx=sLen if pLen > sLen then CF=1
inc ebx
jc @errm1 ;if ebx was < eax then CF=1

cmp ebx,ecx ;cmp startof last part +1 with start pos
jbe @errm2 ;lenth from start pos of source is less then pattern lenth


For the same things in InString is using cmp, the code above
elemenate a need for 2 of three those cmp decreasing size and
increasing speed.

Note 7: Not correct condition for jcc
have a look(InString2):



cmp eax, sLen
jl @F
mov eax, -1
jmp isOut ; exit if pattern longer than source
@@:

eax = lenth of pattern here, and sLen = len of source
the comment states "exit if pattern longer than source"
but condition for OK code is JL
that means though the author wished to exit with error code
on condition "exit if pattern longer than source" he puts condition
on wich code will exits also when pattern=source
cause
cmp eax,sLen
JL @F
means - go @F if sLen bigger (and not equal) pLen
Posted on 2002-04-03 05:00:20 by The Svin
The first stage of changing and explonation
is done so now we can "collect our coins".
Old procedure is 146 bytes size.
New one is just 106 bytes.
I kept wy word, the algo wasn't changed, all steps
remain, the only difference they are done another way.
To refresh our memory of methods to do so let's revise
what each one of them gives us.
1. We put


mov esi, lpSource ;edi = lpSource
mov edi, lpPattern ;esi = lpPattern

and change parameters to StrLen to:
	

invoke StrLen,esi
mov ebx,eax ;ebx = sLen
invoke StrLen,edi
mov pLen, eax

it gives us saving 4 bytes and 2 clocks
2. We put startpos in ecx in begining
so that we don't have need for the two instructions:


xor ecx, ecx
add ecx, startpos ; add starting offset

that saves us 5 bytes and 3 clocks
3. We reorganize exit for error code
and remove 1 jmp isOut, 1 mov,-2
that saves us 7 bytes and 2 clocks.
We also change mov eax,-1 to sbb eax,eax
that gives us 3 bytes.
Rearranging jcc for errors also gives us
saving from 27 to 48 clocks for three predicted jcc
in case of normal execution.

No don't be lazy, for your sake - look what else was done
and what advantage it gives.

Here are a lot of good programmers.
I'm sure that if some of them not just give improved code
but also give some analysis of what and why can be improved
a lot of beginners would be happy to learn.
There are still just few docs that learn such things in step by step
details.
I know many of you can do it better than me -
SO DO IT :)

Take any proc from M32lib - a lot of them has room for optimizations
Posted on 2002-04-03 06:19:32 by The Svin
The next step I planned to do was not to show some improvement now in main loop, but before it explain how the algo
works, what all the values in the main loop is about.
After having started writing the explonations, I had a simple idea but nobody could describe better how the algo works than the
author.
Steve, could you write a few words, please, the way you did about BM algo, now about how InString works?
Algo itself is always more important than implemetation, and
no core improvement can be done without deep understanding.
If you show to readers sceme of your algo it would be great help.
Posted on 2002-04-03 09:57:59 by The Svin
Alex,

The design of this algo is a simple one, it is a clasic byte scanner that searches for the first character in the pattern being searched for and when it finds that character, it branches to a seperate comparison to see if the rest of the pattern matches the location in the source.

Its non-error results are either the 1 based index of the pattern being searched for or zero if the pattern is not found. It produces minus values as error indicators.

The loop code uses ESI as the pointer to the address of the source data and ECX as the incremented index. The area where there is a speed increase is in the following code,


Loop_Start:
cmp al, [esi+ecx]
je Pre_Sub
inc ecx
cmp ecx, ebx
jne Loop_Start

If the length is added to ESI and the final value of ECX before the loop is sign inverted, the extra comparison CMP ECX, EBX can be avoided which brings the loop length down to 4 instructions.


Loop_Start:
cmp al, [esi+ecx]
je Pre_Sub
inc ecx
jnz Loop_Start

An unrolled version of this shorter loop would see a speed increase as well. Currently it does 1 comparison for 4 instructions, unrolling the loop would give 2 comparisons for 7 instructions with an alternative design using ADD ECX, 2 yielding 2 comparisons for 6 instructions. It would mean that there would have to be extra code for each jump to get the index right but it can be done. Unfortunately any gain here would probably be lost with extra penalties for branch prediction misses.

The branch code that does the seperate comparison could be improved with a similar approach to the address in EDI of the pattern being searched for but it probably would not yield any noticable improvement in the benchmark.

Regards,

hutch@movsd.com
Posted on 2002-04-03 16:07:38 by hutch--
Thanks, Steve. I remember time when you posted after
BM many very good quality routines, and style of them
was clear proof that you apply the same quality to InString.
It just somehow missed your attention.

To All,
I wish to offer you to have a more detailed look on one of interesting moment,
what is in ebx and why cmp ecx,ebx for exit condition.
Because understaning will lead us to one more obvious
improvement - removing one instruction.
for example we have string X as string and Y as pattern.
x = 0123456789
y = 0123
Steve uses sizeofX - sizeofY
to find one magic value
This value is index of element in x string so that
size of part of X from this element to end of X = size of Y.
Lets call the index a, element X and a = sizeofX-sizeofY
This element is in last position were it has sence to cmp Y pattern
with X part from. Next part (from X to X end) will be shorter
than pattern and for only that can not be equal.
in our example
sizeof x = 10
sizeof y = 4
10-4=6
so a = 6 x[6] is last element we have sence to cmp from
!!!!!!! this f*cking editor drives me mad!
how can position 0 of y string under 6 of x string?!


x 0123456789
y 0123


Steve uses value a+1 as chek for exit condition
when index pointer in ecx reach value that out of the limit X
we does in code operations



sub sLen, eax ; don't read past string end [a]
inc sLen ;[a+1]
mov ebx, sLen ;[a+1] in ebx
...
Loop_Start:
cmp al, [esi+ecx]
je Pre_Sub
inc ecx
cmp ecx, ebx
jne Loop_Start

jmp No_Match
...

May be you think why I talking to much of such an obvious
thing? :)
May be 'cause nobody yet said that there is no need for
inc sLen operation as no need to know a+1 value
'cause if ecx <= a it is OK or in other words
if ebx not < ecx it is OK :)
or


; inc sLen ;[a+1] remove this instruction
and just:

cmp ebx,ecx ;in ebx a
jnc Loop_Start


Now our optimized code has 105 bytes instead of 106 of previous and 146 of original
and speed it up a little:


InString proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD
LOCAL pLen:DWORD

push ebx
push esi
push edi
mov esi, lpSource
mov edi, lpPattern

invoke StrLen,esi
mov ebx,eax ;ebx = sLen
invoke StrLen,edi
mov pLen, eax

mov ecx,startpos
dec ecx
js @errm2

sub ebx,eax ;eax=pLen ebx=sLen if pLen > sLen then CF=1
; inc ebx ; we don't need it anymore
jc @errm1

cmp ebx,ecx ;cmp startof last part +1 with start pos
jc @errm2 ;lenth from start pos of source is less then pattern lenth

mov al,[edi]

jmp Loop_Start
;@@@@@@@@@@@@@

Pre_Loop:
pop ecx
inc ecx
Loop_Start:
cmp al,[esi+ecx]
je Pre_Sub
inc ecx
cmp ebx,ecx
jnc Loop_Start
xor eax,eax
jmp isOut
Pre_Sub:
push ecx
mov edx,pLen
mov edi,lpPattern
Sub_Loop:
mov ah,[esi+ecx]
cmp ah,[edi]
jne Pre_Loop
inc ecx
inc edi
dec edx
jnz Sub_Loop

pop eax
inc eax
isOut:
pop edi
pop esi
pop ebx

ret
@errm2:
mov eax,-2
jmp isOut
@errm1:
sbb eax,eax
jmp isOut

InString endp
Posted on 2002-04-03 17:58:00 by The Svin
Alex,

I took the original algorithm and put it into a test piece to play with and did a couple of the things I had in mind, the main one was to make a 4 instruction scan loop which should make the algo a bit faster.

I have not bothered much with the conditional code at the beginning but tidied up a few places where it was obvious.

This version should be a bit beter.

Regards,

hutch@movsd.com


; #########################################################################

InStringx proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD

; ------------------------------------------------------------------
; InString searches for a substring in a larger string and if it is
; found, it returns its position in eax.
;
; It uses a one (1) based character index (1st character is 1,
; 2nd is 2 etc...) for both the "StartPos" parameter and the returned
; character position.
;
; Return Values.
; If the function succeeds, it returns the 1 based index of the start
; of the substring.
; 0 = no match found
; -1 = substring same length or longer than main string
; -2 = "StartPos" parameter out of range (less than 1 or longer than
; main string)
; ------------------------------------------------------------------

LOCAL sLen:DWORD
LOCAL pLen:DWORD

push ebx
push esi
push edi

invoke StrLen,lpSource
mov sLen, eax ; source length
invoke StrLen,lpPattern
mov pLen, eax ; pattern length

cmp startpos, 1
jge @F
mov eax, -2
jmp isOut ; exit if startpos not 1 or greater
@@:

dec startpos ; correct from 1 to 0 based index

cmp eax, sLen
jl @F
mov eax, -1
jmp isOut ; exit if pattern longer than source
@@:

sub sLen, eax ; don't read past string end
inc sLen

mov ecx, sLen
cmp ecx, startpos
jg @F
mov eax, -2
jmp isOut ; exit if startpos is past end
@@:
; ----------------
; setup loop code
; ----------------
mov esi, lpSource
mov edi, lpPattern
mov al, [edi] ; get 1st char in pattern

add esi, ecx
neg ecx ; invert sign
add ecx, startpos ; add starting offset

jmp Loop_Start

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

Pre_Loop:
pop ecx ; restore ECX
inc ecx ; start on next byte

Loop_Start:
cmp al, [esi+ecx]
je Pre_Sub
inc ecx
jnz Loop_Start

jmp No_Match

Pre_Sub:
push ecx ; preserve ECX
mov edx, pLen
mov edi, lpPattern

Sub_Loop:
mov ah, [esi+ecx]
cmp ah, [edi]
jne Pre_Loop ; jump back on mismatch
inc edi
inc ecx
dec edx
jnz Sub_Loop ; fall through if match

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

pop ecx ; restore ECX
add ecx, sLen
mov eax, ecx ; match
inc eax
jmp isOut

No_Match:
xor eax, eax

isOut:

pop edi
pop esi
pop ebx

ret

InStringx endp

; ########################################################################


Alex,

One concession, PLEASE format you code with the "["code"]"----"["/code"]" so I can read it in the forum. :grin:
Posted on 2002-04-04 03:44:32 by hutch--
One concession, PLEASE format you code

:)
I am trying, sorry I missed last one somehow.


You seem haven't got time yet to study the posts above.
You proc again 40% bigger and a some slower than it supposed to be with
proper imlementation (e.i. algo unchanged but implemented right way)
I again try to keep it as pure optimization topic and don't change any
step of your new algo, I change only the way how those steps can be
done.
As to algo itself - there are also too weak part:

1.on start of Sub_Loop for the first iteration you check byte wich
is already cheked, it can be improved by putting inc registers in start of
loop with inc edx je - done as conditions.
2. you can apply using edx as both counter and pointer in Sub_Loop too.
As you did in first loop with ecx.
it eleminates need of mov edi,lpPattern each time and inc of edi.

If anyone is insterested I can show how to do it.
But I don't show it in this version for simple reason - I keep it as pure
implementation topic: "how to improve code without changing algo"
So that your last code and the following one can be taken as
to codes that were generated from the same HLL source but by
different compilers:

(Compare size and number and speed of instructions used to
replace your code)


InString proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD
; -- ----------------------------------------------------------------
; InString searches for a substring in a larger string and if it is
; found, it returns its position in eax. ;
; It uses a one (1) based character index (1st character is 1,
; 2nd is 2 etc...) for both the "StartPos" parameter and the returned
; character position. ; ; Return Values.
; If the function succeeds, it returns the 1 based index of the start
; of the substring. ; 0 = no match found
; -1 = substring same length or longer than main string
; -2 = "StartPos" parameter out of range (less than 1 or longer than
; main string)
; ------------------------------------------------------------------
LOCAL pLen:DWORD
push ebx
push esi
push edi

mov esi, lpSource
mov edi, lpPattern

invoke StrLen,esi
mov ebx, eax ; ebx = source length
invoke StrLen,edi
mov ecx,startpos ;ecx = startpos
mov pLen, eax ; pattern length
dec ecx ;ecx = startpos -1
js @errm2 ;startpos <= 0? yes - goto error -2
sub ebx,eax ;ebx= sLen - pLen
jc @errm1 ; if sLen < pLen goto error -1
lea esi,[esi][ebx][1] ; esi= address of part wich is < pLen
not ebx ;ebx = -(ebx+1)= - ((sLen - pLen)+1)
add ecx,ebx ; ecx = startpos -1 ebx = - ((sLen - pLen)+1)
;if ecx >= (sLen - pLen)+1 then
jns @errm2 ;ecx - ((sLen-pLen)+1) >= 0 and SF =0
mov al,[edi]
jmp Loop_Start
;@@@@@@@@@

Pre_Loop:
pop ecx ;restore ECX
inc ecx ;start on next byte
Loop_Start:
cmp al,[esi+ecx]
je Pre_Sub
inc ecx
jnz Loop_Start
xor eax,eax
jmp isOut
Pre_Sub:
push ecx ;preserve ECX
mov edx,pLen
mov edi,lpPattern
Sub_Loop:
mov ah,[esi+ecx]
cmp ah,[edi]
jne Pre_Loop
inc edi
inc ecx
dec edx
jnz Sub_Loop ; fall through if match
; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

pop eax
sub eax, ebx ;sub neg value = add positive
inc eax

isOut:
pop edi
pop esi
pop ebx
ret
@errm2:
mov eax,-2
jmp isOut
@errm1:
sbb eax,eax
jmp isOut
InString endp
Posted on 2002-04-04 09:32:27 by The Svin
Here is how to change algo itself to avoid duble checking first byte
wich is already checked.
Also no need for loading in edi each time and increasing edi.


InString proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD
; ------------------------------------------------------------------
; InString searches for a substring in a larger string and if it is
; found, it returns its position in eax. ;
; It uses a one (1) based character index (1st character is 1,
; 2nd is 2 etc...) for both the "StartPos" parameter and the returned
; character position. ; ; Return Values.
; If the function succeeds, it returns the 1 based index of the start
; of the substring. ; 0 = no match found
; -1 = substring same length or longer than main string
; -2 = "StartPos" parameter out of range (less than 1 or longer than
; main string)
; ------------------------------------------------------------------
LOCAL pLen:DWORD
push ebx
push esi
push edi

mov esi, lpSource
mov edi, lpPattern

invoke StrLen,esi
mov ebx, eax ; ebx = source length
invoke StrLen,edi
mov ecx,startpos ;ecx = startpos
add edi,eax ;edi=end of pattern
neg eax ;edi+eax = start of pattern
mov pLen, eax ; - (pattern length)
dec ecx ;ecx = startpos -1
js @errm2 ;startpos <= 0? yes - goto error -2
add ebx,eax ;ebx= sLen - pLen
js @errm1 ; if sLen < pLen goto error -1
lea esi,[esi][ebx][1] ; esi= address of part wich is < pLen
not ebx ;ebx = -(ebx+1)= - ((sLen - pLen)+1)
add ecx,ebx ; ecx = startpos -1 ebx = - ((sLen - pLen)+1)
;if ecx >= (sLen - pLen)+1 then
jns @errm2 ;ecx - ((sLen-pLen)+1) >= 0 and SF =0
mov al,[edi][eax] ;first byte
jmp Loop_Start
;@@@@@@@@@

Pre_Loop:
pop ecx ;restore ECX
inc ecx ;start on next byte
Loop_Start:
cmp al,[esi+ecx]
je Pre_Sub
inc ecx
jnz Loop_Start
xor eax,eax
jmp isOut
Pre_Sub:
push ecx ;preserve ECX
mov edx,pLen
Sub_Loop:
inc ecx
inc edx ;don't check fist byte
mov ah,[esi+ecx]
je found ;edi+0 = pattern end
cmp ah,[edi][edx]
je Sub_Loop
jmp Pre_Loop

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
found:
pop eax
sub eax, ebx ;sub neg value = add positive
inc eax

isOut:
pop edi
pop esi
pop ebx
ret
@errm2:
mov eax,-2
jmp isOut
@errm1:
mov eax,-1
jmp isOut
InString endp
end
Posted on 2002-04-04 10:45:03 by The Svin
I've just figured out that in last two version just 1 "-2 error" checking is needed for both error-2 condition :)
I wander if anybody esle can see why so and how to check it at once. ;)
Posted on 2002-04-04 13:15:33 by The Svin
Alex,

I had some time today to have a further play with the InString algo and one mod I tried was to start the comparison on the next byte but every version I tried fails under the condition of searching for a single byte that is at the end of the string.

src db "This is a test of a string algorithm",0
pat db "m",0

It is basically a good idea but this algo was designed particularly to handle anything that could be typed in and searching for a single byte is not uncommon with things like text search in zero terminated strings.

Regards,

hutch@movsd.com
Posted on 2002-04-10 22:57:31 by hutch--
Alex,

I have done some work on the InString algo which has made the match loop shorter. You will note that have not intergrated your suggestions yet as I have been concentrating on the loop code. I wanted a shorter matching loop so I placed the curent scan address in EBX and used ECX as an index so that the exit could be done with JNZ to save the extra compare.

The matching loop must match the first character as it is not uncommon with an algo of this type for a user to search for a single byte which may be at the end of the string being searched.

Regards,

hutch@movsd.com



; #########################################################################

InStringx proc startpos:DWORD,lpSource:DWORD,lpPattern:DWORD

; ------------------------------------------------------------------
; InString searches for a substring in a larger string and if it is
; found, it returns its position in eax.
;
; It uses a one (1) based character index (1st character is 1,
; 2nd is 2 etc...) for both the "StartPos" parameter and the returned
; character position.
;
; Return Values.
; If the function succeeds, it returns the 1 based index of the start
; of the substring.
; 0 = no match found
; -1 = substring same length or longer than main string
; -2 = "StartPos" parameter out of range (less than 1 or longer than
; main string)
; ------------------------------------------------------------------

LOCAL sLen:DWORD
LOCAL pLen:DWORD

push ebx
push esi
push edi

invoke StrLen,lpSource
mov sLen, eax ; source length
invoke StrLen,lpPattern
mov edx, eax ; pattern length

cmp startpos, 1
jge @F
mov eax, -2
jmp isOut ; exit if startpos not 1 or greater
@@:

dec startpos ; correct from 1 to 0 based index

cmp eax, sLen
jl @F
mov eax, -1
jmp isOut ; exit if pattern longer than source
@@:

sub sLen, eax ; don't read past string end
inc sLen

mov ecx, sLen
cmp ecx, startpos
jg @F
mov eax, -2
jmp isOut ; exit if startpos is past end
@@:
; ----------------
; setup loop code
; ----------------
mov esi, lpSource
mov edi, lpPattern
mov al, [edi] ; get 1st char in pattern
add edi, edx ; set EDI to value for match loop

add esi, ecx
neg ecx ; invert sign
add ecx, startpos ; add starting offset

jmp Scan_Loop

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

Pre_Scan:
pop ecx ; restore ECX
inc ecx ; start on next byte

Scan_Loop:
cmp al, [esi+ecx]
je Pre_Match
inc ecx
jnz Scan_Loop

jmp No_Match

Pre_Match:
push ecx ; preserve ECX
mov ebx, esi ; put current scan
add ebx, ecx ; location in EBX
mov ecx, edx ; put pattern length into ECX
add ebx, ecx ; add it to scan location
neg ecx ; invert sign for index

Test_Match:
mov ah, [ebx+ecx]
cmp ah, [edi+ecx]
jne Pre_Scan ; jump back on mismatch
inc ecx
jnz Test_Match

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

pop ecx ; restore ECX
add ecx, sLen
mov eax, ecx ; match
inc eax
jmp isOut

No_Match:
xor eax, eax

isOut:

pop edi
pop esi
pop ebx

ret

InStringx endp

; ########################################################################
Posted on 2002-04-11 06:30:39 by hutch--
looking at the code
i have few suggestions



push ebx
push esi
push edi

cmp startpos, 1

jge @F
mov eax, -2
jmp isOut ; exit if startpos not 1 or greater
@@:

dec startpos ; correct from 1 to 0 based index

invoke StrLen,lpPattern
mov edx, eax ; pattern length
invoke StrLen,lpSource
; mov sLen, eax ; source length




cmp edx, eax
jl @F
mov eax, -1
jmp isOut ; exit if pattern longer than source
@@:


mov sLen, eax ; source length

sub eax,edx
inc eax

cmp eax, startpos
jg @F
mov eax, -2
jmp isOut ; exit if startpos is past end
@@:
; ----------------
; setup loop code
; ----------------
mov esi, lpSource
mov edi, lpPattern
mov cl, [edi] ; get 1st char in pattern
add edi, edx ; set EDI to value for match loop

add esi, eax
neg eax ; invert sign
add eax, startpos ; add starting offset

;jmp Scan_Loop
;instead of the jump do

push edx

; @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@


Scan_Loop:
cmp cl, [esi+eax]
je Pre_Match
Pre_Scan:
inc eax
jnz Scan_Loop
; the loop set eax to set if not found

jmp @loopout

Pre_Match:
lea ebx,[esi+eax]
mov edx,[esp] ; get edx back instead pop and push
add ebx,edx
neg edx
Test_Match:
mov ch, [ebx+edx]
cmp ch, [edi+edx]
jne Pre_Scan ; jump back on mismatch
inc edx
jnz Test_Match

add eax,sLen
inc eax


@loopout:
add esp,4 ; instead of this . make local var and save edx there
isOut:
pop edi
pop esi
pop ebx



bye
eko
Posted on 2002-04-13 13:18:46 by eko
another thinking :



Pre_Match:

mov edx,[esp] ; get edx back instead pop and push
lea ebx,[esi+eax]
Test_Match:
mov ch, [edi+edx]
cmp ch, [ebx+edx]
jne Pre_Scan ; jump back on mismatch
dec edx
jnz Test_Match
Posted on 2002-04-13 15:17:13 by eko
eko,

Compliments on your suggestion, you have some good coding techniques in it. I will have a play with it in this algo.

Regards,

hutch@movsd.com

Later : The only problem is that the algo does not work in this form, it just returns -1
Posted on 2002-04-13 17:01:08 by hutch--