Hi all!
Yesterday I was bored and coded this little routine...
I think most people here know the strncmp function from the C library...
For those who don't, it compares only the first n characters of two strings rather than the whole strings...
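
For readers who haven't used the C function, here is a small sketch of what "compare only some characters" means in practice. The names has_prefix_n and check_prefix_examples are made up for this illustration; only strncmp itself comes from the C library.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when the first n characters of a and b are equal, else 0.
   has_prefix_n is a hypothetical helper name, not a library function. */
int has_prefix_n(const char *a, const char *b, size_t n)
{
    return strncmp(a, b, n) == 0;
}

/* A few checks showing the effect of the length argument. */
void check_prefix_examples(void)
{
    assert(has_prefix_n("hello world", "hello there", 5));   /* "hello" == "hello" */
    assert(!has_prefix_n("hello world", "hello there", 7));  /* 'w' != 't' at index 6 */
    assert(has_prefix_n("abc", "abc", 10));                  /* libc stops at the NUL */
}
```

Note the last check: the library version stops at the terminating NUL even when n is larger than the strings.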

It's fairly compact: the main code (not counting the overhead caused by "invoke") is 11 bytes.
It also seems to be "reasonably fast".

Warning: it does NOT check for the null terminator... because in most cases that's totally useless.

Suggestions, advice, and optimizations are welcome!

Hutch, if you want to include it in the MASM32 library, you are welcome to. :)

strncmp proto :DWORD, :DWORD, :DWORD

strncmp proc str1:DWORD, str2:DWORD, len:DWORD
;ESI : address of the first string
;EDI : address of the second string
;EAX : number of characters to compare

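;EAX : result of the comparison
; EAX == 0 : the strings matched
; EAX != 0 : the strings mismatched (EAX = characters left, including the mismatching one)
;ESI : address of the mismatching character in str1 (one past the last compared character on a match)
;EDI : address of the mismatching character in str2 (one past the last compared character on a match)
;DH : the last letter of str1 to have been compared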

mov esi, dword ptr str1
mov edi, dword ptr str2
mov eax, dword ptr len

@@:
mov dh, byte ptr [esi]
cmp dh, byte ptr [edi]
jne @F
inc esi
inc edi
dec eax
jnz @B
@@:
ret

strncmp endp

Cya ;).
Posted on 2001-10-12 10:05:14 by JCP
hmmm, your version doesn't conform to the libc version of strncmp...

< 0 string1 substring less than string2 substring
0 string1 substring identical to string2 substring
> 0 string1 substring greater than string2 substring

Perhaps you can load al from string1, sub the byte from string2,
and do compare that way? It should satisfy the demand about
return value...

and you really ought to do some NUL char checking.

or, well, I guess you don't :). Not in the current form anyway. Silly me.
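
As a reference point for the return-value convention above, here is a minimal C sketch of libc-style behaviour. The name my_strncmp is hypothetical, and this is not any particular library's implementation, just the documented semantics: compare as unsigned bytes, return the signed difference, and stop at a NUL.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of libc-style strncmp semantics; my_strncmp is a made-up name. */
int my_strncmp(const char *s1, const char *s2, size_t n)
{
    while (n-- > 0) {
        unsigned char c1 = (unsigned char)*s1++;
        unsigned char c2 = (unsigned char)*s2++;
        if (c1 != c2)
            return (int)c1 - (int)c2;  /* sign tells which string is "less" */
        if (c1 == '\0')
            return 0;                  /* both strings ended: equal, stop reading */
    }
    return 0;                          /* first n characters all matched */
}

void check_my_strncmp(void)
{
    assert(my_strncmp("abc", "abd", 3) < 0);
    assert(my_strncmp("abd", "abc", 3) > 0);
    assert(my_strncmp("abc", "abd", 2) == 0);    /* difference lies beyond n */
    assert(my_strncmp("abc", "abc", 100) == 0);  /* NUL check prevents overrun */
}
```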
Posted on 2001-10-12 11:08:47 by f0dder
The actual definition of strncmp states that the return value is:
<0 if string1 is less than string2
0 if string1 is equal to string2
>0 if string1 is greater than string2

It's easy to get this with code very similar to yours, though!

strncmp proc str1:DWORD, str2:DWORD, len:DWORD
mov esi, dword ptr str1
mov edi, dword ptr str2
mov edx, dword ptr len
xor eax, eax

@@:
mov al, byte ptr [esi]
sub al, byte ptr [edi]
jne @F
inc esi
inc edi
dec edx
jnz @B
@@:
ret

strncmp endp

I think that should do it, but you may need to swap esi & edi around (I've been down the pub and it's Friday, so getting the strings the right way around is the last thing on my mind)!

Posted on 2001-10-12 11:16:36 by Mirno
Damn you fodder, damn you and all your kind.

If only my stubby little fingers were that little bit quicker, and my brain was a little less spongy.

Posted on 2001-10-12 11:18:08 by Mirno
hehe Mirno :). But I'm afraid there might be some problems with
your code (I'm a bit distracted right now, so I might be totally wrong).

First, what happens if the two strings are equal, and of the same length?
When you sub NUL from NUL, you get NUL, and thus, won't the
loop go on, comparing garbage? Of course this isn't a problem if
you set "len" correctly...

Also, since return values are almost always checked as DWORDs,
shouldn't you add a "movsx eax, al" near the exit point?
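
To make the sign-extension point concrete, here is a small C model of the two exit paths. The function names are invented for this sketch, and the arithmetic only imitates what the 8-bit "sub al, ..." and "movsx eax, al" instructions do.

```c
#include <assert.h>
#include <stdint.h>

/* "xor eax, eax" then "mov al, a / sub al, b": only the low byte changes,
   so EAX ends up in the range 0..255. */
int32_t result_without_movsx(uint8_t a, uint8_t b)
{
    uint8_t al = (uint8_t)(a - b);   /* 8-bit wraparound, like sub al, ... */
    return (int32_t)al;
}

/* Same subtraction followed by "movsx eax, al": bit 7 of AL is copied
   into the upper 24 bits, giving a value in -128..127. */
int32_t result_with_movsx(uint8_t a, uint8_t b)
{
    uint8_t al = (uint8_t)(a - b);
    return al >= 128 ? (int32_t)al - 256 : (int32_t)al;
}

void check_sign_extension(void)
{
    assert(result_without_movsx('a', 'b') == 255);  /* looks "greater" as a DWORD */
    assert(result_with_movsx('a', 'b') == -1);      /* correctly "less" */
    assert(result_with_movsx('b', 'a') == 1);
}
```

A caller testing the full DWORD for "less than zero" would never see a negative result without the sign extension.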
Posted on 2001-10-12 11:56:48 by f0dder
Damn you once again, you accursed beastie.

The only reference I've got here at the mo doesn't state explicitly what to do in cases where length is greater than the string lengths. So I'm not sure (or in a position to test it).

The sign extension thing is right tho....

Try #2:

strncmp proc str1:DWORD, str2:DWORD, len:DWORD
xchg esi, str1 ;Not sure if this is better than a push, mov, ..., pop
xchg edi, str2 ;Can't be bothered to check instruction lengths...
mov edx, len ;Friday is no day to be doing thinking of any kind....

add esi, edx
add edi, edx
neg edx

@@:
mov al, byte ptr [esi + edx]
sub al, byte ptr [edi + edx]
jne @F
inc edx
jnz @B

@@:
movsx eax, al
mov esi, str1
mov edi, str2
ret
strncmp endp

Posted on 2001-10-12 12:14:29 by Mirno
"xchg esi, " takes three bytes, so does "mov esi, ".
push/pop variants are one byte. Damn me again? ;)
Posted on 2001-10-12 12:24:06 by f0dder
Let's see... if you replace edi references with ecx you could save the push & pop of the edi register & still be win compliant.
Posted on 2001-10-12 16:07:25 by rafe
I won't damn you so fast fodder!

xchg esi, str1
mov esi, str1 ;As str1 now contains the original esi

;will replace

push esi
mov esi, str1
pop esi

So it costs 1 byte to save 1 instruction, and my point / question was: which is faster? I have since got off my fat lazy arse, and it seems xchg is a VERY crappy instruction when used with mem (reg-reg ain't bad, though)! So a push-pop solution is better (unless you are really stingy with stack space of course :P ).

Another point I thought of last night was that "char a" - "char b" could be > 127, in which case the result will have the wrong sign....
Given that it's a string compare rather than a byte compare, is this particularly important? Not too big a deal to fix, though.
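
A quick C model of that overflow, plus one possible fix. The function names are invented for this sketch, and "widen first, then subtract" is just one way to do it (in assembly, for example, a movzx on each byte before a 32-bit sub), not necessarily what anyone in the thread had in mind.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sub al, ..." followed by "movsx eax, al". */
int32_t byte_diff_movsx(uint8_t a, uint8_t b)
{
    uint8_t al = (uint8_t)(a - b);
    return al >= 128 ? (int32_t)al - 256 : (int32_t)al;
}

/* Models widening both bytes first and subtracting in 32 bits,
   so the result is always in -255..255 with the right sign. */
int32_t byte_diff_widened(uint8_t a, uint8_t b)
{
    return (int32_t)a - (int32_t)b;
}

void check_wide_difference(void)
{
    assert(byte_diff_movsx(200, 10) == -66);    /* 200 > 10, yet negative */
    assert(byte_diff_movsx(10, 200) == 66);     /* 10 < 200, yet positive */
    assert(byte_diff_widened(200, 10) == 190);
    assert(byte_diff_widened(10, 200) == -190);
}
```

With plain 7-bit ASCII text the difference never exceeds 127, so the problem only shows up with characters above 127.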

Posted on 2001-10-13 05:53:12 by Mirno
mov edi,lpString1
mov ecx,length
mov esi,lpString2
dec ecx
@@: mov al,
cmp ,al
jne notequal
dec ecx
jns @B
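
The two memory operands in the snippet above are missing. Assuming they index both strings by ecx (an assumption, since the operands are not shown), the loop walks from the last character to the first, which could be modelled in C roughly like this. equal_backwards is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the first len bytes of s1 and s2 are equal, else 0.
   Assumes len >= 1: the assembly loop always runs its body at least once. */
int equal_backwards(const char *s1, const char *s2, size_t len)
{
    /* A signed index mirrors ecx: "dec ecx / jns @B" loops while it stays >= 0. */
    for (ptrdiff_t i = (ptrdiff_t)len - 1; i >= 0; --i) {
        if (s1[i] != s2[i])
            return 0;   /* the "notequal" path */
    }
    return 1;
}

void check_equal_backwards(void)
{
    assert(equal_backwards("abcd", "abcd", 4) == 1);
    assert(equal_backwards("abcd", "abxd", 4) == 0);
    assert(equal_backwards("abcd", "abxd", 2) == 1);  /* difference lies beyond len */
}
```

Because it starts at the end, a loop like this can only answer "equal or not"; it cannot say which string sorts first.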

Posted on 2001-10-14 02:43:35 by The Svin
Sorry guys, but why don't you ever use opcodes like scasb or loop?
I mean, are they really slower than comparing? All the time, or just in some cases?
Posted on 2001-10-15 17:39:37 by magicmac

The old instructions still work, but Intel have progressively changed the inner workings of their processors to favour a simple, RISC-like subset of the instruction set, and code written in that preferred subset generally runs a lot faster than code using the older instructions.

LOOP is particularly slow; you can easily improve the speed of a loop by doing a comparison and a conditional jump instead.


Posted on 2001-10-15 18:27:36 by hutch--
comparison and conditional jump to replace loop... why not just
"dec ecx" followed by "jnz label" ?
Posted on 2001-10-15 18:44:59 by f0dder
Hutch: OK, but whenever you code a program, do you look up (or keep in mind) which instructions run faster?
I know that would be the best thing to do, but, for example, is the first snippet slower than the second one?

mov cx, 1000
sarlanga:
inc ax
loop sarlanga

mov cx, 1000
sarlanga:
inc ax
dec cx
jnz sarlanga

If the first snippet is slower, is that also true when you use lodsb or scasb? Do you mean that it is better to manually transfer the data to a register (mov), compare it with a value, increment esi, decrement ecx, and repeat if not zero?

Thanks to both of you guys.
Posted on 2001-10-15 22:12:24 by magicmac
magicmac, get the intel manuals and look at the instruction clock
cycles :). It's helpful to get manuals for multiple processors, so you
see what is "generally" good and what is good on the newer processors.
www.agner.org is pretty helpful as well.
Posted on 2001-10-15 22:22:32 by f0dder
Ok, thanks again, f0dder...

Thanks. Danke. Gracias. Obrigado. Gratzie. Shishi. Samatushi. Bingo Bongo. Dunga Dunga.

-MagicMac. :alright:
Posted on 2001-10-15 23:52:48 by magicmac
Hutch snuck a table of useful instruction replacements into the masm help files (not sure which service pack). You have to choose "optimizations" from the masm TOC and not from the main help menu. It's a start, for the time before you fully digest the Intel & Agner docs.
Posted on 2001-10-16 09:56:12 by rafe

A couple of things: your example is in 16-bit code, and in many instances the older instructions are faster in 16-bit code and usually shorter as well.

If you are writing 32-bit code with 32-bit registers, try to avoid the older instructions, as in most instances they are not as fast. The two exceptions are MOVSD and STOSD when used with the REP prefix. They have been optimised reasonably well in current Intel and AMD processors.


Posted on 2001-10-16 10:30:57 by hutch--
Thanks a lot, fellows. :)
Posted on 2001-10-16 10:51:25 by magicmac