Hi
Does anyone know the fastest way to compare a string? Im not after anythin fancy like reversing strings or anything..
Posted on 2001-10-10 05:37:25 by Kezza
Kezza,

Its pretty simple stuff, load the addresses of both strings into 2 registers, scan and compare them one char at a time and exit under the condition you require, set length, mismatch or zero terminator.

In pseudo code,

mov esi, data1
mov edi, data2

label:
mov al,
mov ah,
inc esi
inc edi
cmp al, ah
jne outlabel ; mismatch
cmp al, 0 ; zero terminator
jne label

; if here, match occurred

outlabel:

The pseudo code tests for mismatch and zero terminator. You can also write it for a set length as well.

Regards,

hutch@pbq.com.au
Posted on 2001-10-10 06:16:47 by hutch--
Hutch,
The pseudo-code brings up a question for me. Using ah right after al will it cause a stall? I've been wanting more registers in the worst sort of way & well this gives me some hope at the byte level at least..

thanks

PS: Yes, I know I need to finish reading Agner's doc...
Posted on 2001-10-10 10:07:17 by rafe
rafe, it will not stall on a P6 arch processor as it uses dynamic processor renaming (in fact a different register is used after every modification to the named one! So eax is not in the same location physically on the chip after inc eax).
The only possibility of a stall comes from the fact that al is not actually a subset of eax, and so


mov eax, 44332211h
mov BYTE PTR [ebx], ah ; This will stall

mov eax, 44332200h
mov al, 11h
cmp eax, 44332211h ; This will stall


Hutch's code only uses results from in sub registers, and doesn't mix them (which is where the danger lies).
According to the Intel docs, it can take up to 7 cycles for a write to "retire" (ie. finally merge with the rest of the register), which means a theoritical 21 instructions can get through the processor in this time!

Mirno
Posted on 2001-10-10 10:27:51 by Mirno
Thanks Mirno! & this makes my day :) because it will relax some of the constraints I've been struggling with.
Posted on 2001-10-10 10:48:44 by rafe
Oops... not so fast... this is bad news for the following:

mov cl, [edi]
mov bl, [ecx + CaseTab] ;Stall?
mov cl, [esi]
cmp bl, [ecx + CaseTab] ;Stall?
jne strNE
inc esi
inc edi

As I read your post I must recode to space out the cl/ecx usage.
Posted on 2001-10-10 11:00:48 by rafe
In theory yes, but it depends on what the value of ecx is to start with. The Intel engineers realising the problem provides a special case for zeroing registers. If xor ecx, ecx (or sub ecx, ecx) a special (internal) flag is set so the processor knows that the rest of the value of ecx is irrelevant.

So if you add an xor ecx, ecx before the movs (if it is feasable) it will not stall, and as it is only a 1u-op instruction it should blend in with other code fairly easily (even if it doesn't it'll save time because it isn't stalled! The only thing you loose is the instruction size).

Mirno
Posted on 2001-10-10 11:28:44 by Mirno
ahhh thanks again. shrink code i will do.
Posted on 2001-10-10 11:39:02 by rafe
Rafe,

I have not digested the algo but you are in trouble with the partial register write in CL and the following read from ECX. As Mirno suggested the Intel optimisation where you can use it is either XOR or SUB and this will prevent the stall.

I would be inclined to lay the algo out again with a different register usage to avoid the problem if possible. As far as using the low and high byte of a register, I have never had a problem with doing so over most processors of the last few years and I have tested the process using different registers to make sure. It certainly solves the problem of BYTE size register usage.

Regards,

hutch@pbq.com.au
Posted on 2001-10-10 20:21:43 by hutch--
Your advice will be taken as the alpha definitely has "code" problems. This is much needed info. Efficient algos I can design efficient code (why I use asm) I'm still learning.

I only posted a partial code segment & the xor was done before the top of the loop. That doesn't excuse the sloppy code because I was lucky & not good in this little bit of engineering & I'm trying to remove luck from the equation. Lesson learned now to learn many more.

Thanks again to both of you:alright:
Posted on 2001-10-11 10:02:56 by rafe
hi i have the following code but for some reason if the text in the two edit boxes are different.. my message pops up which is good but it pops up twice??? can anyone tell me why and how to fix it? thanx

.elseif ax==IDB_CA
invoke PostQuitMessage,NULL
.elseif ax==IDB_CHANGE
invoke GetWindowText, hwndEdit2, OFFSET PassBuffer, 20
invoke GetWindowText, hwndEdit3, OFFSET PassBuffer2, 20
mov esi, OFFSET PassBuffer
mov edi, OFFSET PassBuffer2

compareLoop:
mov al,
mov ah,
inc esi ;Goto the next character
inc edi ;Goto the next character
cmp al, ah ;Compare the 2 characters
jne Different ;Jump if the characters being compared are not equal
cmp al, 0 ;Compare to see if the character is 0, because all strings end with 0
jne compareLoop ;If the character is not 0, then goto the start of the loop
invoke SetWindowText,hwndEdit,ADDR TxtEqual
jmp Continue

Different:
invoke MessageBox, hWnd, ADDR MessageBoxError, ADDR AppName, MB_OK + MB_ICONWARNING
jmp Continue

Continue:
Posted on 2001-10-12 01:57:34 by Kezza
It must be ran twice. Do you have any code after 'Continue' which modifies the text in any editboxes?
Posted on 2001-10-12 02:02:04 by gliptic
Why does it have to be run twice? i only want it run once because getting two msgboxes is annoying!!
and no there is no code after continue:
Posted on 2001-10-12 02:04:43 by Kezza
I didn't mean it had to be run twice, I meant I think it does run twice.
Posted on 2001-10-12 02:08:19 by gliptic
hey ok so do u know how to make it only run once?
Posted on 2001-10-12 02:13:05 by Kezza
No. But you should do a check to see which editbox is edited before you run the code.
Posted on 2001-10-12 02:14:57 by gliptic
Hi there,

if AMD or Intel CPU's has really a RISC Core than
this two varaints are propably fast.


Using 16bit at once.
Each string must be terminated by 0, which is,
aligned to WORD, and padded with 0.




lea esi, str1
lea edi, str2

loop: mov ax, [esi]
lea esi, [esi + 2]
mov bx, [edi]
lea edi, [edi + 2]

and ax, 0DFDF ; delete these two lines if
and bx, 0DFDF ; case-sensitve compare nedded

test ah, ah
je endMarker

cmp ax, bx
je loop

rte ; not equal

endMarker:
cmp ax, bx
je equal
rte ; not equal



equal:
rte






Another one would be 32bit at once, but
each string must has to be termnated by 4 Zero's
or say better aligned to DWORD, and padded by Zeros.



lea esi, str1
lea edi, str2

loop: mov eax, [esi]
lea esi, [esi + 4]
mov ebx, [edi]
lea edi, [edi + 4]


and eax, 0DFDFDFDF ; delete these two lines if
and ebx, 0DFDFDFDF ; case-sensitve compare nedded


test eax, 0ff000000h
je endMarker

cmp eax, ebx
je loop

rte ; not equal


endMarker:
cmp eax, ebx
je equal
rte ; not equal



equal:
rte

Posted on 2001-10-26 05:35:08 by marsface
:)
Looking through all your code I keep wondering:
ARE
'pass',0
'pssword',0
THE SAME?

;)
Posted on 2001-10-29 06:52:13 by The Svin
Hi The Svin,

i suppose you mean 'pass' and 'password' ?

Ok let's make some debug with 16 bit at once
Strings has to be terminated by at least 2 zeros.



str1: 70-61-73-73-00-00
str2: 70-61-73-73-77-6F-72-64-00-00


1st pass:

ax=6170
bx=6170

equal-> get next 16bit

2nd pass:
ax=7373
bx=7373

equal-> get next 16bit

3rd pass:
ax=0000
bx=6F77

ah=0 -> endMarker reached -> cmp ax,bx -> not equal

the loop termnates.



If you mean 'pass' and 'pssword' then the
1st compare says that its not equal.




1st pass:

ax=6170
bx=7370

not equal-> leave

Posted on 2001-10-29 09:23:28 by marsface
I see the problem.

There should be some flags set when leaving,
such as

xor eax,eax; if not equal
ret


mov eax,1 ; if equal
ret
Posted on 2001-10-30 07:45:01 by marsface