Say I have two pointers to two strings, and I wish to do a search in string 1 for string 2. Trivial for exact match, but is there some nice technique for doing it case insensative besides a grunt work check if in range for A-Z?
Note either strings may hold any legal character (nubmer, punctuation, control, etc), so just ignoring the 6th bit is not acceptable (one rairly wants '[' to exactly match '{' ).
It's quite simple: bring them to the same case and then
compare them.
It can be done dinamically in one loop or divided in two parts
to convert first and then compare.
Using appropriate way is havilly dependable on your conditions
Do you care of both or one of values, how long the strings
suppous to be ect. ect.
I'm pro asm+database programmer and in my job there a lot
of formation and copmaring tasks, after having been looking quite a while for proc major strategy I began developing teq.
using tables and copmlex index addressing - for my job in
lots of cases it became the way to get shortest and fastest result. If you use big table just for one proc it might seem
quit memory consuming, but when you write a set of procs, and
almost all of them you use in fairly big project then size
of tables is almost nothing, you just put them in global section
making them available for any proc, and writing procs passing
them as parameter address of table.
For example you can use the same table for UCASE conversions
caseunsencetive search and comparing ect(upto 100 procs!).
Create table of uninterraptive array and replace value of
members UCASE with lowcase or in opposit. Then you don't need
to check each time if it casesomething.
Most universal whay with this will something like.
edi pointer of one string
esi pointer to another
some where before the loop
xor eax,eax
xor edx,edx
@@:
in loop
mov al,
mov dl,
mov al,tbl
mov dl,tbl
inc edi
cmp al,dl
jne notequ
inc esi
dec ecx
jne @B
Please note: it takes ~ 5-7 clocks for byte (iteration)
and rep scasb 12once and + 4 byte for each byte comparasion
So it will be fastest whay you never get using any other teq
without table. In addition a lot of M32LIB procs useless in other
not English speaking countries just for it use ULCASE proc with
just first halve of ASCII table and other halve is national
relative, but if you replace those procs with ones using tables
they can be used there programmers needs just make their country
seciffic table and pass pointer of them to the procs.
The Svin.
Ernie,
I know how it can be done but I don't consider that it is
optimal code, case sensitive search is reasonably straight
forward and what can be done easily enough is to convert the
text to one case and do the comparison that way. It means an
extra pass on the information being searched but case comversion
is genuinely fast so it may not be a problem.
I have seen a Boyer Moore written by Jeremy Collake that may do
the job but I don't know if it has the precision you require. Also
I think from memory, Jeremy's algos only worked in absolute addresses,
not zero based offsets in the string to be searched.
Regards,
hutch@pbq.com.au
invoke strcmpi,addr string1,addr string2
why reinvent the wheel?
umbongo========================================
invoke strcmpi,addr string1,addr string2
========================================
why reinvent the wheel?
Not a bad point actually, it only matters if its speed critical
but I know Ernie is a genuine speed freak. :)
Regards,
hutch@pbq.com.au
please be careful with manipulating single byte strings,
and think to the german or arabic people, which will get
problems with there special characters.
Windows function work (mostly :)) proper.
Sure, I like speed, I find it annoying to watch my system thrash for 30 secs just to read the HELP file.
In this case, it's a recursive "find in files" text search. The app target is W9x ASCII. I wouldn't bother trying an international app without full unicode support, in which case I'd probably be stuck with the text API's for comparison.
But in my case it's straight ascii comparing bytes. I've already done a few macros to do inline text cat and copy for output format (the case sensative case is already in the can).
Since I'm checking every single character in the file, simple is better. The longest single task will be this search, and I was looking for something tight and light.
Umm, yeah I know about lstrcmpi. It's works. I just want something faster.
-----------------------------
"I saw weird stuff in that place last night.
Weird, strange, sick, twisted, eerie, godless, evil stuff.
And I want in."
Sure, I like speed, I find it annoying to watch my system thrash for 30 secs just to read the HELP file.
You want my system at work, they have a policy of running anti-virus software on every single file. I mean everything, .hlp .html .tmp - the lot, it takes 2 minutes to open the VC++ help file!!!!!!
If you want to optimise anything call norton and work for them......
Umbongo
This message was edited by umbongo, on 3/23/2001 1:05:00 PMI have never used any virus protection software in the last twenty years :) Only lost two hardrives: first one wasn't backed-up - spent a month getting the source code off of it. :(
I don't think I'll ever use virus protection software - like I don't get a flu shot every year. I do use condoms though :) - err, mostly :)
Excute me while I knock on wood :)
bitRAKE
Ernie,
This library module is more or less hot off the press but it is testing up
OK so far and you will find it will rip the titz of API calls.
Now the down side is that its case sensitive search and it has no error
checking built into it so if you pass an incorrect parameter, it
will crash. In particular, specifying the starting position to search
from at a position past the source length minus the search string
length will cause a page fault.
In the last service pack, the 2 case conversion modules were made
a lot faster so converting a buffer to uppercase then searching it
would probably be very fast as the case conversion would get it into
cache.
Regards,
hutch@pbq.com.au
; #########################################################################
.386
.model flat, stdcall
option casemap :none ; case sensitive
.code
; #########################################################################
bSearch proc StartPos:DWORD,lpString:DWORD,lnStrng:DWORD,
lpSubStr:DWORD,lnSubSt:DWORD
push ebx
push esi
push edi
mov esi, lpSubStr
mov ah, ; 1st byte in AH
mov esi, lpString ; source in ESI
add esi, StartPos
mov edx, lnStrng ; len in EDX
lea edx, ; add esi to edx + 1
sub edx, lnSubSt ; subtract substr len from source len
; ---------------------- Main Loop --------------------------
bsSt:
mov al,
inc esi
cmp al, ah ; cmp byte to start character
je sbLoop ; jump to subloop if equal
cmp esi, edx ; test exit condition
jne bsSt ; jump back if not equal
mov eax, -1 ; -1 = no match
jmp bsOut
; -----------------------------------------------------------
sbLoop:
push esi ; source offset is in ESI
dec esi ; start comparison from correct position
mov edi, lpSubStr ; substring offset in EDI
mov ecx, lnSubSt ; substr len in ECX
; ---------------------- Sub Loop ---------------------------
cmSt:
mov al, ; read source byte
mov bl, ; read search byte
inc esi
inc edi
cmp al, bl ; test byte
jne cmOut ; exit subloop if not matching
dec ecx ; decrement counter
jnz cmSt ; return to compare next byte if ECX not zero
; -----------------------------------------------------------
; if ECX = 0 fall through to following instruction
; ---------------------- If Match ---------------------------
pop esi ; restore esi
dec esi ; correct to get true position
sub esi, lpString ; sub start address from esi
mov eax, esi ; copy to eax as return value
jmp bsOut
; ------------------------ Else -----------------------------
cmOut:
pop esi ; restore ESI
jmp bsSt ; return to start
bsOut:
pop edi
pop esi
pop ebx
ret
bSearch endp
; #########################################################################
end
This message was edited by hutch--, on 3/23/2001 5:44:32 PMThere is no point to convert all before the hand.
Look, try, clock it and think step by step about different scenarious in different procs...
It's very hard for me explain code in Engilsh, a lot more diffical than write it.
You must excuse it, please.
.data
;If English 128 for Russian and other lang - 256 bytes tables
uctbl db 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
db 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
db 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47
db 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63
db 64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
db 80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95
db 96,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
db 80,81,82,83,84,85,86,87,88,89,90,123,124,125,126,127
.code
.......
invoke Strlen,addr Str1
invoke CUStrcmp,addr Str1, addr Str2,eax,addr uctbl
or eax,eax ; if Str1 and Str2 are equal caseuns.?
......
invoke Ucase,addr OtherStr,addr uctbl ;you may use the table for a lot of procs passing addr
......
invoke CUInstr,addr OneMoreStr,addr ToSearchStr,index,addr uctbl
CUStrcmp proc lpstr1:DWORD,lpstr2:DWORD,lenth:DWORD,tbladrr:DWORD
push edi
push ebx
push esi
xor eax,eax
xor edx,edx
mov ecx,lenth
mov ebx,tbladdr
mov esi,lpstr1
mov edi,lpstr2
@@: mov al,
mov dl,
mov al,
mov dl,
inc esi
inc edi
cmp al,dl
jne @F
dec ecx
jne @B
mov eax,1 ;TRUE - equal
jmp done
@@: xor eax,eax
done: pop esi
pop ebx
pop edi
ret
CUStrcmp endp
Ucase proc lpszStr:DWORD,tbladdr:DWORD
xor eax,eax
mov edx,lpszStr
mov ecx,tbladdr
@@: mov al,
mov al,
mov ,al
inc edx
or al,al ;if end of asciiz string
jne @B
ret
Ucase endp
InStrCU
CUInStr proc ...
Steve can write himself or I can if anybody wants :)
The Svin
(added code tags -- ernie)
This message was edited by Ernie, on 3/24/2001 2:30:40 AMAlex,
I have just learnt something useful when using tables although in
a slightly different context, it is very important to have the table
aligned even when you are making BYTE reads from it. I worked on
an algorithm with a friend of mine that uses XLATB and got a wide
range of speeds depending on the alignment of the 256 byte table,
even though XLATB reads in BYTE size.
Regards,
hutch@pbq.com.au
Hmmm... I didn't get it :)
How array of bytes can be aligned?
For example if you alinged array tbl by 4 then tbl[0] tbl[4] tbl .... would be alinged
but tbl[1],[2],[3],[5],[6].... will be missaligned.
The same with even - only one address of two can be even.
You can't aling addresses of all bytes in byte array be 2 or 4.
Did I miss something?
That point about tables needing to be aligned - it is quite true.
The only reason I can think is, even though the values themselves are read as bytes - the hardware doesn't!
For each memory access, a value is read using the full bus width (ie a DWORD), regardless of whether this is to used as a byte, word, or dword
once the dword is read - then it is masked and shifted accordingly if needed to fit the instruction
So, the first byte read, in an aligned table will still need to be masked and shifted, but once this is done - the next byte to be read is already in the pipeline and has already been read
so this is quick (and so for the next two bytes after that)
But if the table is misaligned then this 'cached-bytes' effect is partly lost, since the loss on the first byte isn't made up by the next three
Although, I'm not sure how much impact this will cause for byte tables, since only the first (hardware)dword is misaligned.
But for word and dword tables it can be very bad, since usually two dword fetches will be required just to fetch a single (soft)dword
(...which would be like getting a chache-miss on every second read - ouch!)
As far as I know, reads are done at full bus width, and only selected bytes are used (via the Byte Enables).
The same is true of writes in reverse, but the rest of the data is discarded.
Because the data is read anyway, it is usually cached, as there is a good cost/performance ratio, and it is also usually the case that there will be subsequent memory accesses (such as tables).
Mirno
The Intel cache is pretty strange to me.
Instead of a single cache, it's actually a bunch of little caches, each four levels deep (has this changed in the latest chips?) So it's extremely sensitive to access patterns.
In the 486, the bad number was 2048. If you accessed data that was a multiple of 2K apart, you quickly flushed out the mini-cache.
The 486 was worse than the Pentium because code and data bytes were mixed together in the L1 mini-caches. A loop with two levels of subroutine calls spaced 2K apart (three code sections), add in a couple of data accesses 2K apart from each other and the code -- and you get a lot of cache misses.
Even worse is the 4K page size used for VM (virtual memory). If you have lots of RAM, this won't be a problem. But if you don't, you can have a similar scenario (with the bad number 4096) and get thrashing.
With the L1 cache, each cache line in the 486 is 16 bytes (it's 32 in the Pentiums), so the 2K figure is pretty close to reality. With the VM pages, it's better to state reality -- it's how many 4K pages you need (vs. how many you can hold in RAM) that determines whether you thrash or not.
Another thing to keep in mind about byte writes -- if the hardware does not support byte selection during writes, at least one whole dword will need to be read in before writing. If that dword is cached, a whole 32-byte cache line in a Pentium must be filled (meaning eight dwords must be read!) I'm not really sure of the exact Pentium behavior.
To Ted:
Hmmm... It makes sence...
Though not clear enogh for me yet. :)
There always is something strange and unexplained
when testing Intel procs with an oscilograph.
To Ernie:
In case you're going to use tables in search in files
you need then 256 byte table with second part (index127-255)
filled with the same numbers as indexes values.
The Svin.