Since MASM cannot tell in certain situations whether a string is zero-terminated, can we do something so that a string is not read beyond its real characters? Example:

MyBuffer db 128............

When the line below is read into MyBuffer, all 128 bytes are read, even though, as you can see, there are only 19 bytes in the string. My MASM buffer doesn't know that, so it writes blank spaces, or walks over into the next string and takes part of that in if it is within the 128-byte range, and adds it to make up a full 128-byte string. That finally makes sense to me...

D:\My Folder\MyFile____________all blanks ________ C:\W

This is a problem that many have seen, but what kind of routine of our own can we write to get all of the real characters out of the string and leave the junk (spaces) behind, then put the result in another buffer or whatever with the proper number of characters?

D:\My Folder\MyFile

The only thing I can think of is to read the string in backwards and force MASM to find a legal character and begin from there, or to read only real characters, counting them in ecx, and cut off at any blank space...

I tried INSTRING from m32.lib but I can't get it to work right. I have my own routines for other things, but I don't know the best way to approach this problem. I am not talking about MASM alignment...

PLEASeee Help
Posted on 2002-03-22 23:14:30 by cmax
I don't know if this is what you want, but it searches backwards from the end for a non-space, then copies the string up to there, appending a null terminator. It can easily be changed to search for different patterns. (I don't have a charmap on me to find ":" or "\" :p)


.radix 16                       ;numeric literals below are hex
lea ebx,StringIn                ;location of StringIn (MAXSIZE bytes, space-padded)
lea eax,StringOut               ;writes the null-terminated string here (MAXSIZE+1 bytes)
mov ecx,MAXSIZE                 ;max length of the string

NextChar:
test ecx,ecx
jz NoString                     ;kills loop if the beginning of the string is reached
;Change here to make a more complex comparison routine
cmp byte ptr [ebx+ecx-1],20     ;reads a char, from end, and checks for space
jne CopyString
dec ecx
jmp NextChar

CopyString:
mov byte ptr [eax+ecx],0        ;append a zero after the last real character

WriteNextChar:
mov dl,[ebx+ecx-1]              ;start copying the string - read
mov [eax+ecx-1],dl              ;write
dec ecx
jne WriteNextChar               ;loop until the first character is copied
jmp Done

NoString:
mov byte ptr [eax],0            ;nothing but spaces: store an empty string
Done:


hope this helps... (haven't tested it, either, but i commented it)
Posted on 2002-03-22 23:40:55 by jademtech
:) It's a learning lesson for sure. Don't get too frustrated; the experience will give you good knowledge for similar problems.

(( First I want to say that the INSTRING version released in the previous MASM32 package (v6??) had a bug in it (do a search if you're curious; I know it's on the board somewhere). However, I believe it is corrected in the new release, v7, which Hutch currently offers on his web site. ))

Next... To hack out your problem, you have to ask yourself what *you* consider legal / illegal for your parser. You say spaces are out, OK. But what else? There are a lot of chars to choose from. So a better approach might be to ask which chars you want to allow.

The ASCII table has A-Z in upper case from 0x41 -> 0x5A, and lower case a-z from 0x61 -> 0x7A. There are also the digits from 0x30 to 0x39 (0 -> 9 respectively). These are the basics and can be isolated in a character parsing function quite easily. But what else? There is '/' or '\' if you're dealing with file/web addresses, as well as ':', all of which have a unique code in the ASCII table (0x2F, 0x5C and 0x3A).
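The range test described above can be sketched as a small routine. This is only an illustration: the name IsAlnum and its calling convention (char in al, result in eax) are made up for this example and are not part of the MASM32 library.

```masm
.386
.model flat, stdcall
option casemap:none

.code
; IsAlnum (hypothetical helper): al = character to test.
; Returns eax = 1 for 'A'-'Z', 'a'-'z' or '0'-'9', otherwise eax = 0.
IsAlnum proc
    mov  ah, al
    sub  ah, '0'
    cmp  ah, 9              ; unsigned compare: true only for 30h..39h
    jbe  is_valid
    mov  ah, al
    or   ah, 20h            ; fold upper case onto lower case
    sub  ah, 'a'
    cmp  ah, 25             ; unsigned compare: true only for letters
    jbe  is_valid
    xor  eax, eax           ; anything else is rejected
    ret
is_valid:
    mov  eax, 1
    ret
IsAlnum endp
end
```

Subtracting the low end of a range and doing one unsigned compare covers both bounds at once, since anything below the range wraps around to a large value. Characters such as ':' or '\' would still need their own checks.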

If you have *a lot* of isolated characters you want to pass through, unlike a-z or 0-9 (which are confined to a range), you could use the xlat instruction and have it filter through a look-up table:
Usage: XLAT translation-table
XLATB (masm 5.x)
Modifies flags: None


Replaces the byte in AL with the byte from a user table addressed by
BX (EBX in 32-bit code). The original value of AL is the index into the translate table.



Like so:


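push ebx ; Be safe ~ for Windows anyway (ebx must be preserved)
lea edx, RawTextBuffer
lea ebx, FilterTable ; xlat reads its table through ebx, so load it once
; NOTE: the loop only stops at a char the table marks 0,
; so the buffer must contain at least one such char
@@:
mov al, [edx] ; Parse this char for validity
xlat ; al = FilterTable[al]
cmp al, 0
je @F ; table entry is 0: char is not valid
inc edx ; char is valid
jmp @B
@@:
mov BYTE PTR [edx], 0 ; Add the NULL at first non-valid char
mov eax, edx
lea edx, RawTextBuffer ; Get start addr again..
sub eax, edx ; Eax = Fin - Start = Length of valid buffer
pop ebx ; Keep Windows working... :)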


Where the table would be in your " .data " section, containing 1's or 0's for valid chars or not. The table will be 256 bytes long; wherever you want a char to 'pass', place a 1, and if not, place a zero.

Space is 0x20, or ASCII char value 32. Thus placing a '0' in the table's entry 32 (counting from 0) will ensure that a space is considered an "illegal" value. Likewise, placing a '1' at 0x5c, table entry 92, will allow the " \ " char to be seen as valid.
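For illustration, one possible layout of such a table is sketched here. The choice of valid characters (letters, digits, ':' and '\') is only an assumption made for this example, and the counts assume the default decimal radix.

```masm
.data
; FilterTable: 1 = valid, 0 = not valid, indexed by ASCII code.
; Entry 0 stays 0 so the scan also stops at an existing terminator.
FilterTable db 48 dup (0)       ; 00h-2Fh: control chars, space (20h), punctuation
            db 10 dup (1)       ; 30h-39h: '0'-'9'
            db 1                ; 3Ah: ':'
            db 6 dup (0)        ; 3Bh-40h
            db 26 dup (1)       ; 41h-5Ah: 'A'-'Z'
            db 0                ; 5Bh: '['
            db 1                ; 5Ch: '\'
            db 4 dup (0)        ; 5Dh-60h
            db 26 dup (1)       ; 61h-7Ah: 'a'-'z'
            db 133 dup (0)      ; 7Bh-FFh
```

The counts add up to 256 entries. Note that with the space marked invalid, a path like "D:\My Folder\MyFile" would be cut after "D:\My", so a table-driven scan alone cannot tell an embedded space from trailing padding.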

This method is *slow* but easy to do. If it's not the core of a complicated engine, you won't even notice the 'lag'.

Another way is to pre-filter for the standard ranges (0x41->0x5A, 0x61->0x7A, 0x30->0x39) with simple math on the char, and if it's not in one of these, pass the char through the above algo as one last check. (You can skip placing 1's in the already filtered areas, for obvious reasons.) This will at least keep you from relying *solely* on xlat when math can be a faster solution.


Again, it really matters what *you* want to achieve, specifically. All this could be overkill if all you want to do is look for the first space (0x20) and call it quits.

Hope this is some help to you?
:alright:
NaN
Posted on 2002-03-22 23:54:06 by NaN
To determine whether a string is NULL-terminated or not, I don't think there is a single solution. It depends on the data itself: we cannot assume the blank spaces are junk; what if they are actually part of the string? If we are talking about a string that contains a path and directories on a hard drive, you can try recursing the hard drive and finding a string match.

My best bet is to build a pattern failure. What I mean is, you recurse all directories and files and check for the closest possible match to your string. Don't assume anything is a match, and don't stop comparing till the end.

E.G.

Legend: | means a match, 0 means a mismatch
Valid Paths:
c:\cmax\x__files_
c:\cmax\x__files__
c:\cmax\x__files__x

String Data: c:\cmax\x__files__xgf


Closest Match 1:

c:\cmax\x__files_
|||||||||||||||||0
c:\cmax\x__files__xgf

Mismatch at position 17 (Save for Null termination later...)

Closest match 2:
c:\cmax\x__files__
||||||||||||||||||0
c:\cmax\x__files__xgf

Our new closest match...
Mismatch at position 18 (Save for Null termination later...)

Closest match 3:
c:\cmax\x__files__x
|||||||||||||||||||0
c:\cmax\x__files__xgf

Our new closest match...
Mismatch at position 19 (Save for Null termination later...)

After recursing is finished, append a 0 at the last saved position, which here is position 19 of our string.

Our final string: c:\cmax\x__files__x
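The per-candidate comparison could look something like this sketch. MatchLen is a made-up name for this example, and it assumes the data buffer is at least as long as the candidate path.

```masm
.386
.model flat, stdcall
option casemap:none

.code
; MatchLen (hypothetical helper): compares a zero-terminated candidate
; path against the raw string data.
; Returns eax = number of leading bytes that match, which is also the
; position of the first mismatch.
MatchLen proc uses esi edi pCandidate:DWORD, pData:DWORD
    mov  esi, pCandidate
    mov  edi, pData
    xor  eax, eax               ; eax = current position
next_char:
    mov  dl, [esi+eax]
    test dl, dl
    jz   finished               ; end of candidate: all of it matched
    cmp  dl, [edi+eax]
    jne  finished               ; first mismatch
    inc  eax
    jmp  next_char
finished:
    ret
MatchLen endp
end
```

The caller would run this for every path found while recursing, keep the largest result, and store a zero at that offset in the string data. For the three valid paths in the example the results would be 17, 18 and 19.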

Then again, we really don't know how to determine if a string is null-terminated or not. It depends on you and on what you actually want your string data to contain. There are a lot of ways to decide what is junk: we can say blank spaces are junk, we can say 'a' is junk...
Posted on 2002-03-23 00:06:17 by stryker
I found a way to do it by putting a mark (\) at the end, then finding the mark and cutting from there. I found it in a recent post about strings.

The thing was that when you look at a string in a message box it looks perfect, and it will work if used throughout the program... but when you write it to disk it may sometimes have extra characters at the end of the string, depending on how buffers are aligned or something.
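One possible cause, which the thread does not confirm, is that the write call is given the size of the buffer instead of the length of the string. A message box stops at the first zero, but a file write puts out exactly as many bytes as it is told to. A minimal sketch, assuming MASM32 is installed in its default \masm32 folder; WriteString and BytesWritten are names made up for this example:

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data?
BytesWritten dd ?

.code
; WriteString (hypothetical helper): writes only the characters of a
; zero-terminated string to an already opened file handle.
WriteString proc hFile:DWORD, pString:DWORD
    invoke lstrlen, pString             ; eax = length without the terminator
    invoke WriteFile, hFile, pString, eax, addr BytesWritten, NULL
    ret
WriteString endp
end
```

The return value is whatever WriteFile returns, so nonzero means success.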

But anyway, with this new information about strings in this post, it should end my quest to have a good understanding of what can be done with strings. This is some serious information for me to look into.

Looks deeper than i thought

Thanks Again
Posted on 2002-03-24 05:05:56 by cmax
For my two cents, your problem is made harder, actually caused, not by the string data, but by your buffer:



MyBuffer db 128


It would be much more flexible if you define it like so:



pMyBuffer DWORD ?


Just a pointer location, not a fixed length. This way, *ANY* string you encounter at run time can be accommodated by simply getting the string length, adding one, allocating that much memory, and storing the pointer back into pMyBuffer.
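Those steps could be sketched like this. It assumes MASM32 in its default \masm32 folder and a source string that is already zero-terminated; StoreString is a made-up name, and GlobalAlloc is just one of several allocators that would do here.

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data?
pMyBuffer DWORD ?

.code
; StoreString (hypothetical helper): copies a zero-terminated string into
; freshly allocated memory and saves the pointer in pMyBuffer.
; Returns eax = pointer to the copy, or 0 if the allocation failed.
StoreString proc pSource:DWORD
    invoke lstrlen, pSource             ; eax = string length
    inc  eax                            ; plus one for the terminating zero
    invoke GlobalAlloc, GMEM_FIXED, eax
    mov  pMyBuffer, eax
    test eax, eax
    jz   @F                             ; allocation failed
    invoke lstrcpy, eax, pSource        ; returns the destination pointer
@@:
    ret
StoreString endp
end
```

When the string is no longer needed, the memory would be released with GlobalFree.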
Posted on 2002-03-24 08:30:04 by Ernie
Ernie,

You hit the nail right on the head. It is a buffer problem. But up until now it kept me very confused, so I kept asking the same question over and over again in different situations, thinking it was something new... Now I know... and with stronger knowledge about strings to boot....
Posted on 2002-03-24 11:50:52 by cmax