Hello to all of you,

I know this sounds pretty stupid, but I've been working on a string-splitting algorithm for a week and I still haven't got it right.
It's something like this: I have a string, e.g. 1ab#6m|gru*a1d5, and I want to split it into 4 strings (1ab, 6m, gru, a1d5) at the non-alphanumeric chars.
If anybody has anything that could help (source code, ideas, links, snippets or anything else useful), please share it with me.

Posted on 2002-04-02 14:30:08 by ViperV`
Use the masmlib functions InString, szMid, szLeft, and szRight. Those will get you what you need in about 8 lines of code, give or take a few :)
Posted on 2002-04-02 14:54:57 by Asm_Freak
Scan through the whole string, examining each byte. If it's a non-alphanumeric character, replace it with a zero byte. When you've done the whole string you will have a series of null-terminated strings containing the alphanumeric parts.
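
A minimal sketch of this scan, written in C rather than assembly just to show the logic. The function name zero_delimiters is made up for illustration, and it relies on the C library's isalnum to decide what counts as alphanumeric. It returns the original length, because strlen no longer sees the whole buffer once the zeros are in place.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Overwrite every non-alphanumeric byte of buf with 0.
   Returns the original length so the caller still knows where the
   buffer ends after the embedded zeros have been written. */
size_t zero_delimiters(char *buf)
{
    size_t len = strlen(buf);          /* measure BEFORE modifying */
    for (size_t i = 0; i < len; i++) {
        if (!isalnum((unsigned char)buf[i]))
            buf[i] = '\0';
    }
    return len;
}
```

For the string from the first post, the pieces would then start at offsets 0, 4, 7 and 11 of the buffer.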


Posted on 2002-04-02 14:58:21 by Thomas
Just curious,
Thomas, how do you know when your string ends? In other words,

Assuming you use a buffer of undetermined length...


What would you do to find the end of the buffer now?
you can't do

mov al, byte ptr ; char. in buffer
cmp al, 0

or can you??


ps. this is an extension question (not sure it helps him, but I would like to know)
Posted on 2002-04-02 15:27:48 by Sliver
Afternoon, Sliver..
You'd just check for two "0"s in a row.
i.e. a double 0(zero) indicates the end of the string.

Posted on 2002-04-02 17:45:56 by Scronty
Thanks Scronty, but what happens if the data is arranged such that two or more zeros come before a piece of data:




ps. Thanks for being patient with this line of questions :)
Posted on 2002-04-02 20:22:00 by Sliver
Some problems are solved by using more than one step. For the general case (allowing two or more non-alphanumerics in a row), get the original string length (or alternatively, the address of the end of the string) and store it before parsing.
Posted on 2002-04-02 20:29:39 by tenkey
There was recently another thread that may be of some use to you as well; its discussion relates to your question.

More Parsing Help...

Hope it helps..
Posted on 2002-04-03 00:53:14 by NaN
I've only read the first 3 replies so far. How about reading 4 or more different strings into the same buffer (of unknown length) (SetEndOfBuff) and THEN doing with it what ViperV wants to do?

I think a solution to this will lead to the answer for all other forms.
Posted on 2002-04-03 01:30:27 by cmax
Thomas's solution seems to be the elegant one: replace each indicator character with a zero and append a zero after the terminating zero. It's a single-scan proc that should be reasonably fast.


Posted on 2002-04-03 01:30:33 by hutch--
When you scan the bytes on the strings remember:

ASCII characters 0 - 9 run from 48 - 57
ASCII characters A - Z run from 65 - 90
ASCII characters a - z run from 97 - 122

As you can see there is a range of values.... I'm sure you got the idea ;)
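
For what it's worth, the range test could be sketched like this in C (the name is_alnum_ascii is hypothetical and the check assumes plain ASCII input). In assembly the same thing would be a handful of cmp/jb/ja pairs.

```c
/* Returns 1 if c falls in one of the three ASCII ranges listed above,
   0 otherwise. */
int is_alnum_ascii(unsigned char c)
{
    return (c >= 48 && c <= 57)    /* '0'..'9' */
        || (c >= 65 && c <= 90)    /* 'A'..'Z' */
        || (c >= 97 && c <= 122);  /* 'a'..'z' */
}
```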
Posted on 2002-04-03 01:40:19 by stryker
I'm curious about what Thomas has posted. For instance, if I do what Thomas suggests and place a zero at each non-alphanumeric char, how would I access each of these: "ab3",0,"admfe",0,"kad23",0 after I'm done? So if I wanted to display them in a message box, would I need to do further string manipulation, or can I access them similar to an array?
Posted on 2002-04-03 04:47:57 by smurf
I just showed the basic idea, not a foolproof solution. If you store the length before you start processing, you can read the full buffer correctly afterwards, regardless of the data in the buffer. Just skip the zero-length strings.
If you want to display them in a message box, first display one string whose address is in, say, ebx. After you've displayed it, increase ebx until you find a 0 byte. Then scan further until you find the first non-null character after the 0 byte(s). Display the string at this pointer and repeat.
You can use the string length you stored to determine whether you're at the end of the strings.
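
Roughly, in C instead of assembly (collect_pieces and its parameters are names invented for this sketch): it assumes the delimiters were already replaced by 0 bytes, that len is the length stored before parsing, and that the buffer's original terminating 0 is still at buf[len] so the last piece is a valid string too.

```c
#include <stddef.h>

/* Walk a buffer of len bytes in which delimiters were already replaced
   by 0 bytes. Stores a pointer to the start of each non-empty piece in
   out (at most max entries) and returns how many were found. */
size_t collect_pieces(char *buf, size_t len, char **out, size_t max)
{
    size_t count = 0;
    size_t i = 0;
    while (i < len && count < max) {
        while (i < len && buf[i] == '\0')   /* skip runs of zero bytes */
            i++;
        if (i == len)
            break;                          /* only zeros were left */
        out[count++] = &buf[i];             /* start of a piece */
        while (i < len && buf[i] != '\0')   /* advance to its end */
            i++;
    }
    return count;
}
```

Each pointer in out can then be handed to whatever displays the string, much like indexing into an array of strings.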

Posted on 2002-04-03 06:44:50 by Thomas
Ingenious methods.
Thanks for sharing.
Posted on 2002-04-03 14:28:39 by ViperV`
re: termination point:
you could have some string like this:

"blahblah",255,"blah blah",255,"more blah",0

and whenever a string is displayed, temporarily change the 255 byte into a 0 and then back again, since 255 would never otherwise appear as an alphanumeric char.
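
A rough C sketch of that swap, only to illustrate the idea. The names SEP, begin_piece and end_piece are made up here, and it assumes byte 255 never occurs in the text itself.

```c
#include <stddef.h>

#define SEP ((char)255)   /* separator byte, assumed absent from the text */

/* Temporarily terminate the piece that begins at start.
   Returns the address of the separator that was overwritten with 0,
   or NULL if this was the last piece (already 0-terminated). */
char *begin_piece(char *start)
{
    char *p = start;
    while (*p != '\0' && *p != SEP)
        p++;
    if (*p == '\0')
        return NULL;
    *p = '\0';
    return p;
}

/* Put the separator back once the piece has been displayed. */
void end_piece(char *sep)
{
    if (sep != NULL)
        *sep = SEP;
}
```

You'd call begin_piece, display the string at start, call end_piece, and continue from the byte after the returned address until begin_piece returns NULL.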
Posted on 2002-04-03 17:18:53 by jademtech

Of course, if you have a choice and your uses are very simple,
using one char is okay, but often things are more complex.
Posted on 2002-04-03 17:27:00 by bitRAKE