Hello Guys :)

I've just learned assembly and tried to write my own split function. It works fairly well, but I have a few questions. Is it possible to store the result in some kind of array?
I want to create a DLL that uses my split function written in assembly; it is just for fun and to learn. Maybe you could also give me some tips on what is wrong in my code. Sorry for my bad English :)


Here's the code:

.386
.model flat, stdcall

include WINDOWS.INC
include kernel32.inc
include user32.inc

includelib kernel32.lib
includelib user32.lib

.data
String db "Ds;Ist;nur;ein;test",0
Deli db ";"
Ergeb db 64 dup(0) ;buffer for one item (64 bytes is an assumed size; a single byte would be overrun)

.code

Start:

lea ebx, String ;load the address of the string into EBX

NewItem:

lea edi, Ergeb ;point EDI at the start of the result buffer
Item:

mov ah, [ebx] ;load the current character into AH
inc ebx ;advance EBX to the next character

cmp ah,0 ;is the character the zero terminator?
je ende ;then finish

cmp ah,3Bh ;is the character a semicolon?
je SaveItem ;then the current item is complete

mov [edi],ah ;store the character in the result buffer
inc edi ;advance the write position
jmp Item
SaveItem:
mov byte ptr [edi],0 ;terminate the item before showing it
INVOKE MessageBox,0,ADDR Ergeb,ADDR String,64
jmp NewItem
ende:
mov byte ptr [edi],0 ;terminate the last item
INVOKE MessageBox,0,ADDR Ergeb,ADDR String,64
INVOKE ExitProcess,0


End Start


Thanks

Greetings Morti
Posted on 2004-05-18 07:36:50 by Mortimer
Example code that saves tokens into an array:

http://board.flatassembler.net/viewtopic.php?t=1483&p=9257

Greets,
pelaillo
Posted on 2004-05-18 07:51:15 by pelaillo
Thanks for the quick answer, but now I have some questions:

What is the data type rd for? I've never read about it before, and it doesn't seem to work under MASM.
I also get errors at the cmp byte lines.

Thanks

Morti
Posted on 2004-05-18 08:40:26 by Mortimer
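That code is written for FASM; you need to convert it to MASM before using it. In FASM, rd reserves uninitialized dwords, and MASM wants byte ptr where FASM accepts plain byte. Roughly like this: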


.code
Start:
mov esi,offset TheString
mov ecx,offset TheLength
mov edi,offset TheWords
call Tokenizer
int3

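Tokenizer:
xor edx,edx ; EDX = number of word pointers stored so far
skipWhitespace:
cmp byte ptr[esi],' '
je skip
cmp byte ptr[esi],09h ; tab
je skip
cmp byte ptr[esi],0Dh ; carriage return
je skip
cmp byte ptr[esi],0Ah ; line feed
jne doneWhitespace
skip:
inc esi
loop skipWhitespace
doneWhitespace:
mov [edi+edx*4],esi ; store a pointer to the start of the word
inc edx
seek:
lodsb ; AL = next character, ESI advances
or al,al ; zero terminator?
je finish
cmp al,' '
je done
cmp al,09h
je done
cmp al,0Ah
je done
cmp al,0Dh
je done
cmp al,ah ; AH: extra delimiter, if the caller set one (assumption)
je done
loop seek
jmp finish
done:
mov byte ptr [esi-1],0 ; overwrite the separator to terminate the word
loop skipWhitespace
finish:
ret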

.data
TheWords dd 100h dup(?); Space reserved for 256 words
TheString db 'This is our string, the string we are going to use',0

Posted on 2004-05-18 08:45:55 by roticv
Thanks for the really quick reply. That's very cool :)
Now I have another question (I know, newbies are terrible *gg*):
What does the int3 stand for?


Morti
Posted on 2004-05-18 09:10:06 by Mortimer
OK, I just found out that it's for debugging. But I have a problem: I can't debug this source:

.386
.model flat, stdcall

.data
TheWords dd 100h dup(?); Space reserved for 256 words
TheString db 'This is our string, the string we are going to use',0
TheLength = $ - TheString

.code
Start:
mov esi,offset TheString
mov ecx,offset TheLength
mov edi,offset TheWords
call Tokenizer
int 3

Tokenizer:
xor edx,edx
skipWhitespace:
cmp byte ptr [esi],' ' ;just normal space
je skip
cmp byte ptr [esi],09h
je skip
cmp byte ptr [esi],0Dh
je skip
cmp byte ptr [esi],0Ah
jne doneWhitespace
skip:
inc esi ;go to the next character
loop skipWhitespace ;decrement ECX and jump back to skipWhitespace while ECX is not zero
doneWhitespace:
mov [edi+edx*4],esi
inc edx
seek:
lodsb
or al,al
je ende
cmp al,' '
je done
cmp al,09h
je done
cmp al,0Ah
je done
cmp al,0Dh
je done
cmp al,ah
je done
loop seek
jmp ende
done:
mov byte ptr [esi-1],0
loop skipWhitespace
ende:
ret

End Start

Thanks to all!

morti
Posted on 2004-05-18 09:15:53 by Mortimer
I guess the int3 is there so you can check whether the function is working correctly. It's no big deal. You can launch the code in a debugger (OllyDbg, for example; google it if you do not have it) and step through the code one instruction at a time.
Posted on 2004-05-18 09:46:20 by roticv
TheWords dd 100h dup(?); Space reserved for 256 words

That would reserve 256 DWORDS.

mov ecx,offset TheLength

This would put the memory address of TheLength into the ECX register. That is NOT what is required: ECX needs the actual value of TheLength, which is used as a counter. In MASM syntax, it should be:

mov ecx,TheLength
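
To illustrate why the value matters, here is a rough C model of a scan that is bounded by a counter, the way LOOP uses ECX. It is only a sketch: the function name bounded_scan is made up for this example, and it does not reproduce every detail of LOOP (for instance, what happens when ECX starts at zero).

```c
#include <stddef.h>

/*
 * Rough model of a counter-bounded scan: `count` plays the role of ECX,
 * so at most `count` bytes of `s` are examined. The scan also stops at
 * the zero terminator, like the "or al,al / je" check in the routine.
 * Returns the number of non-zero bytes that were stepped over.
 */
size_t bounded_scan(const char *s, size_t count) {
    size_t n = 0;
    while (count != 0 && s[n] != '\0') {
        n++;        /* advance to the next byte (ESI in the assembly) */
        count--;    /* one iteration used up (ECX in the assembly)    */
    }
    return n;
}
```

Called with the real length, the counter stops the scan at the end of the buffer even if the zero terminator is missing. Called with an address in count, the bound is just some large number, and only the zero byte ends the scan.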

Raymond
Posted on 2004-05-18 11:10:26 by Raymond
Originally posted by Raymond
That would reserve 256 DWORDS


This is because the English word *words* is overloaded. I meant words as in a sentence, not words as a data type.
That array is intended to store a dword pointer to each individual word in a given sentence.
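
In case the register-level version is hard to follow, here is a rough C sketch of the same idea: one pointer per word, with the separator after each word overwritten by a zero byte. It is not a line-by-line translation of the assembly in this thread; the names tokenize, is_space and MAX_WORDS are made up for this example, and unlike the assembly it checks the array bound and does not store an empty entry for trailing whitespace.

```c
#include <stddef.h>

#define MAX_WORDS 256   /* mirrors "TheWords dd 100h dup(?)" */

/* Hypothetical helper: true for the four whitespace bytes the routine tests. */
static int is_space(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Splits `text` in place. A pointer to the start of each word is stored
 * in `words`, and the separator that follows the word is overwritten
 * with '\0'. Returns the number of pointers stored (at most max_words).
 */
size_t tokenize(char *text, char *words[], size_t max_words) {
    size_t count = 0;
    char *p = text;

    while (count < max_words) {
        while (is_space(*p)) {
            p++;                    /* skip leading whitespace */
        }
        if (*p == '\0') {
            break;                  /* nothing left to split */
        }
        words[count++] = p;         /* remember where the word starts */
        while (*p != '\0' && !is_space(*p)) {
            p++;                    /* look for the end of the word */
        }
        if (*p == '\0') {
            break;                  /* last word ends at the terminator */
        }
        *p++ = '\0';                /* terminate the word in place */
    }
    return count;
}
```

After the call, words[0] .. words[count-1] each point at a zero-terminated word inside the original buffer, so no copying is needed.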

:alright:
Posted on 2004-05-18 14:46:02 by pelaillo