How do I add a byte to a string manually?

Buffer1 = 123

Buffer2 = 4

I want to make Buffer1 = 1234

Without calling catstring all the time. Sometimes I only have a byte or two to add to a string.

Thanks in advance
Posted on 2003-06-29 01:17:17 by cmax
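mov edi,OFFSET String1
invoke StrLen,OFFSET String1 ; eax = length of String1
add edi,eax ; edi now points at the null terminator
mov BYTE PTR [edi],"4" ; overwrite the terminator with the new byte
mov BYTE PTR [edi+1],0 ; write a new terminator after it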
Posted on 2003-06-29 01:19:36 by donkey
Thanks donkey

I was hoping that I could do this without calling anything. Something like

mov esi, ___
mov edi,___
inc
and make it work

or something like that.

But is this the only quick way? I'm trying to do some real assembler and get away from calling the API and other stuff when I don't need to for little things like this. I am long overdue.
Posted on 2003-06-29 01:38:03 by cmax
I wrote this as donkey posted (not trying to compete, I just typed slowly today lol),
but you don't have to use StrLen, assuming Buffer1 is null terminated.




; #################################################

.486
.model flat, stdcall
option casemap :none ; case sensitive

; #################################################

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
include \masm32\include\gdi32.inc

includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\gdi32.lib

main PROTO

.data
Buffer1 db "12345",0, 10 dup(0)
Buffer2 db "6789",0

; #################################################
.code

start:

call Cat
invoke ExitProcess,0

; #################################################

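Cat proc

mov esi, OFFSET Buffer1 ; destination: the string being extended
mov edi, OFFSET Buffer2 ; source: the bytes to append
@@:
mov al, BYTE PTR [esi] ; scan Buffer1 for its null terminator
inc esi

cmp al, 0
jne @B

dec esi ; step back so esi points at the terminator
@@:
mov al, BYTE PTR [edi] ; copy Buffer2 one byte at a time
inc edi

mov BYTE PTR [esi], al
inc esi

cmp al, 0 ; stop once the terminator has been copied too
jne @B

invoke MessageBox,0,OFFSET Buffer1,OFFSET Buffer1,MB_OK

ret
Cat endp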

; #################################################
end start



RobotBob
Posted on 2003-06-29 01:38:54 by RobotBob
I'd better tell you the whole story. I am redoing my app to the point where most strings are not zero terminated. So for those strings I am going to have to learn how to do everything manually.

I'm on another trip and I realize that I don't know the ASM behind this. I love calling, but I think I will learn more and catch up by doing things the hard way.

Either way, I'm in it now. I darn near stripped all of those 0s out and am ready to play the hard way.

So this means I have to build byte by byte and cannot test for zero, which is where the problem is.

Do I have a chance?
Posted on 2003-06-29 01:49:28 by cmax
Well, change the first cmp if it's terminated with something else. If there isn't an
ending char at all, you have to keep track of the length yourself (a strlen needs a terminator to scan for).
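
A sketch of that first change, assuming a hypothetical end marker "$" instead of the zero byte (TERMINATOR is a name made up for this example). Only the compare differs from the scan loop in the listing above, and Buffer1 must really contain the marker or the scan runs off the end of the buffer:

```asm
TERMINATOR equ "$"          ; hypothetical end marker, any byte that never occurs in the data

mov esi, OFFSET Buffer1
@@:
mov al, BYTE PTR [esi]      ; read the next byte of Buffer1
inc esi
cmp al, TERMINATOR          ; was: cmp al, 0
jne @B
dec esi                     ; esi now points at the marker, ready to be overwritten
```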

Yeah, you can invoke APIs or libs so much that it's almost HLL-like.
Sometimes the convenience of *magic code* in libs leads us to forget
what it's really doing.
You'll find, however, that 'the assembler' way is easy. Not to mention *powerful*.

I am no master, but doing this sort of parsing in C slowly makes the code unreadable.
Asm is such a joy.

RobotBob
Posted on 2003-06-29 01:58:24 by RobotBob
RobotBob

If I got you right, you gave me a good idea.

If I know what is in the next buffer I can use that byte in a cmp and stop on it.... WoW...

So StrLen is a sure thing also.

Your code looks good too. I'm going to learn to use Olly because of these ideas.... Something nice and easy to work with.

Why do we have to call strlen for nearly everything? You've got your string. What does strlen do with it? Is it really designed for long strings or what? I always wonder about this.


PS: Those guys next door are crazy... Talking about red hats :) :) It's so funny how they put things.



Thanks Guys
Posted on 2003-06-29 02:20:14 by cmax

If I know what is in the next buffer I can use that byte in a cmp and stop on it.... WoW...


Yeah, you can have smiley-terminated strings if you want to :)

cmp ax, ":)"

Yeah, if you have prior knowledge of the string's length or what it contains, then a strlen isn't needed.
I think the reason is that most code is made generic for reuse. Make no assumptions and the string functions are reusable. Generic and *modular* also slows things down a bit, especially if you only want to add a byte to a
5-byte string.

RobotBob
Posted on 2003-06-29 02:29:47 by RobotBob
StrLen returns the length of the string. In order to know where to place the byte I added the length of the string to its offset; that gave me the offset of the null terminator, which I overwrote. StrLen is always being used because the MASM32 version is fairly fast and it's tough to write to the end of a buffer without knowing how long it is.

If you are using strings that are in the .data section you don't want to append anything to them unless you reserved spare room after them. Otherwise you will be overwriting the start of the next string. You should really consider using NULL-terminated strings; by opting out you might have saved a few bytes but you have barred yourself from the whole API.
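
To picture the overwrite, here is a small data layout sketch. The names String2 and Buffer1 and the 12 spare bytes are only my assumptions for illustration:

```asm
.data
String1 db "123",0          ; no spare room: the very next byte belongs to String2
String2 db "abc",0          ; appending "4",0 to String1 would put the 0 over this "a"

Buffer1 db "123",0          ; same text...
        db 12 dup(0)        ; ...followed by 12 spare bytes it can grow into
```

With the first layout, the new byte lands on String1's old terminator and the new terminator lands on the first byte of String2, which then reads as an empty string.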
Posted on 2003-06-29 02:30:04 by donkey
donkey
ASS-embler

I never thought a mule could be so smart.... (could not wait to say this)

donkey, suppose you know the length of the string, is there something special you can do to bypass calling strlen? Also, I never use the .data section, and I normally use another buffer to do what I need to do; soon I'll just use the registers. Also, I keep buffers big enough to take in all of a cat with no room to spare.

This is just a trip I am on, just to see what I end up with and to get deeper into real assembler like most of you guys. I am not anti-API, I just want to feel every byte I type.

I use Jens' StrLen (the little one with the double J) and I don't plan to give that up too soon now that I understand more about strlen.

Thanks
Posted on 2003-06-29 02:58:11 by cmax
In my example just add edi,Length and remove the call to StrLen completely.

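mov edi,OFFSET String1
add edi,Length ; Length = the string length, a number you supply
mov BYTE PTR [edi],"4" ; overwrite the terminator with the new byte
mov BYTE PTR [edi+1],0 ; write a new terminator after it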
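
If the length changes at run time, one option is to keep it in a variable next to the buffer so no scan is ever needed. This is only a sketch under my own assumptions: Buffer1Len and AppendByte are made-up names, and the 16-byte buffer size is arbitrary.

```asm
.data
Buffer1    db 16 dup(0)     ; hypothetical buffer: 15 characters plus a terminator
Buffer1Len dd 0             ; hypothetical variable: bytes currently in use

.code
; Append the byte in al to Buffer1. No scan, no API call.
AppendByte proc
    mov edx, Buffer1Len
    cmp edx, SIZEOF Buffer1 - 1         ; keep one byte free for the terminator
    jae full                            ; buffer is full, drop the byte
    mov BYTE PTR Buffer1[edx], al       ; store the new byte at the current end
    mov BYTE PTR Buffer1[edx+1], 0      ; optional: keep the string null terminated
    inc edx
    mov Buffer1Len, edx                 ; remember the new length
full:
    ret
AppendByte endp
```

Used as `mov al,"4"` followed by `call AppendByte`. For strings that are not zero terminated, the second store can simply be left out.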

The MASM32 version of StrLen is very fast compared to a byte scanner. It works on DWORDs and chomps through the string 4 bytes at a time, about 1.5 clocks per character when I measured it.
Posted on 2003-06-29 03:03:01 by donkey
WoW . ..
Posted on 2003-06-29 03:10:02 by cmax
Is it faster, or is there very little difference on small strings compared to an ultra-simple
byte scanner? On average, I never cat anything bigger than 100-500 bytes.
I only think of this because of the optimization rule: smaller != faster && faster != smaller :)

Or put another way, you can't get both small and fast. Or simple and fast lol.

RobotBob
Posted on 2003-06-29 03:13:30 by RobotBob
Hi RobotBob,

It is mostly useful for strings aligned at 16 bytes, and generally you notice the optimization at around 64 characters. Below that it is difficult to justify the complexity of Agner's StrLen optimization. For strings under 10 characters I use a byte scanner. I have never tested it on strings under 64 characters, so I'm not sure if any time would be saved. This is the GoAsm implementation:
StrLenA FRAME item

uses ebx,edx,ecx
; -------------------------------------------------------------
; This procedure has been adapted from an algorithm written by
; Agner Fog.
; -------------------------------------------------------------

mov eax,[item] ; get pointer to string
lea edx,[eax+3] ; pointer+3 used in the end
:
mov ebx,[eax] ; read first 4 bytes
add eax,4 ; increment pointer
lea ecx,[ebx-01010101h] ; subtract 1 from each byte
not ebx ; invert all bytes
and ecx,ebx ; and these two
and ecx,80808080h
jz < ; no zero bytes, continue loop
test ecx,00008080h ; test first two bytes
jnz >
shr ecx,16 ; not in the first 2 bytes
add eax,2
:
shl cl,1 ; use carry flag to avoid branch
sbb eax,edx ; compute length

ret
StrLenA ENDF
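
To see why the subtract/invert/and sequence flags a zero byte, here is one pass worked through by hand. The input value is my own example, the bytes "A","B","C",0 as they sit in memory (little-endian); the register use follows the listing above.

```asm
; Worked example of the zero-byte test on a single dword.
; Assumed input: "A","B","C",0 loaded as the dword 00434241h.
mov ebx, 00434241h
lea ecx, [ebx-01010101h]    ; ecx = 0FF424140h, only the zero byte wrapped around to 0FFh
not ebx                     ; ebx = 0FFBCBDBEh
and ecx, ebx                ; ecx = 0FF000100h
and ecx, 80808080h          ; ecx = 80000000h, nonzero: a zero byte was found (byte 3)
```

With 44434241h ("ABCD", no zero byte) the same steps leave ecx at 0, so the loop goes around again. In the case above the final shl/sbb pair then yields a length of 3.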
Posted on 2003-06-29 03:19:52 by donkey
I used to use that all the time. Then, I don't know what really happened, but it did something wrong one day. I blamed my code, but Jens' len did the job.

OK

Then Jens' len could not do the job on a certain string in a hell of a function I wrote, and Agner's len did the job with no problem, so I live with both of them. :)

I told hutch about it in a post back then.

Bottom line, I would recommend that coders keep both of them around, because you never know what you can come up with and get suckered by Windows, MASM or who really knows what. It got me many times until I got hip to fighting with Windows or whatever it was.
It now doesn't stand a chance... The OS itself watches Win32 .asm files and will F**k you up if it sees something it doesn't like... or maybe it was just for programmers like me. Just always back up big time and never let it destroy you.... What am I talking about......... anyway


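StrLen_2 proc Source:DWORD

mov ecx, Source

@@:
mov eax, dword ptr [ecx] ; read 4 bytes
add ecx, 4

lea edx, [eax-01010101h] ; subtract 1 from each byte
xor eax, edx
and eax, 80808080h
jz @B
and eax, edx
jz @B

bsf edx, eax ; bit index of the first zero byte's flag

sub edx, 4
shr edx, 3 ; bit index -> byte index (0..3)

lea eax, [ecx+edx-4] ; address of the terminator
sub eax, Source ; length

RET

StrLen_2 endp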


Clock this when you get time... Back then they claimed it was double the speed of Agner's (but so what, don't ever give it up). I've witnessed that they both have their place.

Donkey, your final code has got to be faster than them all because there is no call time included and it is only working on a byte or two. Or does that not really matter, or does LENGTH take up all the clocks?

Hutch once said small code doesn't mean fast code... I hope this is not the case here.
Posted on 2003-06-29 03:51:31 by cmax
LENGTH is a number that you supply; you asked for an example where you knew the length of the string at compile time. Otherwise you have to find the string length somehow.
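
If you'd rather not count the characters by hand, the assembler can compute that number for you. A minimal sketch, assuming MASM and a made-up constant name String1Len (I avoid the name LENGTH itself, since MASM reserves it as an operator):

```asm
.data
String1    db "123"
String1Len equ $ - String1      ; 3, worked out by the assembler, costs nothing at run time
           db 0, 12 dup(0)      ; terminator plus (assumed) spare room for appending

.code
    mov BYTE PTR String1[String1Len], "4"     ; overwrite the terminator
    mov BYTE PTR String1[String1Len+1], 0     ; write the new terminator
```

The equ has to sit right after the text and before the padding, because $ is the current location at that point.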
Posted on 2003-06-29 03:59:00 by donkey
I bet RobotBob's is pure lightning; it might even beat Jens' on small 5-500 byte strings, and that's all I really do.

Anyway, I will be using both of them seriously.

Thanks Guys
Posted on 2003-06-29 04:36:02 by cmax
Here is a non-optimized strlen I use in spasm, for small strings.



Cat proc
;Strlen
mov edi, OFFSET Buffer1
mov ecx, -1
mov al, 0
repne scasb
mov eax, -2
sub eax, ecx

; Add length to esi to move to the end
mov esi, OFFSET Buffer1
add esi, eax

mov edi, OFFSET Buffer2
@@:
mov al, BYTE PTR [edi]
inc edi

mov BYTE PTR [esi], al
inc esi

cmp al, 0
jne @B

invoke MessageBox,0,OFFSET Buffer1,OFFSET Buffer1,MB_OK
ret
Cat endp


Funny thing about non-opt assembler, I used to think I had optimized C code LOL.

RobotBob
Posted on 2003-06-29 04:59:13 by RobotBob
RobotBob, I'm going to try that out, but I think donkey's code is what I need right now because there is no call and only a few lines to do it with. I think it would be the best thing for adding only 1 to 3 bytes, if it can be done from two offset byte or dword buffers.


It works as long as "AB" is there, but my text is in a 2-byte buffer and I tried to add it to the text in my 14-byte buffer and I got 123456789ABC with two bytes of scribble scrabble.... I should have got 123456789ABCDE... Can it be done from an offset buffer...

mov edi,offset Buffer1
add edi,12
mov [edi],WORD PTR offset Buffer2 ; "DE" ; ebx
mov [edi+2], WORD PTR 0

I tried a few other ways over and over again, changing things as I went, but had no success.
Posted on 2003-06-29 14:45:00 by cmax
Well, I was just showing the complexity difference in strlens.

The fast yet larger masmlib one:



push ebx
mov eax,item ; get pointer to string
lea edx,[eax+3] ; pointer+3 used in the end
@@:
mov ebx,[eax] ; read first 4 bytes
add eax,4 ; increment pointer
lea ecx,[ebx-01010101h] ; subtract 1 from each byte
not ebx ; invert all bytes
and ecx,ebx ; and these two
and ecx,80808080h
jz @B ; no zero bytes, continue loop
test ecx,00008080h ; test first two bytes
jnz @F
shr ecx,16 ; not in the first 2 bytes
add eax,2
@@:
shl cl,1 ; use carry flag to avoid branch
sbb eax,edx ; compute length
pop ebx

ret


Or the small but slower one, easy for a beginner to understand:



;Strlen
mov edi, OFFSET Buffer1
mov ecx, -1
mov al, 0
repne scasb
mov eax, -2
sub eax, ecx



If you already have prior knowledge of string length then:

add edi,12

is the way to go.
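
On the two-byte case from the post before: `offset Buffer2` is the address of the buffer, not what is stored in it, so that store writes part of an address into the string (the scribble). x86 also has no memory-to-memory mov, so the bytes have to pass through a register. A rough sketch, where the buffer contents and the spare room are my assumptions:

```asm
.data
Buffer1 db "123456789ABC",0
        db 3 dup(0)              ; assumed spare room so the append fits
Buffer2 db "DE"                  ; the two bytes to add, not zero terminated

.code
    mov edi, OFFSET Buffer1
    add edi, 12                  ; known length of the text already in Buffer1
    mov ax, WORD PTR Buffer2     ; load the CONTENTS of Buffer2 ("D","E")
    mov WORD PTR [edi], ax       ; store both bytes, starting on the old terminator
    mov BYTE PTR [edi+2], 0      ; re-terminate (leave out for unterminated strings)
```

For a single byte use al and BYTE PTR the same way; for three bytes, one word move plus one byte move does it.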

Good Luck,

RobotBob
Posted on 2003-06-30 00:26:07 by RobotBob