Hi everyone.

Can someone please tell me whether it is faster to code my own conversion/string routines than to use wsprintfA?
For example, let's say I want to join two strings together and put the final result in another, empty buffer:

String1 db "First string",0
String2 db "Second String",0
Buffer db 250 dup (?)

Would it be faster to use a procedure of my own (using lodsw/stosw), or should I just use wsprintf with "%s %s"?

The same goes for converting from hex to string: is wsprintf slower than a procedure that I code myself?

I'm coding a program that does a lot of string manipulation and conversion (hex -> string), and speed is very important. What do you suggest: wsprintf or my own procedures?

Sorry for my English, I hope you got what I mean.

Posted on 2003-09-28 07:36:17 by CuTedEvil
I had the same question, so I wrote my own routine. It only implements %s and %d for now.
You can use it for reference, or maybe help optimize it.

mprintf proc c uses esi edi ebx lpOut:DWORD,lpFmt:DWORD,param:VARARG
LOCAL buf[16]:byte
LOCAL dwtmp:dword

mov esi,lpFmt
mov edi,lpOut
xor ebx,ebx
.while 1
.if al=="%"
.if al=="s"
mov eax,param[ebx]
.elseif al=="d"
m2m dwtmp,param[ebx] ;m2m takes the destination first: copy the current argument into dwtmp
invoke udwtoa,dwtmp,addr buf
lea eax,buf
.elseif al=="r"
mov eax,param[ebx]
add ebx,4
mov ah,"%"

add ebx,4 ;Move param pointer

push esi
mov esi,eax
.while 1
.if al==0

pop esi
.if al==0
mprintf endp

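udwtoa proc uses edx esi edi dwValue:DWORD, lpBuffer:DWORD
LOCAL buf[10]:BYTE ;up to 10 decimal digits, stored in reverse order

    lea esi,buf
    mov eax,dwValue
    mov edi,10
    .if eax<10
        or al,30H ;single digit: convert to ASCII directly
        mov [esi],al
    .else
        .while eax!=0
            xor edx,edx
            div edi ;eax = quotient, edx = remainder (the next digit)
            or dl,30H
            mov [esi],dl
            inc esi
        .endw
        dec esi ;step back to the most significant digit
    .endif
    lea ecx,buf
    mov edi,lpBuffer
    .while esi>=ecx ;copy the digits out in the correct order
        mov al,[esi]
        dec esi
        stosb
    .endw
    mov byte ptr [edi],0
    ret
udwtoa endp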

Posted on 2003-09-28 07:51:04 by optimus
wsprintf isn't really written for speed - if you have critical needs, you should definitely be able to gain speed by writing your own routines. Also, don't use a generic thing like sprintf for joining two strings - write specific code for string concatenation, hex conversion, etc. I think there was some pretty optimized MMX code for hex conversion floating around here on the board a while ago...
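
As a rough illustration of what "specific code" can look like, here is a minimal sketch in C rather than assembly. The names u32_to_hex and join2 are made up for this example and do not come from any library. It shows a table-driven hex conversion and a plain two-string join with no format-string parsing, and it assumes the caller supplies a buffer that is large enough.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the 8 hex digits of `value` plus a
   terminating NUL into `out` (which needs at least 9 bytes).
   Returns a pointer to the terminator so calls can be chained. */
static char *u32_to_hex(uint32_t value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 7; i >= 0; --i) {
        out[i] = digits[value & 0xF];   /* lowest nibble goes last */
        value >>= 4;
    }
    out[8] = '\0';
    return out + 8;
}

/* Hypothetical helper: copy `a` and then `b` into `dst`.
   Returns the length of the joined string (without the NUL). */
static size_t join2(char *dst, const char *a, const char *b)
{
    char *p = dst;
    while (*a) *p++ = *a++;
    while (*b) *p++ = *b++;
    *p = '\0';
    return (size_t)(p - dst);
}
```

Neither routine scans for '%' or dispatches on a format character, which is where a generic formatter spends part of its time. Whether that is actually faster in a given program should still be measured.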
Posted on 2003-09-28 08:53:01 by f0dder
Why not use lstrcat(........) ?
Posted on 2003-10-06 03:02:25 by ASD916
If you're going to use lstrcat, then why not just use wsprintf?
Posted on 2003-10-06 05:00:08 by optimus
Because (l)strcat is designed to concatenate two strings, which is exactly what is required. It has been designed with this, and only this, in mind, while wsprintf was created with the intention of formatting strings, integers, and floating point numbers. There is all the extra code in there to scan for '%' characters and then format the appropriate pointer/data on the stack into the required text. That's quite a bit of overhead that strcat doesn't have.

Posted on 2003-10-08 14:46:59 by Mirno
Does wsprintf handle floating-point numbers nowadays?
Oh, by the way, the Windows lstrcat is rather slow; you can get much higher speed by writing your own.

Also, using <length>,<data> strings internally, and only converting them to nul-terminated strings when needed, can sometimes give you a nice speed boost compared to calling strlen all the time.
Posted on 2003-10-08 14:52:10 by f0dder
About (l)strcat: yes, it does the job, but as f0dder said, it's kind of slow, and I need to call it a LOT. I'll search the forum for a procedure to concatenate two strings. If you can give me an example of such a procedure, I'd be very pleased. I'm not good enough at assembly to produce a speed-optimized procedure myself, so some tips would be great.


Posted on 2003-10-09 06:25:36 by CuTedEvil

Have a look at the two versions in the MASM32 library; either will do the job of concatenating strings for you. The version called "szMultiCat" will concatenate more than 2 strings at a time, and it is fast as well.

If you have time, you may find another algo in the algorithm section.

Posted on 2003-10-09 08:39:11 by hutch--
CuTedEvil, what kind of work do you do on the strings, and must you often pass them back and forth between API functions? As I said previously, you might be able to save a lot of work if you can keep pairs of "length, stringdata" instead of having to call strlen all the time...
Posted on 2003-10-09 08:46:40 by f0dder

Actually your idea is good. Perhaps I will play with it. Thanks. :alright:
Posted on 2003-10-09 09:11:00 by roticv
It's such a simple yet such an efficient optimization. It's been done for ages in Pascal strings, C++ std::string, VB (I think), Java (I think), and the .NET environment too, if I'm not mistaken.

What's better than having the fastest strlen?
Not using strlen at all ;)
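
To make the <length>,<data> idea concrete, here is a minimal sketch in C. The CountedStr type and the cs_* names are invented for illustration. It assumes a fixed, caller-supplied buffer, and it keeps the data nul-terminated so the pointer can still be handed to API functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string over a caller-supplied buffer. */
typedef struct {
    size_t len;   /* bytes in use, not counting the terminating NUL */
    size_t cap;   /* total size of the buffer in bytes */
    char  *data;  /* always kept NUL-terminated */
} CountedStr;

static void cs_init(CountedStr *s, char *buffer, size_t cap)
{
    s->len = 0;
    s->cap = cap;
    s->data = buffer;
    if (cap > 0)
        buffer[0] = '\0';
}

/* Append n bytes from src. Returns 1 on success, 0 if they do not fit.
   The end of the string is known from s->len, so nothing is scanned. */
static int cs_append(CountedStr *s, const char *src, size_t n)
{
    if (n >= s->cap - s->len)   /* need room for n bytes plus the NUL */
        return 0;
    memcpy(s->data + s->len, src, n);
    s->len += n;
    s->data[s->len] = '\0';
    return 1;
}

/* Append another counted string; its length is already known too. */
static int cs_append_cs(CountedStr *dst, const CountedStr *src)
{
    return cs_append(dst, src->data, src->len);
}
```

Every append is a single memcpy to a known offset. The cost is that each piece of code touching the string has to keep len up to date.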
Posted on 2003-10-09 09:15:13 by f0dder
@hutch--: Thanks. I'll check them, and I'll search the forum :alright:

@f0dder: "CuTedEvil, what kind of work do you do on the strings, and must you often pass them back and forth between API functions?"
Well, it's a disassembler, and I do a lot of string concatenation in it. Your idea seems nice; I'll check yours as well. Thanks

Posted on 2003-10-09 15:20:34 by CuTedEvil
This is what I made as a replacement for lstrcat in the lib I'm making:

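StrApn Proc Dest:DWORD,Src:DWORD
    push esi
    push edi
    mov esi,Src
    mov edi,Dest
    xor eax,eax ;eax counts the length of the result
StartChk: ;find the end of Dest
    cmp byte ptr [edi],00h
    je EndChk
    add eax,01h
    add edi,01h
    jmp StartChk
EndChk:

StartCpy: ;append Src one byte at a time
    cmp byte ptr [esi],00h
    je EndCpy
    movsb
    add eax,01h
    jmp StartCpy
EndCpy:
    mov byte ptr [edi],00h ;terminate the result
    pop edi
    pop esi
    ret
StrApn endp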
Posted on 2003-10-09 15:30:35 by devilsclaw
The above can be optimized more if you are going to make the code work only on .686 CPUs, since they have some nice opcodes that each replace three of the old opcodes.
Posted on 2003-10-09 15:36:47 by devilsclaw
Weird code. I would do something like the following, but if the string is long enough, some mmx opcodes could be used to speed it up.

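    mov ecx, [esp+4] ; destination
    mov edx, [esp+8] ; source
@@: ; find the end of the destination
    cmp byte ptr [ecx],0
    lea ecx, [ecx+1] ; lea does not change the flags
    jnz @B
    dec ecx ; back up to the terminator
@@: ; copy the source, including its terminator
    mov al, [edx]
    add edx, 1
    mov [ecx], al
    add ecx, 1
    or al, al
    jnz @B
    mov eax, ecx
    sub eax, [esp+4] ; eax = bytes in the result, counting the terminator
    retn 8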

"The above can be optimized more if you are going to make the code work only on .686 CPUs, since they have some nice opcodes that each replace three of the old opcodes."

Like what, for example?
Posted on 2003-10-10 02:17:33 by roticv
Actually, I was thinking of applying the 686 thing to something else. It might also be possible to use those opcodes here, but I have to look at more of them...

Here is a slight change to the code you posted.

dec and inc are slower than add and sub, according to the Intel optimization manual.

The only thing is that your code does not null-terminate after it appends the two strings, in case you use the same buffer twice.

Also, yours changes edx, ecx, and eax. Mine returned the result in eax and restored edi and esi, so that only eax was changed.

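    mov edi, [esp+4]
    mov esi, [esp+8]
@@:
    cmp byte ptr [edi],0
    lea edi, [edi+1]
    jnz @B
    sub edi,01h
@@:
    movsb ;copy one byte from [esi] to [edi]; assumes the source is not empty
    cmp byte ptr [esi],00h
    jnz @B
    mov byte ptr [edi],00h ;added during edit..
    mov eax, edi
    sub eax, [esp+4]
    retn 8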

EDIT: I guess your code did null-terminate, since it checked al after it moved the byte over...
Posted on 2003-10-10 03:09:44 by devilsclaw
"dec and inc are slower than add and sub, according to the Intel optimization manual."

Hmm, perhaps you did not notice that I used dec ecx outside any critical code (i.e. not in a loop), so the speed difference does not really matter (at least I saved some bytes :grin: ).

If you use esi and edi, you need to preserve them (and thus alter some code). edx and ecx are expendable. Also, try to avoid using string opcodes, because they are significantly slower than their equivalent loops.

The only optimisation I can see is perhaps changing the "strlen" part so that it scans a dword at a time (i.e. using Agner's code or Jan's) or, if the string is long enough, using MMX.
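
For anyone who has not seen the dword-at-a-time scan, here is a minimal sketch of the idea in C. It is not Agner's or Jan's actual code, and strlen_u32 is a name made up here. It relies on the usual assumption behind these routines: an aligned 4-byte load may read up to 3 bytes past the terminator. That cannot cross a page boundary, but it is outside what ISO C strictly guarantees.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word-at-a-time strlen. */
static size_t strlen_u32(const char *s)
{
    const char *p = s;

    /* Go byte by byte until p is 4-byte aligned. */
    while (((uintptr_t)p & 3) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        ++p;
    }

    /* Scan one aligned dword per iteration. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);   /* compiles to a plain 4-byte load */
        /* Nonzero exactly when at least one byte of w is zero. */
        if (((w - 0x01010101u) & ~w & 0x80808080u) != 0)
            break;
        p += 4;
    }

    /* A zero byte is somewhere in this dword; find the exact one. */
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
```

The same test extends to 8 bytes at a time with MMX. For the short strings a disassembler produces, the alignment prologue may eat most of the gain, so it is worth timing against the simple byte loop.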
Posted on 2003-10-10 03:50:17 by roticv
Those wanting to do wsprintf-like operations, and those wanting a faster string format than z-strings might want to seriously take a look at the HLA Standard Library (which is now available for MASM32 users at http://webster.cs.ucr.edu/Page_AdvAsm/0_HLA4MASM.html).

First of all, HLA uses a string record format that incorporates the length (any practical size, not just 255 characters) and other information into the string format. It is also upwards compatible with zero-terminated strings, so you can still pass HLA string data to Win32 API functions and other functions that read z-string data. HLA's string library routines also check for things like overflow and bad index values, helping you produce more robust code (yep, there is a price to pay for these checks, but none of them are in the critical paths, so it's a very *small* price to pay).

As for the desire to call wsprintf: the HLA Standard Library provides a large set of routines like str_cati32, which automatically converts a 32-bit integer to a string and concatenates that to some other string (there are routines like this for just about every common data type you can imagine). By making a sequence of calls to these routines, you can do the same work as wsprintf without the interpretive overhead (then again, wsprintf calls are usually shorter, as you only need one call to convert multiple operands).
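
The "sequence of calls" style can be sketched in C as follows. This is not the HLA library or its API: cat_str and cat_u32 are hypothetical names, and unlike the HLA routines they do no overflow checking, so the caller must guarantee the buffer is large enough.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append s at p. Returns a pointer to the new
   terminator, so the next call can continue from there. */
static char *cat_str(char *p, const char *s)
{
    while (*s) *p++ = *s++;
    *p = '\0';
    return p;
}

/* Hypothetical helper: append v in decimal at p. Returns a pointer
   to the new terminator. */
static char *cat_u32(char *p, uint32_t v)
{
    char tmp[10];   /* a 32-bit value has at most 10 decimal digits */
    int n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);   /* digits come out reversed */
        v /= 10;
    } while (v != 0);
    while (n > 0) *p++ = tmp[--n];
    *p = '\0';
    return p;
}
```

Something like `p = cat_str(buf, "offset "); p = cat_u32(p, 42); p = cat_str(p, " bytes");` then does the work of one "%s%d%s" call with three direct calls and no format string to interpret.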

In any case, if you're interested, check out the HLA Standard Library for MASM32 users at the above URL.
Randy Hyde
Posted on 2003-10-10 16:15:41 by rhyde

I have noticed that you are quietly trying to push people toward HLA in your posts/threads...
and I still don't get why. People will use it if THEY CHOOSE TO! That's about it.
What is wrong with normal asm? HLA looks way uglier, even if it is 'easier' to code.
I would prefer using a standard assembler, and not telling people to use HLA like it is the only way to code in asm just because it has HLL-like syntax.

CuTedEvil wants help and examples. Why not give him source code, try to improve his, or offer advice?
Heck, what if I said to him: hey CuTedEvil, I know VB has a really good trim() function, look there. Hmm... you get my point? ;)

sleep on it mate.
Posted on 2003-10-10 18:22:22 by wizzra