Hello,
I tried to write a StrReverse proc. What do you think about it?
strrev proc uses ESI EDI lpString: LPSTR
MOV ESI, lpString
invoke lstrlen, ESI
MOV ECX, EAX
LEA EDI, [ESI+ECX-1] ; last byte of lpString
PUSH ECX ; strlen
SHR ECX, 3 ; strlen / 8 = number of 8-byte chunks (one DWORD from each end)
TEST ECX, ECX
JE @@next
LEA EDI, [EDI-3] ; ptr to last DWORD block
; copy 4 Bytes from ESI, 4 Bytes from EDI, swap both reg, ESI <-> EDI
@@loop4:
MOV EAX, [ESI]
MOV EDX, [EDI]
BSWAP EAX
BSWAP EDX
MOV [ESI], EDX
MOV [EDI], EAX
LEA ESI, [ESI+4]
LEA EDI, [EDI-4]
DEC ECX
JNZ @@loop4
LEA EDI, [EDI+3] ; EDI back to the last unprocessed byte for the byte loop
@@next:
POP ECX ; POP strlen
AND ECX, 7
SHR ECX, 1
TEST ECX, ECX
JE @@exit
@@loop1:
MOV AL, [ESI]
MOV AH, [EDI]
MOV [EDI], AL
MOV [ESI], AH
INC ESI
DEC EDI
DEC ECX
JNZ @@loop1
@@exit:
ret
strrev endp
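For comparison, here is a rough C model of the same idea (not from the original post): take a DWORD from each end, byte-swap both, store them crosswise, then finish the 0..7 middle bytes one at a time. The names `swap_bytes32` and `strrev_dword` are made up for this sketch. The back pointer always addresses the last unprocessed byte, so the byte loop can start right where the DWORD loop stopped.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of a 32-bit value: the C counterpart of BSWAP. */
static uint32_t swap_bytes32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical helper: reverses a NUL-terminated string in place. */
static void strrev_dword(char *s)
{
    size_t len = strlen(s);
    if (len < 2)
        return;                     /* nothing to swap */

    char *lo = s;                   /* like ESI: first unprocessed byte */
    char *hi = s + len - 1;         /* like EDI: last unprocessed byte  */

    /* One pass per 8 bytes: 4 from the front, 4 from the back. */
    for (size_t blocks = len >> 3; blocks != 0; --blocks) {
        uint32_t a, b;
        memcpy(&a, lo, 4);          /* front DWORD (may be unaligned)   */
        memcpy(&b, hi - 3, 4);      /* back DWORD starts 3 bytes before hi */
        a = swap_bytes32(a);
        b = swap_bytes32(b);
        memcpy(lo, &b, 4);          /* reversed back DWORD goes in front */
        memcpy(hi - 3, &a, 4);      /* reversed front DWORD goes in back */
        lo += 4;
        hi -= 4;                    /* hi still points at a last byte    */
    }

    /* Remaining 0..7 bytes in the middle: plain byte swaps. */
    for (size_t n = (len & 7) >> 1; n != 0; --n) {
        char t = *lo;
        *lo++ = *hi;
        *hi-- = t;
    }
}
```

Strings whose length is 10 or more and not a multiple of 8 are good test cases, because they exercise both loops.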
And a very small version:
strrev2 proc uses ESI EDI lpString: LPSTR
MOV ESI, lpString
invoke lstrlen, ESI
LEA EDI, [ESI+EAX-1]
@@loop:
CMP ESI, EDI
JAE @@exit ; done once the pointers meet or cross (unsigned compare)
MOV AL, [ESI]
MOV AH, [EDI]
MOV [ESI], AH
MOV [EDI], AL
INC ESI
DEC EDI
JMP @@loop
@@exit:
ret
strrev2 endp
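The short version boils down to two pointers walking toward each other. A minimal C sketch of that loop, with the hypothetical name `strrev_bytes`, makes the exit condition easy to check: the loop runs only while the front pointer is still below the back pointer.

```c
#include <string.h>

/* Hypothetical helper: reverses a NUL-terminated string in place,
   one byte pair per iteration. */
static void strrev_bytes(char *s)
{
    size_t len = strlen(s);
    if (len < 2)
        return;                 /* empty or single-character string */

    char *lo = s;               /* like ESI: walks forward  */
    char *hi = s + len - 1;     /* like EDI: walks backward */

    while (lo < hi) {           /* stop once the pointers meet or cross */
        char t = *lo;
        *lo++ = *hi;
        *hi-- = t;
    }
}
```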
Have a nice day, Manu.
Hi other,
It might be faster to do it in DWORDs; the main part would be:
mov edi,[lpDest]
mov esi,[lpSource]
mov ecx,[nBytes]
add esi,ecx
sub esi,4
shr ecx,2
:
mov eax,[esi]
bswap eax
mov [edi],eax
sub esi,4
add edi,4
dec ecx
jnz <
Here ECX is the string length, ESI is the source and EDI is the destination. You will have to figure out how to handle the remainders; I have never needed the function, so I never put much thought into it.
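One possible way to handle the remainders, sketched in C rather than assembly: copy whole DWORDs from the end of the source first, then copy the 0..3 bytes left at the front of the source one at a time. The names `swap_bytes32` and `memrev_copy` are invented for this sketch. It assumes the two buffers do not overlap and that the caller writes any terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of a 32-bit value: the C counterpart of BSWAP. */
static uint32_t swap_bytes32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical helper: writes the n bytes of src to dst in reverse order.
   Assumes dst and src do not overlap; no terminator is written. */
static void memrev_copy(char *dst, const char *src, size_t n)
{
    const char *p = src + n;            /* one past the last source byte */

    /* Main part: n / 4 DWORDs, read back to front, written front to back. */
    for (size_t blocks = n >> 2; blocks != 0; --blocks) {
        uint32_t v;
        p -= 4;
        memcpy(&v, p, 4);
        v = swap_bytes32(v);
        memcpy(dst, &v, 4);
        dst += 4;
    }

    /* Remainder: the 0..3 bytes still left at the front of src. */
    while (p != src)
        *dst++ = *--p;
}
```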
Hello,
"It might be faster to do it in DWORDs; the main part would be:"
Look ahead :-). For the fast algo, I use DWORDs and copy 2 DWORDs at 'once' :).
"add esi,ecx / sub esi,4"
Could LEA ESI be better here?
"shr ecx,2"
I use SHR ECX, 3 and copy 2 blocks (head to foot and vice versa) ...
Have a nice day, Manuel.
Hi other,
I was commenting on the second (short) version; I didn't really look at the first (it was a bit long for something I am not likely to use). I agree that LEA would be better in the example; I didn't think too much about it, I just sort of typed it.
Hello,
"I was commenting on the second (short version), I didn't really look at the first (it was a bit long for something I am not likely to use). I agree that lea would be better in the example, I didn't actually think too much about it, just sort of typed it."
Sorry. I misunderstood you. :-)
Have a nice day, Manuel.
other,
"And a very small version:.."
Here is a smaller one, just 25 bytes...
OPTION PROLOGUE:NONE ; turn it off
OPTION EPILOGUE:NONE ;
StrRev proc lpString:DWORD ;
;
pop edx ; edx->return address
pop eax ; eax->lpString
push edx ; edx->return address
push esi ; save esi
cld ; clears the Direction Flag
xor esi, esi ; esi = 0
push edi ; save edi
mov edi, eax ; edi->lpString
xchg eax, esi ; esi->lpString; eax = 0
L_1: ; saving the string in the stack
push eax ;
lodsb ; mov al, [esi] -> inc esi
test eax, eax ; is it end of the string?
jne L_1 ;
L_2: ; restoring the string from the stack
pop eax ;
stosb ; mov [edi], al -> inc edi
dec eax ; is it end of the string from the stack?
jns L_2 ;
pop edi ; restore esi and edi
pop esi ;
ret ; 25 bytes
StrRev endp ;
OPTION PROLOGUE:PROLOGUEDEF ; turn back on the defaults
OPTION EPILOGUE:EPILOGUEDEF ;
Regards,
Lingo
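To make the push/pop trick easier to follow, here is a rough C model of it (not Lingo's code): a zero sentinel goes on the stack first, every character is pushed on top of it, and popping then yields the characters in reverse order until the sentinel comes back and rewrites the terminator. The name `strrev_stack` and the heap-allocated array standing in for the CPU stack are assumptions of this sketch. The assembly version spends one 4-byte stack slot per character, so very long strings need a correspondingly large stack.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the push/pop approach with an explicit stack.
   Returns 0 on success, -1 if the temporary stack cannot be allocated. */
static int strrev_stack(char *s)
{
    size_t len = strlen(s);
    char *stack = malloc(len + 1);      /* room for sentinel + characters */
    if (stack == NULL)
        return -1;

    size_t top = 0;
    stack[top++] = '\0';                /* sentinel, like the first push of 0 */
    for (const char *p = s; *p != '\0'; ++p)
        stack[top++] = *p;              /* push every character */

    char *out = s;
    char c;
    do {
        c = stack[--top];               /* pop: last pushed comes out first */
        *out++ = c;                     /* the sentinel rewrites the terminator */
    } while (c != '\0');                /* sentinel popped: stack is empty */

    free(stack);
    return 0;
}
```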
Hello,
other,
"And a very small version:.."
here is smaller one
just 25 bytes...
xchg eax, esi ; esi->lpString; eax = 0
L_1: ; saving the string in the stack
push eax ;
lodsb ; mov al, [esi] -> inc esi
test eax, eax ; is it end of the string?
jne L_1 ;
L_2: ; restoring the string from the stack
pop eax ;
stosb ; mov [edi], al -> inc edi
dec eax ; is it end of the string from the stack?
jns L_2 ;
:cool:
Cool. Nice idea to use the stack :-). Do you think there is a bottleneck in the first version?
Regards Manuel.
"Do you think, there is a bottleneck in the first version?"
- "standard" lstrlen is slow and we can skip it here
- LEA EDI, -> lea is a slow instruction on the P4
- ;copy 4 Bytes from ESI, 4 Bytes from EDI, swap both reg, ESI <-> EDI
slow because EDI and ESI are not necessarily DWORD aligned;
additional clocks for BSWAP;
it will be faster to read/write single bytes
- DEC ECX -> dec/inc is a slow instruction on the P4
- SHR ECX, 3 -> shr is a slow instruction on the P4
Regards,
Lingo