Hi,
I'm very new to this, and after reading a few tutorials (thanks a lot for the exagone-tuts!) I wanted to get a little practical.
So this is the first task I set myself: writing a UrlDecode DLL.
It took me a while, but now it works!
But to me it looks very unstructured.
As you may see, I know very few opcodes and very little about them. I'd really like to get to know assembler better than that. Please teach me how!
I have a few questions:
[*]Is there an easier way to convert 2 hex chars to a number?
[*]Does that "cld" have any relevance in this context?
[*]Is there a good way to take advantage of the "rep" command?
[*]I feel very limited with creating a string of fixed length. Is there any other way than using SysAllocStringByteLen?
[*]Is it better to do the hex conversion like this, or with more jumping?
Thanks
UrlDecode proc uses ecx , text:DWORD
LOCAL ReturnString$ :DWORD
LOCAL ln :DWORD
LOCAL Occurencys :DWORD
push esi ;save esi and edi
push edi
mov ecx, text
xor eax, eax
StartLengthLoop:
mov dl, [ecx]
inc ecx ;GetStringLength
cmp dl, "%"
jne CharNotFound
inc eax ;Count Occurency of "%"
CharNotFound:
cmp dl, 0
jne StartLengthLoop
mov Occurencys, eax ;save "%"-Count as Occurencys
sub ecx, text ;correct StringLength
dec ecx
shl eax, 1 ;calculate NewStringLength
sub ecx, eax
jns DontZeroEcx ;end if negative
mov ecx, 0
DontZeroEcx:
mov ln, ecx
invoke SysAllocStringByteLen,0,ln ;Allocate Space for the resultString
mov ReturnString$, eax
mov esi, text ;Set Pointers to StringVars
mov edi, ReturnString$
cmp ln, 0 ;End if Length is 0 (edi is set first because TheEnd writes through it)
je TheEnd
mov ecx, ln ;set ecx to StringLength
cld
add ln, edi
cmp Occurencys, 0 ;Check whether there need to be made any replacements
je NoHex
StartLoop: ;Start of the ReplacmentLoop
mov al, [esi] ;copy current char
cmp al, "%" ;compare to "%"
jne CharNotFound2
xor eax, eax
inc esi
mov ax, [esi] ;Get the 2 HexChars
mov dx, ax ;Convert To Hex
and dx, 16448 ;01000000 01000000
shr dx, 3
add ax, dx
shr dx, 3
add ax, dx
and eax, 3855 ;00001111 00001111
shl al, 4
add al, ah
mov [edi], al
inc esi
jmp EndLoop
CharNotFound2:
cmp al, "+" ;if regular Char
je ReplacePlus
mov [edi], al
jmp EndLoop
ReplacePlus: ;replace "+" with space
mov al, 32
mov [edi], al
EndLoop:
inc esi ;increment StringPointer
inc edi
cmp ln, edi ;end if OutputStringLength reached
jg StartLoop
jmp TheEnd
NoHex: ;No "%" replaceing necessary
;rep movsb
mov al, [esi]
cmp al, "+" ;Replacing of "+"s only
je ReplacePlus2
mov [edi], al
jmp NoPlus
ReplacePlus2:
mov al, 32
mov [edi], al
NoPlus:
inc esi
inc edi
cmp ln, edi
jg NoHex
jmp TheEnd
TheEnd:
mov al, 0 ;terminate String
mov [edi], al
mov eax, ReturnString$ ;return String
pop edi
pop esi
ret
UrlDecode endp
Posted on 2001-09-05 07:24:26 by Butch77
1) Is there a better way to convert two ASCII characters representing a hex value to that char? No, this is a very good way of doing it!
2) cld is the mnemonic that clears the direction flag in the processor.
This flag determines whether esi and/or edi are incremented or decremented by the string instructions (see lods, stos, movs as examples). Your code uses none of them, so in this context cld performs no function.
When using lods/stos/movs it is desirable to move forwards, as it is faster for caching reasons.
3) rep is not really suitable here: it repeats a single string instruction, and can only be used with one instruction at a time. You need to analyse each character after fetching it from memory.
4) For programming flexibility I would advise that the caller of your function provides the output buffer. This makes your code more usable, and makes things easier for you to program!
5) As a general rule, jumping is bad! Jumps can cause the processor to stall, which hurts performance. For a better description of the problems with jumps, read the Agner Fog help file provided with MASM32, in particular the sections on branch prediction.
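To make the point about jumps concrete, here is a minimal sketch of replacing "+" with a space without a conditional jump. It assumes a 686-class target (cmov does not exist on earlier processors), and the use of edx as a scratch register is my own choice for the illustration. cmov has no 8-bit form and takes no immediate operand, which is why the character is zero-extended and the space is kept in a register.

```asm
; Assumes ".686" at the top of the file, and esi/edi already
; pointing at the input and output strings.
movzx eax, BYTE PTR [esi] ;fetch one char, zero-extended to 32 bits
inc esi
mov edx, ' ' ;cmov cannot take an immediate
cmp al, '+'
cmove eax, edx ;eax = ' ' only if the char was '+'
mov BYTE PTR [edi], al
inc edi
```

The test for "%" still needs a real branch, since that path does different work rather than just choosing between two values.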
Here is my attempt at the problem. It takes two arguments as input, the second being a pointer to the return buffer.
URLDecode PROC USES esi edi text:DWORD, output:DWORD
mov esi, text
mov edi, output
loop_start:
mov al, BYTE PTR [esi]
inc esi
cmp al, '%'
je ascii_convert
cmp al, '+'
jne @F
mov al, ' '
;You could replace this with a cmov instruction if you
;target a 686, this would remove the conditional jump
@@:
mov BYTE PTR [edi], al
inc edi
cmp al, 0
jne loop_start
ret
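ascii_convert:
mov ax, WORD PTR [esi] ;al = first hex char, ah = second
add esi, 2
mov cx, ax
and cx, 4040h ;bit 6 is set for letters only
and ax, 0F0Fh ;keep the low nibble of each char
shr cx, 3 ;8 for each letter
add ax, cx
shr cx, 3 ;1 for each letter, so letters gain 9 in total
add ax, cx
shl al, 4 ;first char becomes the high nibble
or al, ah
mov [edi], al
inc edi
jmp loop_start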
URLDecode endp
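A minimal sketch of calling the routine, assuming the usual MASM32 program skeleton around it. The names encoded and decoded are made up for the illustration. Decoding never makes a string longer (each %XX shrinks from three bytes to one, and every other character is copied one for one), so an output buffer as large as the input, terminator included, is always enough.

```asm
.data
encoded db "a+b%20c%21", 0 ;10 chars plus terminator (hypothetical test string)

.data?
decoded db 16 dup(?) ;at least as large as the input

.code
invoke URLDecode, ADDR encoded, ADDR decoded
;decoded now holds the zero-terminated text "a b c!"
```

Note that the routine trusts its input: a "%" that is not followed by two more characters makes it read past the end of the string.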
Mirno
Hey Thanks a lot! That looks much better...
The intention of programming this function was to use it in ASP later. That's why I tried to use the decoded string as the return value.
I'd like to use it like this:
response.write UrlDecode(strEncoded)
Do you have any idea on how to solve this problem?
Thanks :)
If you don't mind passing the function the address of the buffer, you could have it return a pointer to that buffer too!
Keeping with the code that I wrote above, simply add this line before the ret command.
mov eax, output
It is more versatile to deal with the output this way, and can help avoid problems. You cannot manage the memory as easily if the function allocates the space, as details of how that memory is allocated (and how much) are potentially unknown to the caller.
Mirno
But is there no 'easy' way to create a function like this without using an 'output' parameter?
You can do as you are doing, ie work out how much space you will need, and then get the function to allocate the space.
The problem with this is that the code external to the function has no control over the allocation. This means it cannot de-allocate it when it has finished with it, nor can it re-use the buffer once it has finished with the data contained within it. Also if you call the function enough times, eventually you will run out of memory because all the other instances of the data are still floating around.
If you truly want to allocate the memory on the fly within the function, then the way that you did it is probably the best.
You can always look at alternative memory allocation methods, but the result is pretty much the same!
Use a Win32 API reference and look up GlobalAlloc and HeapAlloc.
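As a rough sketch of the GlobalAlloc route, assuming the MASM32 includes for kernel32 and the two-argument URLDecode from earlier in this thread: the wrapper name URLDecodeAlloc is hypothetical. It sizes the buffer from the input length, which is a safe upper bound because decoding never lengthens a string.

```asm
URLDecodeAlloc PROC text:DWORD
LOCAL buffer:DWORD
invoke lstrlen, text ;eax = input length without terminator
inc eax ;room for the terminating zero
invoke GlobalAlloc, GMEM_FIXED, eax ;GMEM_FIXED returns a plain pointer
test eax, eax
jz alloc_done ;allocation failed, return NULL
mov buffer, eax
invoke URLDecode, text, buffer
mov eax, buffer ;return the decoded string
alloc_done:
ret
URLDecodeAlloc ENDP
```

With this design the caller has to know that the result came from GlobalAlloc and release it with GlobalFree when done, which is exactly the coupling described above.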
I've never used ASP, so I don't really understand where exactly your problem lies. I would assume you could create some buffer, as it is a staple of most languages!
How do you create the "strEncoded" variable? Could you not copy it, and overwrite the copy with the new results?
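A minimal sketch of that copy-and-overwrite idea, assuming the two-argument URLDecode from earlier in this thread and the MASM32 kernel32 includes. The names encoded and work are made up for the illustration. Passing the same address for input and output is safe with that routine because its write pointer never gets ahead of its read pointer.

```asm
.data
encoded db "a%20b+c", 0 ;hypothetical input

.data?
work db 16 dup(?) ;at least as large as the input

.code
invoke lstrcpy, ADDR work, ADDR encoded ;keep the original intact
invoke URLDecode, ADDR work, ADDR work ;decode the copy over itself
;work now holds "a b c"
```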
Mirno