Hi,
I'm very new to this, and after reading a few tutorials (thanks a lot for the exagone-tuts!) I wanted to get some practice.
So this is the first task I set myself: writing a UrlDecode DLL.
It took me a while, but now it works!
To me, though, it looks very unstructured.
As you can see, I know very few opcodes and very little about them. I'd really like to get to know assembler better than that. Please teach me how!
I have a few questions:

    1) Is there an easier way to convert 2 hex chars to a number?
    2) Does that "cld" have any relevance in this context?
    3) Is there a good way to take advantage of the "rep" prefix?
    4) I feel very limited when creating a string of fixed length. Is there any other way than using SysAllocStringByteLen?
    5) Is it better to do the hex conversion like this, or with more jumping?


    Thanks

    
    
    UrlDecode proc uses ecx , text:DWORD
    LOCAL ReturnString$ :DWORD
    LOCAL ln :DWORD
    LOCAL Occurencys :DWORD

    push esi ;save esi and edi
    push edi

    mov ecx, text
    xor eax, eax

    StartLengthLoop:
    mov dl, [ecx]
    inc ecx ;GetStringLength
    cmp dl, "%"
    jne CharNotFound
    inc eax ;Count Occurency of "%"
    CharNotFound:
    cmp dl, 0
    jne StartLengthLoop

    mov Occurencys, eax ;save "%"-Count as Occurencys
    sub ecx, text ;correct StringLength
    dec ecx
    shl eax, 1 ;calculate NewStringLength
    sub ecx, eax

    jns DontZeroEcx ;end if negative
    mov ecx, 0
    DontZeroEcx:

    mov ln, ecx
    invoke SysAllocStringByteLen,0,ln ;Allocate Space for the resultString
    mov ReturnString$, eax

    mov esi, text ;Set Pointers to StringVars
    mov edi, ReturnString$ ;edi must be valid before TheEnd writes the terminator

    cmp ln, 0 ;End if Length is 0
    je TheEnd

    mov ecx, ln ;set ecx to StringLength
    cld

    add ln, edi
    cmp Occurencys, 0 ;Check whether there need to be made any replacements
    je NoHex

    StartLoop: ;Start of the ReplacmentLoop
    mov al, [esi] ;copy current char
    cmp al, "%" ;compare to "%"
    jne CharNotFound2

    xor eax, eax
    inc esi
    mov ax, [esi] ;Get the 2 HexChars

    mov dx, ax ;Convert To Hex
    and dx, 16448 ;01000000 01000000
    shr dx, 3
    add ax, dx
    shr dx, 3
    add ax, dx
    and eax, 3855 ;00001111 00001111
    shl al, 4
    add al, ah

    mov [edi], al
    inc esi
    jmp EndLoop
    CharNotFound2:
    cmp al, "+" ;if regular Char
    je ReplacePlus
    mov [edi], al
    jmp EndLoop
    ReplacePlus: ;replace "+" with space
    mov al, 32
    mov [edi], al
    EndLoop:
    inc esi ;advance the string pointers
    inc edi
    cmp ln, edi ;end if OutputStringLength reached
    ja StartLoop ;unsigned compare, these are addresses
    jmp TheEnd
    NoHex: ;No "%" replaceing necessary
    ;rep movsb
    mov al, [esi]
    cmp al, "+" ;Replacing of "+"s only
    je ReplacePlus2
    mov [edi], al
    jmp NoPlus
    ReplacePlus2:
    mov al, 32
    mov [edi], al
    NoPlus:
    inc esi
    inc edi
    cmp ln, edi
    ja NoHex ;unsigned compare, these are addresses
    jmp TheEnd
    TheEnd:
    mov al, 0 ;terminate String
    mov [edi], al

    mov eax, ReturnString$ ;return String

    pop edi
    pop esi
    ret
    UrlDecode endp


    Posted on 2001-09-05 07:24:26 by Butch77
1) Is there a better way to convert two ASCII characters representing a hex value into that byte? No, this is a very good way of doing it!

2) cld is the mnemonic that clears the direction flag in the processor.
The flag determines whether the string instructions (lods, stos, movs and so on) increment or decrement esi and/or edi. Your code uses none of them, so in this context it has no effect.
When you do use lods/stos/movs it is desirable to move forwards, as this is faster for caching reasons.

3) rep is not really suitable. It is a prefix that repeats one string instruction ecx times, so it can only be combined with a single instruction at a time. You need to inspect each byte after fetching it from memory.

4) For flexibility I would advise that the caller of your function provide the output buffer. This makes your code more reusable, and makes things easier for you to program!

5) As a general rule, jumping is bad! Conditional jumps can cause the processor to stall when they are mispredicted, which hurts performance. For a better description of the problems with jumps, read the Agner Fog help file provided with MASM32, in particular the sections on branch prediction.
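
To make point 5 concrete, here is a minimal C sketch of replacing '+' with a space without an if/else. The function name plus_to_space is made up for this illustration. Whether the compiler emits branch-free machine code, and whether that is actually faster, depends on the compiler, the CPU and the data, so treat it as a model of the idea rather than a guaranteed optimisation.

```c
/* Hypothetical helper: returns ' ' when c is '+', otherwise returns c.
   mask is 0xFF when c == '+' and 0x00 otherwise, so the XOR either
   flips c into ' ' or leaves it unchanged. No if/else is needed. */
static unsigned char plus_to_space(unsigned char c)
{
    unsigned char mask = (unsigned char)-(c == '+');
    return (unsigned char)(c ^ ((c ^ ' ') & mask));
}
```

The same select-by-mask pattern is what a cmov or an and/xor sequence would do in assembler.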

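Here is my attempt at the problem. It takes two arguments as input, the second being a pointer to the return buffer. Since the decoded text is never longer than the input, a buffer the size of the input (including its terminator) is always large enough.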



URLDecode PROC USES esi edi text:DWORD, output:DWORD

mov esi, text
mov edi, output

loop_start:
mov al, BYTE PTR [esi]
inc esi
cmp al, '%'
je ascii_convert

cmp al, '+'
jne @F
mov al, ' '
;You could replace this with a cmov instruction if you
;target a 686, this would remove the conditional jump
@@:

mov BYTE PTR [edi], al
inc edi

cmp al, 0
jne loop_start

ret

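ascii_convert:
mov ax, WORD PTR [esi] ;al = first hex char, ah = second
add esi, 2
mov cx, ax
and ax, 4040h ;bit 6 is set only for letters (A-F, a-f)
and cx, 0F0Fh ;low nibble of each char
shr ax, 3 ;0808h for letters, 0 for digits
add cx, ax
shr ax, 3 ;0101h for letters, 0 for digits
add ax, cx ;letters: nibble + 9, digits: nibble
shl al, 4 ;first char becomes the high nibble
or al, ah
mov [edi], al
inc edi
jmp loop_start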

URLDecode endp
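
The hex-pair conversion that both routines aim for can be modelled in C, which makes the arithmetic easier to check by hand. This is only a sketch: the names hex_nibble and hex_pair are invented here, and like the assembler it assumes both characters really are hex digits.

```c
/* Convert one ASCII hex digit ('0'-'9', 'A'-'F', 'a'-'f') to 0..15.
   Letters have bit 6 (0x40) set and digits do not, so for letters
   (c & 0x40) >> 3 is 8 and (c & 0x40) >> 6 is 1. Adding both adds 9,
   which turns the low nibble of 'A' (1) into 10, 'F' (6) into 15. */
static unsigned hex_nibble(unsigned char c)
{
    unsigned letter = c & 0x40u;   /* 0x40 for letters, 0 for digits */
    return (c + (letter >> 3) + (letter >> 6)) & 0x0Fu;
}

/* Combine two hex digits into one byte, first digit in the high nibble. */
static unsigned char hex_pair(unsigned char hi, unsigned char lo)
{
    return (unsigned char)((hex_nibble(hi) << 4) | hex_nibble(lo));
}
```

For example, hex_pair('4', '1') gives 0x41, which is the letter 'A'. The assembler versions do the same thing on both characters at once by keeping one in al and the other in ah.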


Mirno
Posted on 2001-09-06 11:22:15 by Mirno
Hey Thanks a lot! That looks much better...

I wrote this function so that I could use it in ASP later. That's why I tried to return the decoded string as the return value.
I'd like to use it like this:
response.write UrlDecode(strEncoded)

Do you have any idea on how to solve this problem?

Thanks :)
Posted on 2001-09-07 04:01:18 by Butch77
If you don't mind passing the function the address of the buffer, you could have it return a pointer to that buffer too!

Keeping with the code that I wrote above, simply add this line before the ret instruction.
  mov eax, output 


It is more versatile to deal with the output this way, and it can help avoid problems. Memory is harder to manage if the function allocates the space, because the details of how that memory is allocated (and how much) are potentially unknown to the caller.
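
As a rough C model of that interface, the sketch below decodes into a buffer supplied by the caller and returns the same pointer, so the call can be nested inside another expression. The names url_decode and hex_val are hypothetical. It assumes the buffer holds at least strlen(src) + 1 bytes, and it copies a '%' that is not followed by two more characters literally instead of reading past the terminator. It does not check that the two characters are valid hex digits.

```c
/* Hypothetical helper: ASCII hex digit to 0..15 (letters have bit 6 set). */
static unsigned hex_val(unsigned char c)
{
    unsigned letter = c & 0x40u;
    return (c + (letter >> 3) + (letter >> 6)) & 0x0Fu;
}

/* Decode src into out and return out. The caller owns out. */
char *url_decode(const char *src, char *out)
{
    char *dst = out;
    for (;;) {
        unsigned char c = (unsigned char)*src++;
        if (c == '%' && src[0] != '\0' && src[1] != '\0') {
            /* "%XY" becomes the single byte 0xXY */
            c = (unsigned char)((hex_val((unsigned char)src[0]) << 4)
                                | hex_val((unsigned char)src[1]));
            src += 2;
            *dst++ = (char)c;
            continue;
        }
        if (c == '+')
            c = ' ';
        *dst++ = (char)c;
        if (c == '\0')          /* terminator copied, done */
            return out;
    }
}
```

With a local buffer buf, url_decode("a%20b+c", buf) fills buf with "a b c" and returns buf.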

Mirno
Posted on 2001-09-07 06:57:11 by Mirno
But is there no 'easy' way to create a function like this without using an 'output' parameter?
Posted on 2001-09-07 08:53:53 by Butch77
You can do as you are doing, i.e. work out how much space you will need and then have the function allocate it.
The problem with this is that the code outside the function has no control over the allocation. Unless it knows exactly how the memory was allocated, it cannot de-allocate it when it has finished with it (a string from SysAllocStringByteLen, for instance, has to be released with SysFreeString), nor can it re-use the buffer once it has finished with the data in it. Also, if you call the function often enough without freeing the results, you will eventually run out of memory, because all the earlier results are still floating around.
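
A small C sketch of the "work out the size, then allocate inside the function" approach follows. It uses calloc from the C library as a stand-in allocator, and the names decoded_length and alloc_decoded_buffer are invented for this example. The length formula is only exact for well-formed input, where every '%' is followed by two hex digits.

```c
#include <stdlib.h>
#include <string.h>

/* Each "%XY" escape shrinks three input bytes to one output byte, so the
   decoded length is len - 2 * escapes. Clamp to 0 for malformed input
   such as "%%", where the subtraction would go negative. */
size_t decoded_length(const char *src)
{
    size_t len = strlen(src);
    size_t escapes = 0;
    for (const char *p = src; *p != '\0'; ++p)
        if (*p == '%')
            ++escapes;
    return (2 * escapes > len) ? 0 : len - 2 * escapes;
}

/* Allocate a zero-filled buffer for the decoded text plus its terminator.
   Ownership passes to the caller, who must release it with free(). */
char *alloc_decoded_buffer(const char *src)
{
    return calloc(decoded_length(src) + 1, 1);
}
```

The comment on alloc_decoded_buffer is the important part: the allocation only stays manageable if the function documents which routine the caller has to use to release the result.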

If you truly want to allocate the memory on the fly within the function, then the way that you did it is probably the best.

You can always look at alternative memory allocation methods, but the result is pretty much the same!
Use a Win32 API reference and look up GlobalAlloc and HeapAlloc.

I've never used ASP, so I don't really understand where exactly your problem lies. I would assume you could create some buffer, as that is a staple of most languages!
How do you create the "strEncoded" variable? Could you not copy it and overwrite the copy with the new results?

Mirno
Posted on 2001-09-07 09:12:24 by Mirno