Hi!

I made some assembler compression and decompression code available. It is small, very fast, and compresses quite well. I released it under the zlib license.

If the attachment does not work, it's also at: http://home19.inet.tele.dk/jibz/files/uflz20021112.zip
Posted on 2002-11-12 07:33:15 by Jibz
There is one problem, though: where is the MASM code? :o
Hehe, well, I guess the conversion can't be that complicated? :alright:
Posted on 2002-11-12 07:53:45 by natas
The benefit of the NASM code is that it works with other linkers as well. The code also works with DJGPP and on Linux. E.g. you can assemble it using:

nasm -f win32 <sourcefile>

and link it into your code using MS link. But yes, it should be fairly easy to translate to MASM too :-)

Here are a few statistics in case somebody is wondering how it compares. I compared it with LZOP -1 and comrade's LZ77 implementation:

calgary corpus:

lz77: 7.8 sec -> 2,251,910
lzop: 1.0 sec -> 1,582,455
uflz: 1.5 sec -> 1,296,357

canterbury corpus:

lz77: 5.0 sec -> 1,402,493
lzop: 0.9 sec -> 1,151,925
uflz: 1.2 sec -> 959,429

gcc-2.7.1 source code:

lz77: 58.5 sec -> 17,995,196
lzop: 4.0 sec -> 11,040,001
uflz: 7.8 sec -> 8,203,678

netscape.exe (from ACT):

lz77: 6.9 sec -> 2,114,336
lzop: 1.1 sec -> 1,801,764
uflz: 1.7 sec -> 1,639,410
Posted on 2002-11-12 14:21:13 by Jibz
Impressive, Jibz, as always! I will see what I can do with the decompressor (size-wise). :)
Posted on 2002-11-12 14:40:58 by bitRAKE
Sounds great! .. here is a 107 byte version to get you started ;-)
Posted on 2002-11-12 15:36:48 by Jibz
Here is a new package, including the 107 byte decompressor and a 352 byte version of the compression code. I also added makefiles for Borland C++ and GCC on Linux/FreeBSD/BeOS/QNX.

Still no MASM version, though .. I might make one later :-)
Posted on 2002-11-13 05:04:43 by Jibz
Very nice, Jibz... :)
Thanks for sharing...
Does this beat aPLib?

PS: Do you have any news about the delta format we talked about a few weeks ago? :cool:

Regards,
Posted on 2002-11-13 07:30:59 by JCP
Hi Readiosys!

It beats aPLib on compression speed, but the ratios are not as good. Below is the table again, with the addition of the results for aPLib v0.36, and my current development code (ffce) at level 1.

I am still working on the delta code. I think it's going to be quite good when I get a little more work done on it .. I'll e-mail you when I've got something more solid :-)

calgary corpus:

lz77: 7.8 sec -> 2,251,910
lzop: 1.0 sec -> 1,582,455
uflz: 1.5 sec -> 1,296,357
apl : 137.5 sec -> 1,115,349
ffce: 10.8 sec -> 1,087,644

canterbury corpus:

lz77: 5.0 sec -> 1,402,493
lzop: 0.9 sec -> 1,151,925
uflz: 1.2 sec -> 959,429
apl : 111.3 sec -> 774,661
ffce: 7.2 sec -> 763,029

gcc-2.7.1 source code:

lz77: 58.5 sec -> 17,995,196
lzop: 4.0 sec -> 11,040,001
uflz: 7.8 sec -> 8,203,678
apl : 797.5 sec -> 7,212,418
ffce: 63.7 sec -> 6,253,056

netscape.exe (from ACT):

lz77: 6.9 sec -> 2,114,336
lzop: 1.1 sec -> 1,801,764
uflz: 1.7 sec -> 1,639,410
apl : 89.2 sec -> 1,351,048
ffce: 11.0 sec -> 1,346,094
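
To put the table in relative terms, here is a small C sketch that compares two rows. It is not part of any of the packages; the `result` struct and the helper names are made up for illustration, and the two sample rows are simply copied from the calgary table above.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the table above: compressor name, time, packed size.
   The struct and function names are hypothetical, for illustration only. */
struct result {
    const char *name;
    double seconds;
    long bytes;
};

/* How many times faster a ran than b (values > 1 mean a is faster). */
static double speedup(struct result a, struct result b)
{
    assert(a.seconds > 0.0);
    return b.seconds / a.seconds;
}

/* Packed size of a relative to b (values > 1 mean a's output is larger). */
static double relative_size(struct result a, struct result b)
{
    assert(b.bytes > 0);
    return (double)a.bytes / (double)b.bytes;
}

/* Print a one-line comparison of a against b. */
static void compare(struct result a, struct result b)
{
    printf("%s vs %s: %.1fx faster, %.1f%% of the size\n",
           a.name, b.name, speedup(a, b), 100.0 * relative_size(a, b));
}

/* calgary corpus rows, copied from the table above */
static const struct result calgary_uflz = { "uflz", 1.5, 1296357 };
static const struct result calgary_apl  = { "apl", 137.5, 1115349 };
```

Calling `compare(calgary_uflz, calgary_apl)` shows uflz as roughly 92 times faster than aPLib on the calgary corpus, with output about 16% larger.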
Posted on 2002-11-13 09:50:13 by Jibz
What are the compressors in that list?
Posted on 2002-11-13 15:17:45 by comrade
comrade:

lz77 - comrade's lz77 implementation
lzop - Markus F.X.J. Oberhumer's lzop file compressor
apl - Jibz's aPLib
ffce - Jibz's new unreleased supa-duppa compressor :)
uflz - Jibz's small compressor again (the reason this thread was created)

...
Posted on 2002-11-14 00:28:08 by TBD
I'm down to 92 bytes with a small change to the compression algo (FASM code):
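uflz_depack_asm_tiny:

    pushad
    mov ebp, esp
    mov esi, [ebp + (8+1)*4] ; source
    mov edi, [ebp + (8+2)*4] ; dest
    mov ebx, [ebp + (8+3)*4] ; length
    ; cld ; not needed on my system
    xor eax, eax             ; empty tag, forces a reload in getbit
    add ebx, edi             ; ebx = end of destination
.literal:
    movsb                    ; copy one literal byte
.nexttag:
    cmp edi, ebx
    jnc done

    call word getbit
    jnc .literal             ; tag bit 0 = literal, 1 = match

    call word getgamma ; high pos
    lea edx, [ecx-2]

    call word getgamma ; len
    shl edx, 8

    mov dl, [esi] ; low pos
    inc esi

    not edx ;= inc edx | neg edx
    push esi
    lea esi, [edi + edx]     ; match source = dest - (pos + 1)
    rep movsb
    movsb                    ; two extra bytes: match length = gamma + 2
    movsb
    pop esi
    jmp .nexttag

getbit: add eax, eax         ; shift next tag bit into carry
    jne .A
    lodsd                    ; tag used up: load the next 32 bits
    add eax, eax
    inc eax                  ; marker bit, carry is left untouched
.A: ret

getgamma:
    xor ecx, ecx
    inc ecx                  ; value starts at 1
.A: call word getbit
    adc ecx, ecx             ; append data bit
    call word getbit
    jc .A                    ; continue bit set: read another pair
    ret

done:
    sub edi, [ebp + (8+2)*4] ; edi = unpacked length
    mov [ebp + (8+3)*4], edi ; overwrite the length argument with it
    popad
    ret 8                    ; pops source and dest, leaves the length slot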

; Usage:
; push length
; push destination
; push source
; call uflz_depack_asm_tiny
;
; pop unpacked_length

; Changes to compression:
; - reverse order of matchpos, matchlen store
; (store matchlen, then matchpos)
I haven't had time to test this, yet.
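
For anyone who finds the bit tricks easier to follow in C, here is a rough sketch of what the FASM routine above appears to do, read straight off the listing. It is not from the uflz package and has not been run against real uflz output. `getbit` and `getgamma` mirror the asm labels; the `bitreader` struct and the name `uflz_depack_sketch` are made up for illustration. It assumes 32-bit little-endian tag words consumed from the top bit (as `lodsd` plus `add eax, eax` does) and, like the asm, does no bounds checking on the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state, standing in for esi and eax in the asm. */
typedef struct {
    const uint8_t *src; /* next unread input byte (esi) */
    uint32_t tag;       /* remaining tag bits plus a marker bit (eax) */
} bitreader;

/* Return the next tag bit, reloading a 32-bit little-endian tag word
   when only the marker is left (or on the very first call). */
static unsigned getbit(bitreader *br)
{
    unsigned bit = br->tag >> 31;   /* carry out of "add eax, eax" */
    br->tag <<= 1;
    if (br->tag == 0) {             /* marker shifted out: reload (lodsd) */
        uint32_t t = (uint32_t)br->src[0]
                   | (uint32_t)br->src[1] << 8
                   | (uint32_t)br->src[2] << 16
                   | (uint32_t)br->src[3] << 24;
        br->src += 4;
        bit = t >> 31;
        br->tag = (t << 1) | 1;     /* "inc eax" plants the marker bit */
    }
    return bit;
}

/* Gamma value: start at 1, append a data bit, repeat while the
   following continue bit is set. The smallest result is 2. */
static uint32_t getgamma(bitreader *br)
{
    uint32_t v = 1;
    do {
        v = (v << 1) | getbit(br);
    } while (getbit(br));
    return v;
}

/* Unpack until `length` output bytes are written; returns bytes written.
   Like the asm, it always copies the first byte as a literal, and a final
   match may run past `length` if the packed data says so. */
size_t uflz_depack_sketch(const uint8_t *src, uint8_t *dst, size_t length)
{
    bitreader br = { src, 0 };
    uint8_t *out = dst;
    uint8_t *end = dst + length;

    assert(length > 0);
    *out++ = *br.src++;                       /* first byte is a literal */

    while (out < end) {
        if (!getbit(&br)) {                   /* tag bit 0: literal */
            *out++ = *br.src++;
            continue;
        }
        uint32_t pos = (getgamma(&br) - 2) << 8;  /* high pos */
        uint32_t len = getgamma(&br) + 2;         /* rep movsb + 2 movsb */
        pos |= *br.src++;                         /* low pos */

        const uint8_t *from = out - pos - 1;      /* "not edx" = -(pos+1) */
        while (len--)
            *out++ = *from++;                 /* bytewise, overlap is fine */
    }
    return (size_t)(out - dst);
}
```

By this reading, the 9-byte input `'a', 00 00 00 10, 'b', 'c', 'd', 03` unpacks to `abcdabcd`: one leading literal, a tag word whose top bits are 0 0 0 1 0 0 0 0 (three literals, then a match with both gamma values equal to 2), and a low pos byte of 3, which means distance 4 and length 4.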
Posted on 2002-11-14 09:53:52 by bitRAKE
Thanks for the nice optimisations, bitRAKE! I especially liked the way you got rid of the 0x80000000 :-)

I was not able to get the 'call word label' trick to work -- the program crashed (Win32).

Attached is my latest version, compression = 706/342 bytes, and decompression = 146/99 bytes. I removed the uninitialised data, and use stack variables and a user-supplied workmem instead.
Posted on 2002-11-14 14:43:11 by Jibz
I was playing around with converting the code to MASM style, from which I learnt two things:

- @@: style labels are not local to macros.
- When assembling to COFF format, ml.exe does something for each element in an uninitialised array, which means that assembling:

    .386
    .model flat,stdcall

    .data?

    buffer dd 4*65536 dup (?) ; 1 mb of workmem

    end

  takes 1 minute. Assembling to OMF format does not have this problem.

Attached is the latest version, compression = 646/328 bytes, and decompression = 146/99 bytes. I added a conversion to 16-bit 8086 assembler (which I guess is of little interest to the Win32ASM community ;-).
Posted on 2002-11-16 10:03:49 by Jibz