Hi! I want to convert the following piece of Pascal code to asm:



V := Round(255 * Power(I / 255, GammaValue / 255));


I decided to try the FpuLib (it's included in the masm32 lib) because I didn't know how to do a Power in asm. I came up with a piece of code that doesn't work. Here it is:



;Pascal v:=round(255* POWER(I/255 , GammaValue/255))
;
;I and V are dword size integers in the .data? section and
;a255 = "a255 REAL8 255.0"

;GammaValue/255
invoke FpuDiv,addr GammaValue,offset a255,offset temp,SRC1_DMEM or SRC2_DMEM
;fpu=I/255
invoke FpuDiv,offset I ,offset a255,0 ,SRC1_DMEM or SRC2_DMEM or DEST_FPU
;fpu=mypow(fpu)
invoke FpuXexpY,0 ,offset temp,0 ,SRC1_FPU or SRC2_DMEM or DEST_FPU
;fpu=255* fpu
invoke FpuMul,0 ,offset a255,0 ,SRC1_FPU or SRC2_DMEM or DEST_FPU
;V=round(fpu)
invoke FpuRound,0 ,offset V ,SRC1_FPU or SRC2_DMEM or DEST_IMEM




What's wrong with the code? Would it be easier to do it without the fpulib?

Thanks.
Posted on 2003-05-10 07:19:23 by Delight
Does this work?
; V := Round(255 * Power(I / 255, GammaValue / 255));


fld a255

fld GammaValue
fdiv st, st(1)

fld I
fdiv st, st(2)

; Y^x st = Y, st(1) = x
fyl2x ; x*log2Y
f2xm1 ; Y^x-1
fld1 ; 1,Y^x-1
fadd ; Y^x

fmul

fistp V ; Round
Posted on 2003-05-10 09:31:59 by bitRAKE
Hi,
It almost works. I changed all the flds to filds because I, GammaValue and a255 are all dword integers, but I still get some strange negative numbers :confused: Maybe I'm doing something wrong somewhere else.
Posted on 2003-05-10 10:36:08 by Delight
I wrote a little test:
_DATA SEGMENT

a255 WORD 255
GammaValue WORD 0
I WORD 1 ; can't be zero
V WORD ?
_DATA ENDS

; V := Round(255 * Power(I / 255, GammaValue / 255));
xor eax, eax
@@:
fild a255

fild GammaValue
fdiv st, st(1)

fild I
fdiv st, st(2)

; Y^x st = Y, st(1) = x
fyl2x ; x*log2Y
f2xm1 ; Y^x-1
fld1 ; 1,Y^x-1
fadd ; Y^x

fmul

fistp V ; Round

inc eax
add BYTE PTR I, 1
sbb ecx, ecx
neg ecx
add BYTE PTR I, cl ; no zeroes
add BYTE PTR GammaValue, cl
jc _x
test V, 8000h
je @B
; bad exit
int 3
; good exit
_x: int 3
The variable I can't be zero: FYL2X takes log2 of ST(0), and log2(0) is not a finite number.

Could fix V afterwards:
mov	ax, V

sar ax, 15
not ax
and V, ax
Edit: this doesn't work either? :confused:
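
For reference, the SAR / NOT / AND fix-up is a branch-free "negative becomes zero" clamp. A C sketch of the same idea follows; the function name is hypothetical, and it assumes that `>>` on a negative signed value is an arithmetic shift, which C leaves implementation-defined. It only hides negative results rather than explaining them.

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free clamp mirroring SAR ax,15 / NOT ax / AND V,ax. */
int16_t clamp_negative_to_zero(int16_t v)
{
    /* Assumption: right-shifting a negative value copies the sign bit. */
    assert((-1 >> 1) == -1);

    int16_t sign_mask = (int16_t)(v >> 15);   /* -1 if v < 0, else 0 */
    return (int16_t)(v & ~sign_mask);         /* keep v only if v >= 0 */
}
```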
Posted on 2003-05-10 11:37:45 by bitRAKE
OK, here is the original Pascal code:



type
  TGammaRecord = packed record
    R : array[0..255] of word;
    G : array[0..255] of word;
    B : array[0..255] of word;
  end;
...
var MyGammaRec: TGammaRecord;
...

/////////Input: GammaValue is between 0 and 255///////////

for I := 0 to 255 do
begin
  V := Round(255 * Power(I/255, GammaValue/255));
  if V > 255 then V := 255;
  V := V shl 8;
  MyGammaRec.R[I] := V;
  MyGammaRec.G[I] := V;
  MyGammaRec.B[I] := V;
end;


This is my converted code:




.data?
MyGammaRec DW 256*3 dup (?) ; three arrays (R, G, B) of 256 words each

..
GammaProc proc GammaValue:DWORD

xor ecx,ecx
mov I,ecx

.WHILE (I<256) ; for I := 0 to 255 do begin

fild a255
fild GammaValue
fdiv st, st(1)

fild I
fdiv st, st(2)

; Y^x st = Y, st(1) = x
fyl2x ; x*log2Y
f2xm1 ; Y^x-1
fld1 ; 1,Y^x-1
fadd ; Y^x
fmul

fistp V ; Round

;At this point, V should be between 0 and 255 but it isn't


mov eax,V
cmp eax,255
jle @F
mov eax,255
@@:

shl eax,8

mov ecx,I
mov word ptr[MyGammaRec+ecx+000*sizeof word],ax
mov word ptr[MyGammaRec+ecx+256*sizeof word],ax
mov word ptr[MyGammaRec+ecx+512*sizeof word],ax


inc I
.endw

ret
GammaProc endp
Posted on 2003-05-10 11:51:31 by Delight

;At this point, V should be between 0 and 255 but it isn't


mov eax,V
cmp eax,255
jle @F
mov eax,255
@@:
Really? Then why is V compared to 255? :tongue:
Posted on 2003-05-10 12:00:28 by bitRAKE
Good question :grin:
Posted on 2003-05-10 12:07:59 by Delight
OK, that code is not necessary. I commented it out of the Pascal version and it didn't change anything; V is always between 0 and 255 at that point.
Posted on 2003-05-10 12:11:04 by Delight
Okay, this seems to work: :)
_DATA SEGMENT

a255 WORD 255
GammaValue WORD 254
I WORD 1 ; can't be zero
V WORD ?
_DATA ENDS

; V := Round(255 * Power(I / 255, GammaValue / 255));
xor eax, eax
@@:
fild a255

fild GammaValue
fdiv st, st(1)

fild I
fdiv st, st(2)

; Y^x st = Y, st(1) = x
fyl2x ; x*log2Y
fld st(0)
frndint

fsub st(1),st
fxch

f2xm1
fld1
fadd

fscale

fmul st, st(2)

fistp V ; Round

fstp st(0)
fstp st(0)

inc eax
add BYTE PTR I, 1
sbb ecx, ecx
neg ecx
add BYTE PTR I, cl ; no zeroes
add BYTE PTR GammaValue, cl
jc _x
test V, 8000h
je @B
; bad exit
int 3

nop
nop
nop

; good exit
_x: int 3
Posted on 2003-05-10 12:27:35 by bitRAKE

OK, that code is not necessary. I commented it out of the Pascal version and it didn't change anything; V is always between 0 and 255 at that point.
As it should be. :grin:

Also, ECX should be multiplied by two to index a WORD array.
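
The reason is that x86 addresses count bytes, not elements, so entry I of a WORD array sits at byte offset 2*I. A small C sketch of the offsets involved (the enum and function names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum Channel { CH_R = 0, CH_G = 1, CH_B = 2 };

/* Byte offset of entry i of one channel inside three consecutive
 * 256-word arrays, i.e. the value added to the base address. */
size_t ramp_byte_offset(enum Channel ch, unsigned i)
{
    assert(i < 256);
    return ((size_t)ch * 256 + i) * sizeof(uint16_t);
}
```

For example, entry 1 of the R array is at byte 2, and the G array starts at byte 512, which matches `MyGammaRec+2*ecx+256*sizeof word` with ecx = 0.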
Posted on 2003-05-10 12:28:21 by bitRAKE
.data?

MyGammaRec DW 256*3 dup (?) ; three arrays (R, G, B) of 256 words each

...
GammaProc proc GammaValue:DWORD

xor ecx,ecx

fild a255
fild GammaValue
fdiv st, st(1)

mov I,ecx
xor eax, eax
jmp @F

.WHILE (ecx<256) ; for I := 0 to 255 do begin

fild I
fdiv st, st(2)

; Y^x st = Y, st(1) = x
fyl2x
fld st(0)
frndint
fsub st(1),st
fxch
f2xm1
fld1
fadd
fscale
fxch
fstp st(0)
fmul st, st(2)
fistp V ; Round

mov ax,WORD PTR V
@@: inc I
ror ax, 8
mov word ptr[MyGammaRec+2*ecx+000*sizeof word],ax
mov word ptr[MyGammaRec+2*ecx+256*sizeof word],ax
mov word ptr[MyGammaRec+2*ecx+512*sizeof word],ax

inc ecx
.endw

fstp st(0) ; GammaValue
fstp st(0) ; 255

ret
GammaProc endp
Posted on 2003-05-10 12:48:34 by bitRAKE
Delight:

The Fpulib does NOT accept REAL4 or REAL8 values as input parameters. However, it will accept integers as parameters. For example, change your 1st line of code from:

invoke FpuDiv,addr GammaValue,offset a255,offset temp,SRC1_DMEM or SRC2_DMEM

to:

invoke FpuDiv,addr GammaValue,255,offset temp,SRC1_DMEM or SRC2_DIMM

Change your 2nd and 4th lines similarly. Also make sure that your "temp" variable is declared as a REAL10 (or DT) variable. Your originally posted code using the Fpulib should then work properly.

Another option is to change the declaration of the "a255" variable to the following:

a255 dd 255

and use the code as originally posted.

The main error was to declare the "a255" variable as a REAL8 (which is not permissible as a parameter) and then refer to it as a DWORD in memory with the SRC2_DMEM flag.

If the "a255" variable could effectively change to another integer value, you would not have any choice but to retain it and use its address as the parameter. However, if it is a constant (which it seems to be), it's easier to use it as an immediate value whenever permitted by the functions of the Fpulib.
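
To see why the REAL8 declaration fails, it helps to look at what a DWORD read from that address would find. The C sketch below (hypothetical function name, assuming a little-endian machine such as x86) copies the first four bytes of a double into a 32-bit integer. The double 255.0 is encoded as 406FE00000000000h, so its low DWORD is 0 rather than 255.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the first four bytes of a double's storage as a 32-bit
 * integer: what a DWORD load from the REAL8's address would see. */
int32_t first_dword_of(double value)
{
    int32_t dword;

    assert(sizeof(double) == 8);            /* REAL8 */
    memcpy(&dword, &value, sizeof dword);   /* low DWORD on little-endian */
    return dword;
}
```

`first_dword_of(255.0)` is 0 on such a machine, which would make the divisor zero instead of 255.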

Raymond
Posted on 2003-05-11 00:43:41 by Raymond
bitRAKE:
Thank you so much for your effort, but I still can't get it to work :(
I did a "PrintDec eax" after "mov ax,WORD PTR V" and eax is zero all the time, even though I have tried several different GammaValues. I think I'll give up and stick with the now-working Fpulib version.


Raymond:
Thanks, it works perfectly now :)
Posted on 2003-05-11 03:20:47 by Delight
It certainly is a "delight" :tongue: to see that the Fpulib is of some use.

Raymond
Posted on 2003-05-11 21:28:48 by Raymond
If there is no need to speed up your process, the following is immaterial.

If you need to improve the overall speed, you could do all calculations without using the FPU, using strictly the CPU with the same accuracy.

I just ran some tests and the speed would be almost the same when the GammaValue is above 225. However, the speed would improve as that value decreases, the time required being reduced to less than half for low values. (The time required using the FPU does not change with the GammaValue).

Let me know if you are interested.

Raymond
Posted on 2003-05-12 22:18:08 by Raymond
Thank you for helping me out Raymond, but speed is not very important in this case. I would like to take a look at your code though, for learning purposes :)
Posted on 2003-05-13 03:00:06 by Delight
Delight:

A few years ago, I prepared a fixed-point math library for MASM32 using only the CPU for all computations. It is limited to values not exceeding +/-32767 and the stated accuracy is 5 significant digits; this is good enough for any graphics work. It is not intended for high precision scientific purposes.

The library is based on using the lower 16 bits of the 32-bit integer for the fractional part, the next 15 bits for the integer portion, and the most significant bit for the sign. The whole is treated as a signed long integer.
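
A layout like this is commonly called 16.16 fixed point. The C sketch below illustrates the idea only; the type and function names are hypothetical and are not Mixlib's actual interface.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;              /* 16 integer bits . 16 fraction bits */
#define FIXED_ONE ((fixed_t)1 << 16)  /* the value 1.0 */

/* Converts a real value to fixed point, truncating toward zero. */
fixed_t fixed_from_double(double v)
{
    assert(v > -32768.0 && v < 32768.0);
    return (fixed_t)(v * FIXED_ONE);
}

double fixed_to_double(fixed_t f)
{
    return (double)f / FIXED_ONE;
}

/* Multiplies two fixed-point values; the product is widened to 64 bits
 * so the intermediate result cannot overflow before rescaling. */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)((int64_t)a * b / FIXED_ONE);
}
```

Here 1.5 is stored as 00018000h, and multiplying it by 2.0 (00020000h) gives 00030000h, i.e. 3.0, using integer instructions only.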

Because all maths are done with integers on the CPU, some of the functions are faster than performing similar computations with floating point variables on the FPU. For example, computing a power is about 3 times faster with the CPU on average.

Included in the attached FPUvsCPU.zip file is a short dialog box program comparing the speed on the FPU and CPU performing the calculation of your original post, i.e.

X = 255*(I/255)^(G/255)

Each calculation is performed 100,000 times before displaying the time (in millisec) required and the result of the computation for each. Edit controls with up/down controls let you vary the values of I and G.

The source code and resource file are also included with the EXE in the .zip file. If you are interested, the Mixlib package is available in the next post (I haven't yet learned how to attach more than one file to a post).

Have fun

Raymond
Posted on 2003-05-13 20:49:42 by Raymond
As mentioned in the previous post, the latest Mixlib package is attached.

Raymond
Posted on 2003-05-13 20:52:44 by Raymond
That's great :alright: Thank you Raymond!
Posted on 2003-05-14 14:09:59 by Delight