Hi

I have just started writing a small raytracer as a 3D tutorial; diffuse Phong shading works OK.

However, when dealing with specular highlights I need a fast way to calculate pow(x,y), i.e. x raised to the power y (x^y), for those of us who did not learn math in English ...

To my surprise the FPU lacks such an instruction, hmmm ...

So I understand I could do it via e^x and a logarithm... but this looks damn slow and unclear to me :(

So, any examples (in asm) and/or optimizations?

BTW, I want to make it a realtime raytracer, so speed is of the essence; however, there will also be some unoptimized source code for the tutorial ... eh, and I am using TASM as always ...

One last thing: I think I can make do with real4 (single-precision float)...

Thanks all
Bogdan
Posted on 2002-12-30 01:09:57 by BogdanOntanu
Posted on 2002-12-30 01:25:08 by bitRAKE
Hehe

Thanks man :) !!

One last thing: there once was a link around here to a library, with source, that claimed to do real4 floating-point operations and functions in software much faster than the FPU... ?
Posted on 2002-12-30 01:31:05 by BogdanOntanu
http://www.bmath.net/bmath/index.html :)

Fixed point will be faster because we can make more assumptions, but it is harder to implement at that performance level, imho. MMX/SSE should be used as well because it is over twice as fast on some algorithms.
Posted on 2002-12-30 01:32:59 by bitRAKE
hi,
here is the way, using ln:

x^y = e^(y * ln x)

and that will give you high speed calculations
try it

I have added a math macro here:
``````
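;st to power st1			pow(st,st1) = 2^(st1*log2(st))
fPow2 MACRO ; 2^st, 98 clocks
sub esp,16
fist dword ptr [esp+12]		; n = round(st)
fld1
fstp tbyte ptr [esp]		; store 1.0 as real10
fisub dword ptr [esp+12]	; f = st - n
mov eax,[esp+12]
add [esp+8],eax			; add n to the real10 exponent: 1.0 -> 2^n
f2xm1				; 2^f - 1
fld1
fadd				; 2^f
fld tbyte ptr [esp]		; 2^n
add esp,16
fmul				; 2^f * 2^n
EndM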

fPow MACRO ; st^st(1), 200 clocks
fyl2x
fPow2
EndM

``````

this code is not mine; I took it from a math include file.
amr
Posted on 2002-12-30 01:35:09 by amr
Thanks all

I see a TBYTE reference, so I guess the above optimizations (they look like Agner's) are for real10?
Do they work for real4 also...?

Still, I wonder if real4 will be OK for a realtime raytracer ....
Posted on 2002-12-30 01:47:45 by BogdanOntanu
I tested the 2 versions below:

``````
;-------------------------------
; calculates Y^X
; st=Y,st(1)=X
;-------------------------------
;fyl2x				;x*log2Y
;f2xm1				;Y^x-1
;fld1				;1,Y^x-1
``````

and this one (bitRAKE's):

``````

fldl2e
fmul            ; A
fld st          ; A  A
frndint         ;*B  A
fld1            ; 1  B  A
fscale          ;*C  B  A
fxch st(2)      ; A  B  C
fsubp st(1),st  ; D  C
f2xm1           ; E  C
fmul st,st(1)   ; F  C
``````

the only code above those macros is:
``````
.data
specular_factor		real4	0.2
specular_exponent	real4	2.01
.code

; .....
fld	[specular_exponent]	;X
fld	[reflect_dot_p]		;Y
``````

But I still get negative results even though both specular_exponent and reflect_dot_p are positive.
AFAIK reflect_dot_p varies within [0.0, 1.0].

I must be doing something wrong... but how do I get negative values for Y^X when both values are positive?

:stupid:
Posted on 2002-12-30 04:09:39 by BogdanOntanu
F2XM1 is limited to an input range of -1 to +1.
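
A minimal sketch of the usual workaround, assuming the same stack layout as the test code in this thread (st = Y, st(1) = X) and Y > 0. Two things bite here: F2XM1 returns 2^z - 1, which is negative for any z < 0 (and X*log2(Y) is negative whenever Y < 1), and its argument must stay within -1 to +1. So split z into an integer part n and a fraction f, run F2XM1 on f only, add the 1 back, and let FSCALE apply 2^n. The stack comments list the top of the stack first. This is untested illustration code, not taken from any library.

```asm
; Y^X for Y > 0.  On entry: st = Y, st(1) = X.  On exit: st = Y^X.
; Assumption: default FPU rounding mode (round to nearest), so |f| <= 0.5.
fyl2x               ; z = X*log2(Y)           stack: z
fld   st(0)         ; duplicate z             stack: z, z
frndint             ; n = round(z)            stack: n, z
fsub  st(1),st      ; f = z - n               stack: n, f
fxch  st(1)         ;                         stack: f, n
f2xm1               ; 2^f - 1 (f is in range) stack: 2^f-1, n
fld1                ;                         stack: 1, 2^f-1, n
faddp st(1),st      ; 2^f                     stack: 2^f, n
fscale              ; 2^f * 2^n = Y^X         stack: Y^X, n
fstp  st(1)         ; drop n                  stack: Y^X
```

Y = 0 has to be handled separately (FYL2X of zero gives -infinity and the subtraction then produces a NaN), which matters if reflect_dot_p can actually reach 0.0.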
Posted on 2002-12-30 05:07:17 by bitRAKE
Yeah, but this is y^x?

``````
fyl2x				;x*log2Y
f2xm1				;Y^x-1
fld1				;1,Y^x-1

``````

and still I get negative values (also e>0 AFAIK ;) )
Posted on 2002-12-30 05:12:10 by BogdanOntanu
``````
fld	fpc(<0.5>)
fld	fpc(<0.5>)
fyl2x				;x*log2Y
f2xm1				;Y^x-1
fld1				;1,Y^x-1