Hi

I have just started to make a small raytracer as a tutorial for 3D; diffuse Phong shading works OK.

However, when dealing with specular highlights I need a fast way to calculate pow(x,y), i.e. x raised to the power y (x^y), for those of us who did not learn this in an English-language school ...

To my surprise, the FPU lacks such an instruction, hmmm ....

I understand I could do it via e^x and a logarithm... but this looks damn slow and unclear :( to me...

So, any examples (in asm)? And/or optimizations?

BTW, I want to make it a realtime raytracer, so speed is of the essence; however, there will be some source code without optimizations for the tutorial ... eh, also I am using TASM as always ...

One last thing: I think I can make do with real4 (single-precision float)...

Thanks all

Bogdan

Hehe

Thanks man :) !!

One last thing: there was once a link around here to a library with source that claimed it could do real4 floating-point operations and functions in software much faster than the FPU... ?

http://www.bmath.net/bmath/index.html :)

Fixed point will be faster because we can make more assumptions, but it is harder to implement at that performance level, imho. MMX/SSE should be used as well because it is over twice as fast on some algorithms.

hi,

here is the way, using ln:

y^x = e^(x * ln y)

and that will give you high speed calculations

try it

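I have added some math macros here:

```
; st to the power st(1): pow(st, st(1)) = 2^(st(1) * log2(st))

fPow2 MACRO ; 2^st, 98 clocks
    sub esp,16                ; reserve 16 bytes of scratch space
    fist dword ptr [esp+12]   ; n = round(st), stored as a 32-bit integer
    fld1
    fstp tbyte ptr [esp]      ; store 1.0 as real10; its exponent word is at [esp+8]
    fisub dword ptr [esp+12]  ; st = st - n (fraction, inside the F2XM1 range)
    mov eax,[esp+12]
    add [esp+8],eax           ; add n to the exponent: the stored real10 is now 2^n
    f2xm1                     ; 2^fraction - 1
    fld1
    fadd                      ; 2^fraction
    fld tbyte ptr [esp]       ; load 2^n
    fmul                      ; 2^n * 2^fraction = 2^st
    add esp,16                ; release the scratch space
EndM

fPow MACRO ; st^st(1), 200 clocks
    fyl2x                     ; st(1) * log2(st)
    fPow2
EndM
```

This code is not mine; I took it from a math include file.

amr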

Thanks all

I see some TBYTE references, so I guess the above optimizations (looks like Agner's) are for real10?

Do they work for real4 also...?

Still, I wonder if real4 will be OK for a realtime raytracer ....

Tested those 2 versions below:

```
;-------------------------------
; calculates Y^X
; st=Y,st(1)=X
;-------------------------------
;fyl2x ;x*log2Y
;f2xm1 ;Y^x-1
;fld1 ;1,Y^x-1
;faddp ;Y^x
```

and this (bitrake):

```
fldl2e
fmul           ; A
fld st         ; A A
frndint        ;*B A
fld1           ; 1 B A
fscale         ;*C B A
fxch st(2)     ; A B C
fsubp st(1),st ; D C
f2xm1          ; E C
fmul st,st(1)  ; F C
fadd           ; G
```

the only code above those macros is:

```
.data
specular_factor real4 0.2
specular_exponent real4 2.01

.code
; .....
fld [specular_exponent] ;X
fld [reflect_dot_p] ;Y
```

But I still get negative results even if both specular_exponent and reflect_dot_p are positive.

AFAIK reflect_dot_p varies within [0.0, 1.0]

I must be doing something wrong... but how do I get negative values for Y^X when both values are positive?

:stupid:

That thread is mainly about e^x :)

F2XM1 is limited to range -1 to +1.

Yeah, but this is y^x?

And still I get negative values (also e>0 afaik ;) )

```
fyl2x ;x*log2Y
f2xm1 ;Y^x-1
fld1 ;1,Y^x-1
faddp ;Y^x
```

```
fld fpc(<0.5>)
fld fpc(<0.5>)
fyl2x ;x*log2Y
f2xm1 ;Y^x-1
fld1 ;1,Y^x-1
faddp st(1), st ;Y^x
```

Result = 0.7071067811865475728 okay
The other macro doesn't work. :(

Should Y, X have any special normalized ranges?

I mean, I am sure the dot_product is in the 0..1 range, but the specular_exponent can be 25.67 :) or 0.1

...

F2XM1 is limited to range -1 to +1. Therefore, ABS(x*log2Y) <= 1.