I need someone with maths a little less rusty than mine to have a look at this simple algo.



MACRO
FLOAT8 MACRO name,value
.data
name REAL8 value
.code
ENDM

CODE
FLOAT8 source_num,10.00 ; source number
mov ecx, 5 ; power
dec ecx

fld source_num
@@:
fld source_num
fmul
dec ecx
jnz @B
fstp result

invoke FloatToStr2,result,ADDR buffer
invoke MessageBox,hWin,ADDR buffer,
SADD("Floating Point Exponent 10^5"),MB_OK

It seems to be producing the correct results but I would not be surprised if there is a cleaner and tidier way to do it.

Regards,

hutch@movsd.com
Posted on 2002-03-13 07:42:53 by hutch--
Sorry I'm nowhere near a win-box or my docs so it's pseudo-code only but I've used the following alorithm.

You keep a "double" figure & muliply that in when the appropriate bit is set. So you loose when power is small like 5, well actually 6 should be the break-even point.

You'll be doubling the multiplies per iteration but only do log2(power) iterations instead of 1*power of them. But as always big-O can hide garabge.



Algorithm goes like:
answer = 1
multiply = source_num
power = bit test high

top:
shr power, 1 ;test low bit
jc @f ;if not set don't multiply "multiply" in

answer *= mult ;If bit is set multiply it in
@@:
mult *= mult ;Double the multiplier
dec pow
jnz top


Huge amount of twiddling for one-off errors & instruction formats is needed but you can see the general idea.
Posted on 2002-03-13 15:43:15 by Mutant Slime
Finally got off my fat ass & coded something. This is the 1st time I've worked with the FPU so it may not be the most efficient but it works.


.DATA
num REAL8 -2.0 ;Powers of 2 JUST for debugging

.DATA?
res REAL8 ?
buff DB 256 DUP(?)

.CODE
Start:

MOV EAX, 5 ;Init power, should also test for range!
FLD1 ;Init result
FLD num ;Init multiplier

top:
SHR EAX, 1 ;Do we need this multiplier?
JNC @F
FMUL ST(1), ST(0) ;Put multiplier into result
@@:
FMUL ST(0), ST(0) ;Square the multiplier
TEST EAX, EAX ;Exit condition
JNZ top

FSTP res ;Get rid of multiplier
FSTP res ;Ta da!

invoke FloatToStr2, res, ADDR buff
invoke MessageBox, NULL, ADDR buff, ADDR buff, MB_OK

invoke ExitProcess, NULL

END Start
Posted on 2002-03-13 21:42:29 by Mutant Slime
Nice solution Mutant Slime - slight mod:
    CLC

top:
RCR EAX, 2 ;[b]ERROR[/b];Do we need this/next multiplier? :)

JNS @F ; top bit?
FMUL ST(1), ST(0) ;Put multiplier into result
@@:
FMUL ST(0), ST(0) ;Square the multiplier

JNC @F ; carry! :)
FMUL ST(1), ST(0) ;Put multiplier into result
@@:
AND EAX, 7FFFFFFFh ;clear top bit :) Exit condition
FMUL ST(0), ST(0) ;Square the multiplier
JNZ top
Edit: This doesn't work! RCR doesn't effect the sign flag. :(
Posted on 2002-03-13 22:11:10 by bitRAKE
I'll give here maybe not optimal from speed (though short), but dedactive
version of my code.
From my math point of view school explonation of what is power of
number is incorrect from algebra point of view.
It's illustrative of course from geometric point of view (esp. in square and cube
examples)
Usual school explonation in short is
x^n = x* x* ...x n times.
it gives us 2 well known notes that
X^0 = 1
X^1 = X

From point of math induction it is not correct definition if it gives such
special notes (exeptions ?).

So before my code I'll try to give my own defenition (you need
to excuse my English in such delicate issue as given math definitions
in not native language)
:
n power of X (X^n) is n(th) member (zerobased indexing) of recurrent sequence
wich (sequence) starts with 1 and next member of sequance equals previous multipyed
by X. ?
For example with X = 3 the recurrent sequance will be:


a[0]=1 ;3^0=1
a[1]=a[0]*3 ;3*1=3=3^1
a[2]=a[1]*3 ;3*3=9=3^2
...
a[n]=a[n-1]*3

any recurrent sequance is well thing for algo.
So from the above definition we can create algo wich handles x^0 or x^1 the same way as
any x^n


;ecx = n in x^n formula
test ecx,ecx
fld1
je write
@@: fmul x
dec ecx
jne @B
write: fstp result

as you can see x^0 will be 1 and x^1 will be X without needs of special cases.
Posted on 2002-03-13 23:02:39 by The Svin
Need to JUMP INTO the loop & you can elminate one fmul. Also this version will match naive version for powers 0 to 3 & then improvement kicks in.



MOV EAX, 7 ;Init power, should also test for range!
FLD1 ;Init result
FLD num ;Init multiplier
XOR EDX, EDX ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;DELETE ME
jmp first ;Eliminiate one FMUL by jumping into loop

top:
FMUL ST(0), ST(0) ;Square the multiplier
INC EDX ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;DELETE ME
first:
SHR EAX, 1 ;Do we need this multiplier?
JNC @F
FMUL ST(1), ST(0) ;Put multiplier into result
INC EDX ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;DELETE ME
@@:
TEST EAX, EAX ;Exit condition
JNZ top


Cool ideas bitRAKE & Svin let me look at them some more. I like where you're going with the loop unroll particularly.
Posted on 2002-03-13 23:32:23 by Mutant Slime
Of course an obvious note for fast code:
X^y*X^z=X^(y+z)
There is many ways to decrease number of fmuls, by
representing (and in sequence) n in X^n as n = a + b
for example n^4 = n^2*n^2 therefor

fld x
fmul st0,st0 ;X^2
fmul st0,st0 ;X^4
fstp result

and so on...
Posted on 2002-03-13 23:47:34 by The Svin
;eax = n in x^n formula

fld1
fld x
shr eax,1
jnc _2
_1: fmul st(1),st(0)
_2: shr eax,1 ; this does set Z flag
fmul st(0),st(0)
jc _1
jne _2
fcomp
fstp result
We reach infinity rather quickly - no need to unroll.
Posted on 2002-03-14 00:46:40 by bitRAKE
Just shoved the next version into a proc, input is REAL8 plus power as DWORD. Output is unpopped result in st(0). This one has a 3 instruction loop, does the x^1 or x^0 as seperate branches. It seems to be delivering the correct results.


; ########################################

exponent proc fpsource:REAL8,power:DWORD

mov ecx, power
dec ecx

fld fpsource

cmp ecx, 0
jl expout2
jz expout

; ==== loop code ====
@@:
fmul fpsource
dec ecx
jnz @B
; ===================

expout:
ret

expout2:
fstp fpsource ; balance stack
fld1 ; set constant

ret

exponent endp

; ########################################


Regards,

hutch@movsd.com
Posted on 2002-03-14 03:55:35 by hutch--
fsub st(0),st(0)

X^0 = 1
not 0.
Posted on 2002-03-14 04:41:45 by The Svin
Alex,

OK, its the exceptions I am not familiar with, I wrote this one so that,

x^0 = 0
x^1 = x
x^2 = x*x

etc ....

LATER : Have changed it as above.

Regards,

hutch@movsd.com
Posted on 2002-03-14 05:40:54 by hutch--
Try this instead:


mypow PROC public a:REAL4, b:REAL4
LOCAL cw:WORD
LOCAL cwtemp:WORD

fstcw [cw] ; Save current control word
mov ax, [cw]
or ax, 0C00h ; Set rounding control to chop
mov [cwtemp], ax
fldcw [cwtemp]

; c = exp(b*log(a));
fld [b]
fld [a]
fyl2x ; st(0) = b*(2 log(a))
fld st(0) ; Duplicate log
frndint ; Round to integer. st(1) = b*(2 log(a)), st(0) = (int)(b*(2 log(a)))
fld st(1) ; ...and duplicate log again
fsubr ; st(0) = (b*(2 log(a)) - b*((int)(2 log(a))) (fraction only)
f2xm1 ; st(0) = 2^st(0) - 1 (2^x float portion)
fld1 ; add 1 again (why did that bastard sub it? :)
fadd
fscale ; Calc the 2^x integer portion, and multiply with 2^x float portion, to get complete 2^x (based on 2^i * 2^f = 2^(i+f))
fstp st(1) ; Clean up stack (fscale doesn't pop the scale argument...)
fldcw [cw] ; Restore control word

ret
mypow ENDP


You should be able to move control word setup to start of your
program if you know what you'll be doing, but this was written as
a replacement for libc pow() for use in small apps, where you can't
always depend on the control word.

Can easily be changed to use REAL8... guess how ;)

edit: forgot to mention, this is scalis code.
Posted on 2002-03-14 06:42:21 by f0dder
Nice code fodder.
Posted on 2002-03-14 08:30:32 by dxantos
Steve, you don't need cmp ecx,0
dec ecx already sets all nessesary flags
js - for x^0
je - for x^1
Posted on 2002-03-14 10:19:39 by The Svin
Behold the power of logarithms!

Much better. Damn! I should have thought of that. Good catch f0dder & nice code Scalis.
Posted on 2002-03-14 10:51:39 by Mutant Slime
scali's algo can be sped up by using the following method for e^X
http://www.asmcommunity.net/board/index.php?topic=2979

Slow Instructions:
fscale
fstcw
fldcw
Posted on 2002-03-14 11:57:58 by bitRAKE
OK, scratch that from my todo list :grin: because you beat me to it by a few months. I'l have to look into this code too. Don't think I'll be able to tweak it more. :Sigh:

Thanks
Posted on 2002-03-14 12:15:35 by Mutant Slime
Algo was written for clarity & size - to get rid of pow() from libc
(since it was the last thing I needed to get rid of to use tinyfmod
in my project.) Can't really be bothered to rewrite it with your
faster code, but feel free to do so yourself :).
Posted on 2002-03-14 12:50:32 by f0dder
f00der,
the math logic used in your method is good when we expect that
power may be either fractional or big.
In other case I would use integer method.

Here is code I use if power maybe big:


;in st, st1:
; fld power
; fld x
fyl2x
fld1
fld st(1)
fprem
f2xm1
fadd
fscale
fxch st(1)
fstp st
Posted on 2002-03-23 01:16:33 by The Svin