Hi,

Here is my method for the exponential function e^x.

It is a mix of 2 fundamentals:

- a Taylor series run for 10 steps on a fractional argument, -1.0 < x < +1.0

- a couple of basic rules of logarithms.

Explanation.

Say we want to calculate e^20.3. Well,

`20.3 as a float double is 40344CCCCCCCCCCDh hexadecimal`

We convert the exponent to base 2 by multiplying it by lg2(e), so that e^20.3 = 2^(20.3 * lg2(e)):

`20.3 x 1.4426950408889632824453128103079f`

The result, 20.3 in base 2, is 403D4965C85C0166h (about 29.2867).

We need it later: its integer part tells us how far to shift our partial result to the left.
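As a rough sketch of this first step (Python here only for readability, not the assembly routine itself), the rescaling and the hex view of a double can be reproduced as below. The names `LOG2_E`, `double_to_hex` and `to_base2_exponent` are made up for the illustration.

```python
import struct

# lg2(e) rounded to double precision (hypothetical constant name)
LOG2_E = 1.4426950408889634


def double_to_hex(value):
    """Return the IEEE 754 binary64 bit pattern of value as 16 hex digits."""
    return struct.pack(">d", value).hex().upper()


def to_base2_exponent(x):
    """Rescale the exponent so that e**x == 2**(x * lg2(e))."""
    return x * LOG2_E
```

For example, `double_to_hex(20.3)` gives the bit pattern quoted above, and `to_base2_exponent(20.3)` lands near 29.2867.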

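Now, from the basic rule of powers we know that

`n^20.3 can be rewritten as n^20 * n^0.3`

So we round 20.3 base2 down and split it into an integer part and a fractional remainder. I keep calling them "20.0" and "0.3" below, but after the base-2 conversion the actual values are 29.0 and about 0.2867:

`20.0 and 0.3 float double remainder`

20.0 = 403D000000000000h (29.0)

0.3 = 3FD2597217005980h (about 0.2867)

We store the remainder 0.3 now for later use.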
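The split can be sketched like this, assuming a non-negative input as in the post. `split_exponent` is a hypothetical name, and rounding down with `math.floor` is one possible choice that keeps the remainder in 0 <= r < 1.

```python
import math


def split_exponent(y):
    """Split y into (integer part s, fractional remainder r), 0 <= r < 1."""
    s = math.floor(y)  # round down: 29.2867... -> 29
    r = y - s          # what is left over: about 0.2867
    return s, r
```

Since 2^y = 2^s * 2^r, the integer part can be handled by a shift and only the small remainder has to go through the series.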

Then we convert the integer part, 20.0 base2 (29.0 as a double), to an integer value s:

20.0 float double -> integer, s

After the conversion

s = 1Dh (29 decimal)

We shift 1 to the left by s bits

`i = 1 << s`

thus i = 1 * 2^29 = 20000000h (536'870'912 decimal).

3 additional checks are needed here: avoiding the overflow at 63 shifts, handling the case s > 63, and checking whether s pushes i out of the float double capacity.

We convert i back to float for later use:

`20000000h integer = 41C0000000000000h float double`
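A minimal sketch of the shift plus the range checks, under some assumptions that are not in the original post: the integer is treated as a signed 64-bit value (so the plain shift is used only for s < 63), and for larger s the power of two is built by writing the biased exponent straight into the double. `pow2_double` is a hypothetical helper name.

```python
import struct


def pow2_double(s):
    """Return 2**s as a double for an integer s >= 0 (hypothetical helper)."""
    if s < 0:
        raise ValueError("this sketch only handles s >= 0")
    if s > 1023:
        # 2**1024 is past the largest finite double
        return float("inf")
    if s < 63:
        # 1 << s still fits a signed 64-bit integer: shift, then convert
        return float(1 << s)
    # Assumption: for 63 <= s <= 1023 skip the integer and put the
    # biased exponent (s + 1023) into bits 52..62 of the double
    bits = (s + 1023) << 52
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

With s = 29 this returns 536870912.0, the same value as the integer-to-double conversion shown above.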

We convert the fractional part we stored above from base 2 to base e, because we want to use the Taylor series of e^x. I chose 10 steps max, and with -1.0 < fraction < +1.0 that works well enough.

NOTE: you can extend the working range of the Taylor series up to +-8 by stepping it ~20 or more times.

Recall the Taylor series of e^x:

`e^x = 1 + x + x^2 / 2! + x^3 / 3! + ... + x^n / n!`
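A small sketch of such a series routine, assuming "10 steps" means the terms up to x^10 / 10!. Each term is built from the previous one, so no factorial or power is computed from scratch. `exp_taylor` is a name chosen for the illustration.

```python
def exp_taylor(x, steps=10):
    """Approximate e**x with the Taylor series truncated after x**steps / steps!."""
    total = 1.0
    term = 1.0
    for n in range(1, steps + 1):
        term *= x / n  # x**n / n! obtained from x**(n-1) / (n-1)!
        total += term
    return total
```

For arguments up to ln(2), which is all the remainder can produce, the first dropped term is below 1e-9, so 10 steps are plenty for many uses but not quite full double precision at the top of the range.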

Now, according to the basic rules of logarithms:

`a) if e^x = 2^q`

b) and generally, lgBASE(x^n) = n * lgBASE(x)

then, taking ln of both sides, we can extend a) this way:

x * ln(e) = q * ln(2)

c) and thus, since ln(e) = 1, x = q * ln(2) makes a) a true identity.

Now, because ln(2) = lg10(2) / lg10(e)

lg10(2) = 0.30102999566398119521373889472449f

lg10(e) = 0.43429448190325179004808384911378f

ln(2) = 0.69314718055994536943283387715543f

we apply c)

x = q * ln(2) where q = r, our remainder

x = 0.3 base2 * 0.69314718055994536943283387715543f

x = 3FC9700ADD042628h baseE (about 0.1987)
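The identity in c) is easy to check numerically. The sketch below uses Python's `math` module only as a reference for the check (the assembly routine avoids library calls), and `base2_to_base_e` is a hypothetical name.

```python
import math

LN2 = math.log(2.0)  # same value as lg10(2) / lg10(e)


def base2_to_base_e(q):
    """Convert a base-2 exponent q into the x for which e**x == 2**q."""
    return q * LN2
```

For the remainder of about 0.2867 this gives x near 0.1987, and `math.exp(x)` agrees with `2.0 ** 0.2867` to within rounding.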

We give this x to the Taylor expansion routine and get back

`3FF3848660139D52h (about 1.2199) as result of exp() on the remainder 0.3`

Finally, by the rule of powers above, we multiply

`41C0000000000000h * 3FF3848660139D52h`

n^20 base2 * n^0.3 base2 = n^20.3 base2

and get back

41C3848660139D52h, which corresponds exactly

to our decimal 654904512.1532385, that is e^20.3.
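Putting the steps together, a compact sketch of the whole scheme could look like the following. It is written in Python for readability and is not the assembly from the site; `exp_sketch` and `exp_taylor` are hypothetical names, and the constants are taken from `math.log` for brevity.

```python
import math

LN2 = math.log(2.0)
LOG2_E = 1.0 / LN2


def exp_taylor(x, steps=10):
    """Taylor series of e**x truncated after x**steps / steps!."""
    total = term = 1.0
    for n in range(1, steps + 1):
        term *= x / n
        total += term
    return total


def exp_sketch(x):
    """e**x for x >= 0 via 2**s * e**(r * ln 2), with s + r = x * lg2(e)."""
    if x < 0.0:
        raise ValueError("positive inputs only, as in the post")
    y = x * LOG2_E           # exponent rescaled to base 2
    s = math.floor(y)        # integer part -> shift count
    r = y - s                # remainder, 0 <= r < 1
    if s > 1023:
        return math.inf      # out of the double range
    scale = float(1 << s)    # i = 1 << s, then integer -> double
    return scale * exp_taylor(r * LN2)
```

`exp_sketch(20.3)` comes out near 654904512.15, matching `math.exp(20.3)` to roughly nine significant digits or better.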

The resulting assembly code can be found on my website at

http://sites.google.com/site/x64lab/home/reloaded-algorithms/my-exp-function-jexp

It is ~40 lines of code (my Taylor exp() code + the main routine).

It accepts only positive numbers for now and makes no exhaustive checks on the floats. Those checks and the negative values I leave to the reader's creativity (e^-x being essentially 1 / e^x).
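One way to add negative inputs on top of a positive-only routine is a thin wrapper that uses e^-x = 1 / e^x. In this sketch `exp_signed` is a hypothetical name and `math.exp` is only a stand-in for the positive-only routine.

```python
import math


def exp_signed(x, exp_positive=math.exp):
    """Extend a routine that only handles x >= 0 to negative x."""
    if x < 0.0:
        # e**(-x) == 1 / e**x, so evaluate on |x| and take the reciprocal
        return 1.0 / exp_positive(-x)
    return exp_positive(x)
```

The reciprocal costs one extra division. For very negative x, `exp_positive(-x)` may overflow to infinity or raise, depending on the routine passed in, so that case still needs its own check.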

This is the fastest method I know. It should take ~80 cycles in total,

but I have not timed it yet. It's not so important.

Of some relevance to me was:

1) avoiding the Intel Approx Math library license

2) avoiding things like the cmath library

3) avoiding the 2 slow FPU instructions

FYL2X to compute y * log2(x)

F2XM1 to compute 2^x - 1

because 250/300 cycles for 2 instructions is pure insanity, 100%, especially in the tests I am running on huge RND-data outputs.

4) using the Taylor series the right way, because we would need a lot of steps to get the right values when plugging in a large x, as in the example e^20.3.
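To get a feel for point 4), this sketch counts how many series terms are needed before the next term stops mattering at a given relative tolerance. `terms_needed` and its default tolerance are assumptions made for the illustration.

```python
def terms_needed(x, rel_tol=1e-15, max_terms=1000):
    """Count Taylor terms of e**x (x >= 0) until a term drops below rel_tol * sum."""
    total = 1.0
    term = 1.0
    n = 0
    while n < max_terms:
        n += 1
        term *= x / n  # next term x**n / n!
        if term < rel_tol * total:
            return n
        total += term
    return max_terms
```

Comparing `terms_needed(20.3)` with `terms_needed(0.2)` shows why reducing the argument first pays off: the raw exponent needs several times as many steps as the small remainder.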

But the honest truth is that I am not yet ready for Chebyshev, simply because I need some time to understand more of his brilliant calculus. If you have simplified references about it, please share them.

I also do not have much time to write a full assembly math library. If someone is interested in contributing, the library is open source under the MPL license, but assembly is required, please. The library goes under the name

**"amrt"**, in the same way as my other one,

**"art"** (*a-ssembly-run-t-ime*). I will write/update it from time to time, only according to my needs.

Cheers,