hex      | decimal       | exponential decimal | exponential binary
---------------------------------------------------------------------------------
3D800000 | 1,031,798,784 | .1031798784 x 10^10 | .1111011 x 2^31


|
|
|

[0][10011111][1111011]
sign bit - exponent - mantissa

in other words is 3D800000 == to [0][10011111][1111011] in binary form using the IEEE standard
did i do this right? -thx
Posted on 2010-11-26 19:06:00 by dougfunny
"did i do this right? -thx"


NO.

The IEEE standard you are referring to is for the representation of floating point real numbers. There are three types covered for the PC, having various names depending on your programming language:
float, single-precision, REAL4, 32-bit, dword
double, double-precision, REAL8, 64-bit, qword
long double, extended-double-precision, REAL10, 80-bit, tbyte

If you want to know more about the "format" of those types, you may want to look at the following:
http://www.ray.masmcode.com/tutorial/fpuchap2.htm#floats
Posted on 2010-11-26 21:19:57 by Raymond
sorry but i still think that answer is incorrect. i keep getting
0 10011110 11110110000000000000000

i have no idea how i am incorrectly doing this....

Posted on 2010-11-27 19:14:05 by dougfunny
Try it this way

  3    D    8    0    0    0    0    0
0011 1101 1000 0000 0000 0000 0000 0000

then rearrange in fields for a 32-bit float

0 01111011 00000000000000000000000


The first bit = 0, thus the number is positive.
The exponent is 7Bh - 7Fh = -4    (7Fh is the bias)
The mantissa is 1.00000000000000000000000 (the 1 is implied)

If 3D800000h represents a 32-bit float in IEEE format,
it would thus represent 1.000...*2^-4 (or exactly 1/16 = 0.0625 in the decimal system).
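
The field split above can be checked with a few lines of Python. This is only a sketch: the function name `decode_float32_fields` is made up for illustration, and it assumes the bit pattern is a normal number (exponent field neither all 0s nor all 1s).

```python
def decode_float32_fields(bits):
    """Split a 32-bit pattern into IEEE single-precision fields (normal numbers only)."""
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 7Fh (127)
    fraction = bits & 0x7FFFFF        # 23 bits, the leading 1 is implied
    # value = (-1)^sign * 1.fraction * 2^(exponent - 127)
    significand = 1 + fraction / 2**23
    value = (-1) ** sign * significand * 2.0 ** (exponent - 127)
    return sign, exponent, fraction, value


sign, exponent, fraction, value = decode_float32_fields(0x3D800000)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
# 0 01111011 00000000000000000000000
print(exponent - 0x7F, value)
# -4 0.0625
```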
Posted on 2010-11-27 20:13:43 by Raymond
i just don't see how that is equivalent to 1,031,798,784 in decimal, because 1.0x2^-4 == (1/16), which is what you said? what am i missing here??

"7Bh - 7Fh = -4    (7Fh is the bias)"
i thought that to get the computer equivalent form of the exponent you add the excess notation, which is 128? i don't see how you get -4..

i think our teacher taught us wrong....who woulda known he worked at bell labs...sigh

but can you please explain how 1/16 == 1,031,798,784
Posted on 2010-11-27 20:25:54 by dougfunny
"what am i missing here??"


What you are refusing to understand is that the IEEE format for representing floating points is based on interpreting bit fields within a given set of bytes. That set of bytes has no relation to its equivalent integer values.

Take another look in the given link at how the value of -211/8 (i.e. -26.375) gets represented as C1D30000h in the IEEE format.
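
To see that the same four bytes give unrelated values depending on how they are interpreted, here is a small sketch using Python's standard `struct` module. The variable names are just for illustration, and the bytes are assumed to be written most significant byte first, as in the hex strings used in this thread.

```python
import struct

raw = bytes.fromhex("3D800000")

# Same 4 bytes, two interpretations.
as_integer = struct.unpack(">I", raw)[0]   # unsigned 32-bit integer
as_float = struct.unpack(">f", raw)[0]     # IEEE single-precision float

print(as_integer)  # 1031798784
print(as_float)    # 0.0625

# The example from the tutorial link: C1D30000h read as a 32-bit float.
neg = struct.unpack(">f", bytes.fromhex("C1D30000"))[0]
print(neg)         # -26.375
```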
Posted on 2010-11-27 20:56:05 by Raymond
Another exercise you may want to do based on the info in the given link is to represent the value of 0.0625 in double-precision (64 bit) and extended-double-precision (80 bit) IEEE format. You should get the following:

3FB0000000000000 ( = 0.0625 when interpreted as a 64-bit double precision IEEE float)
3FFB8000000000000000 ( = 0.0625 when interpreted as an 80-bit extended-double-precision IEEE float)
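
These two patterns can be cross-checked in Python. The 64-bit case uses the standard `struct` module directly. `struct` has no 80-bit format, so the second part is a hand-rolled sketch: `encode_extended80_power_of_two` is a hypothetical helper that only handles positive powers of two, assuming the x87 layout of 1 sign bit, a 15-bit exponent with bias 3FFFh, and a 64-bit significand whose integer bit is stored explicitly.

```python
import struct

# 64-bit double precision: let struct do the encoding.
double_hex = struct.pack(">d", 0.0625).hex().upper()
print(double_hex)  # 3FB0000000000000


def encode_extended80_power_of_two(exp):
    """Hypothetical helper: encode +2^exp as an 80-bit extended float (hex string)."""
    biased = exp + 0x3FFF          # 15-bit exponent field, bias 16383
    significand = 1 << 63          # explicit integer bit = 1, fraction = 0
    return format((biased << 64) | significand, "020X")


print(encode_extended80_power_of_two(-4))  # 3FFB8000000000000000
```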

You may also want to play with the attached program. It was designed to test the various functions of a library for floating point computations (available from the same site as the above link) but made available for public use. Read the included text file for guidance on how to use the tester.
Posted on 2010-11-27 23:58:48 by Raymond
ok well i have 2 more questions then i will let you be :)

1.) The number i was originally trying to obtain in exponential form: the mantissa is all 0 (besides the implied 1). Does this mean it's impossible to normalize this number?

2.) "What you are refusing to understand is that the IEEE format for representing floating points is based on interpreting bit fields within a given set of bytes. That set of bytes has no relation to its equivalent integer values."
why is this form even used if we cannot get the value we're wanting?!
Posted on 2010-11-28 09:57:36 by dougfunny
"Does this mean it's impossible to normalize this number?"


If you had continued reading the given link, there is a section describing denormalized REAL numbers. Those are the ones where the exponent field is effectively 0 (i.e. ALL its bits are 0). If any of those bits is a 1, the number is NOT denormalized even if all the bits of the "mantissa" are 0's. It would already be a normal REAL number equal to 1.0000...*2^exp, whatever the exponent field would yield.
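
As a rough sketch of that rule for the 32-bit format, the check only needs the exponent field. The function name `classify_float32` is made up here, and the zero, infinity and NaN cases are added for completeness; they are part of the IEEE format but go beyond what was said above.

```python
def classify_float32(bits):
    """Classify a 32-bit IEEE pattern by looking at its exponent field."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        # All exponent bits 0: zero or a denormalized number.
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        # All exponent bits 1: reserved for infinity and NaN.
        return "infinity" if fraction == 0 else "NaN"
    # Any other exponent: a normal number, even if the fraction is all 0s.
    return "normal"


print(classify_float32(0x3D800000))  # normal
print(classify_float32(0x00000001))  # denormalized
```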

"why is this form even used if we cannot get the value were wanting ?!"


The integer form you are familiar with does not lend itself to all those small fractional numbers between 0 and 1, nor to all those intermediate fractional values between integers. And, even with 32-bit integers, you are limited to signed values barely exceeding +/-2,000,000,000.

In the early days of computing, companies attempted to circumvent that by each creating their own internal data formats which were all different. Eventually, the IEEE developed a standard format which became universally adopted. Once you learn the format, you can get the value you want.
Posted on 2010-11-28 12:38:34 by Raymond
IEEE floating point by example: converting a human-readable (base 10) number to computer-readable hex (binary) IEEE form, and back

I'll take your number: 0.0625

S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF --- sign exponent fraction
        1.00000000000000000000000 (***) --- implied bit

first we get the floating (virtual) point out of our way by treating the 24 bits as an integer:

100000000000000000000000. * 2^-23
100000000000000000000000. <- virtual point is here

0.0625 =  625 * 10 ^ -4 = 1001110001. * 10^-4
We want to transform our number so that it stays as close as possible to 2^24 (i.e. it fills all 24 bits) and ends up without a decimal exponent

shift out the leading zeros and decrease binary exponent:
100111000100000000000000. * 10^-4 * 2^-14

100111000100000000000000. * 10^-4 * 2^-14 // divide by 10, increase decimal exponent by 1
000011111010000000000000. * 10^-3 * 2^-14 // shift to 24 bits! -- left by 4, decrease binary exponent by 4
111110100000000000000000. * 10^-3 * 2^-18 // divide by 10
000110010000000000000000. * 10^-2 * 2^-18 // << 3, decrease binary exponent by 3
110010000000000000000000. * 10^-2 * 2^-21 // divide by 10
000101000000000000000000. * 10^-1 * 2^-21 // << 3
101000000000000000000000. * 10^-1 * 2^-24 // divide by 10
000100000000000000000000. * {10^0}* 2^-24 // << 3
100000000000000000000000. * {10^0}* 2^-27

at any point you decide to multiply these numbers you will get 0.0625
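
The shift-and-divide steps above could be sketched in Python roughly like this. The name `scale_to_24_bits` is made up, and the sketch assumes a positive number with a negative decimal exponent, as in this example. Note that `//` truncates; every division happens to be exact for 0.0625, but in general some precision is lost at each step.

```python
def scale_to_24_bits(digits, dec_exp):
    """Turn digits * 10^dec_exp (dec_exp <= 0) into m * 2^bin_exp with m using 24 bits."""
    m = digits
    bin_exp = 0
    while dec_exp < 0:
        # Shift left until the number fills 24 bits, decreasing the binary exponent.
        shift = 24 - m.bit_length()
        m <<= shift
        bin_exp -= shift
        # Divide by 10 and increase the decimal exponent by 1.
        m //= 10
        dec_exp += 1
    # Final normalization back to 24 bits.
    shift = 24 - m.bit_length()
    m <<= shift
    bin_exp -= shift
    return m, bin_exp


m, bin_exp = scale_to_24_bits(625, -4)   # 0.0625 = 625 * 10^-4
print(format(m, "024b"), bin_exp)
# 100000000000000000000000 -27
```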

so we have:
100000000000000000000000. * {10^0}* 2^-27

remember where virtual point is in IEEE? lets shift it there (***)

1.00000000000000000000000         * 2^-4 // add 23 to binary exponent

next we cut off the implied bit and store 23 bits [00000000000000000000000]

next we add 127 (the bias) to -4 and store the exponent in 8 bits: -4+127 = 123 = [01111011]

was the number negative? no, so store {0}

finally we have

{0}[01111011][00000000000000000000000]

00111101100000000000000000000000 = 3D800000
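
The packing steps can be written out as a short Python sketch. `pack_float32` is a hypothetical helper that assumes its input is already a normalized 24-bit integer significand with its binary exponent (here 100000000000000000000000. * 2^-27); the last line compares the result against what the standard `struct` module produces for 0.0625.

```python
import struct


def pack_float32(sign, significand24, scaled_exponent):
    """Pack sign, a normalized 24-bit integer significand and its binary exponent."""
    exponent = scaled_exponent + 23        # move the virtual point behind the leading 1
    fraction = significand24 & 0x7FFFFF    # cut off the implied bit, keep 23 bits
    biased = exponent + 127                # add the bias
    return (sign << 31) | (biased << 23) | fraction


bits = pack_float32(0, 0x800000, -27)
print(format(bits, "032b"))  # 00111101100000000000000000000000
print(format(bits, "08X"))   # 3D800000
print(struct.pack(">I", bits) == struct.pack(">f", 0.0625))  # True
```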

------------ lets go back --------------

00111101100000000000000000000000


{0}[01111011][00000000000000000000000]

E=123, subtract 127 = -4

00000000000000000000000, add implied 1

1.00000000000000000000000 * 2^-4

shift the virtual point to right 23

100000000000000000000000. * 2^-27 // make room for multiplication by 10. 10 has 4 bits, so shift right by 4

Why? RULE: if X has N bits and Y has M bits then result of X*Y will have at most N+M bits

We always want our number as close as possible to 2^24 (filling the 24 bits) to preserve as many bits as possible.

so:

000010000000000000000000. * 2^-23 * 10^0  // multiply by 10
010100000000000000000000. * 2^-23 * 10^-1 // >> 3

000010100000000000000000. * 2^-20 * 10^-1 // * 10
011001000000000000000000. * 2^-20 * 10^-2 // >> 3

000011001000000000000000. * 2^-17 * 10^-2 // * 10
011111010000000000000000. * 2^-17 * 10^-3 // >> 3

000011111010000000000000. * 2^-14 * 10^-3 // * 10
100111000100000000000000. * 2^-14 * 10^-4

in practice you would of course shift out the zero bits first, but i left them in so the steps look similar to the first part

so shift out the zero bits:

1001110001[00000000000000]. * 2^-14 * 10^-4

1001110001. * 10^-4

no more binary exponent, we are done.
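
For comparison, here is a sketch of the way back in Python. It is not the 24-bit shift-and-multiply method shown above: it leans on Python's arbitrary-size integers and the identity 2^-n = 5^n / 10^n to get the exact decimal digits. The name `float32_bits_to_decimal` is made up, and only normal numbers are handled.

```python
def float32_bits_to_decimal(bits):
    """Exact decimal string for a normal 32-bit IEEE float given as a bit pattern."""
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127
    significand = (bits & 0x7FFFFF) | (1 << 23)   # add the implied 1: 24-bit integer
    scale = exponent - 23                         # value = significand * 2^scale
    if scale >= 0:
        digits = str(significand << scale)
    else:
        n = -scale
        whole = significand * 5**n                # value = whole / 10^n
        text = str(whole).zfill(n + 1)
        digits = (text[:-n] + "." + text[-n:].rstrip("0")).rstrip(".")
    return ("-" if sign else "") + digits


print(float32_bits_to_decimal(0x3D800000))  # 0.0625
print(float32_bits_to_decimal(0xC1D30000))  # -26.375
```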

HTH,
drizz


Posted on 2010-11-28 19:47:35 by drizz