hex        decimal          exponential decimal     exponential binary
3D800000   1,031,798,784    .1031798784 x 10^10     .1111011 x 2^31

[0][10011111][1111011]
sign bit, exponent, mantissa
In other words, is 3D800000 equal to [0][10011111][1111011] in binary form using the IEEE standard?
Did I do this right? Thx

NO.
The IEEE standard you are referring to is for the representation of floating-point real numbers. There are three types covered for the PC, having various names depending on your programming language:
float, single-precision, REAL4, 32-bit, dword
long, double-precision, REAL8, 64-bit, qword
longlong, extended-double-precision, REAL10, 80-bit, tbyte
If you want to know more about the "format" of those types, you may want to look at the following:
http://www.ray.masmcode.com/tutorial/fpuchap2.htm#floats
Sorry, but I still think that answer is incorrect. I keep getting
0 10011110 11110110000000000000000
I have no idea how I am doing this incorrectly....
Try it this way
3 D 8 0 0 0 0 0
0011 1101 1000 0000 0000 0000 0000 0000
then rearrange in fields for a 32-bit float
0 01111011 00000000000000000000000
The first bit = 0, thus the number is positive.
The exponent is 7Bh - 7Fh = -4 (7Fh is the bias)
The mantissa is 1.00000000000000000000000 (the 1 is implied)
If 3D800000h represents a 32-bit float in IEEE format,
it would thus represent 1.000...*2^{-4} (or exactly 1/16 = 0.0625 in the decimal system).
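The same field split can be checked mechanically. Below is a minimal Python sketch, assuming the 32-bit pattern is given as an integer; the function name decode_float32_fields is made up for this example, and the struct cross-check simply asks the machine to interpret the same four bytes as a float.

```python
import struct

def decode_float32_fields(bits: int):
    """Split a 32-bit pattern into the IEEE single-precision fields."""
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 7Fh (127)
    fraction = bits & 0x7FFFFF       # 23 bits; the implied leading 1 is not stored
    return sign, exponent, fraction

sign, exponent, fraction = decode_float32_fields(0x3D800000)
print(sign, hex(exponent), exponent - 127, fraction)   # 0 0x7b -4 0

# Cross-check: reinterpret the same four bytes as a float.
value = struct.unpack(">f", struct.pack(">I", 0x3D800000))[0]
print(value)   # 0.0625
```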
I just don't see how that is equivalent to 1,031,798,784 in decimal, because 1.0 x 2^-4 == (1/16), which is what you said? What am I missing here??
7Bh - 7Fh = -4 (7Fh is the bias)
I thought that to get the computer-equivalent form of the exponent you add the excess notation, which is 128? I don't see how you get -4..
I think our teacher taught us wrong....who woulda known he worked at Bell Labs...sigh
But can you please explain how 1/16 == 1,031,798,784?
What you are refusing to understand is that the IEEE format for representing floating points is based on interpreting bit fields within a given set of bytes. That set of bytes has no relation to its equivalent integer values.
Take another look in the given link at how the value of -211/8 (i.e. -26.375) gets represented as C1D30000h in the IEEE format.
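To see that the integer reading and the float reading of the same bytes are unrelated, here is a small Python sketch, assuming big-endian hex text as written in this thread. The helper name as_uint32_and_float is invented for the example. Note that the sign bit of C1D30000h is set, so it decodes to a negative value.

```python
import struct

def as_uint32_and_float(hex_text: str):
    """Interpret the same four bytes once as an unsigned integer, once as a float."""
    raw = bytes.fromhex(hex_text)
    as_int = struct.unpack(">I", raw)[0]     # plain 32-bit unsigned integer
    as_float = struct.unpack(">f", raw)[0]   # IEEE single-precision bit fields
    return as_int, as_float

print(as_uint32_and_float("3D800000"))   # (1031798784, 0.0625)
print(as_uint32_and_float("C1D30000")[1])   # -26.375
```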
Another exercise you may want to do based on the info in the given link is to represent the value of 0.0625 in double-precision (64-bit) and extended-double-precision (80-bit) IEEE format. You should get the following:
3FB0000000000000 ( = 0.0625 when interpreted as a 64-bit double-precision IEEE float)
3FFB8000000000000000 ( = 0.0625 when interpreted as an 80-bit extended-double-precision IEEE float)
You may also want to play with the attached program. It was designed to test the various functions of a library for floating point computations (available from the same site as the above link) but made available for public use. Read the included text file for guidance on how to use the tester.
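If you do not have the tester at hand, the two encodings of 0.0625 listed above can be reproduced with a short Python sketch. The 64-bit case uses struct directly. Python has no portable 80-bit float type, so the 80-bit pattern is assembled by hand here, assuming the usual x87 layout (1 sign bit, 15 exponent bits with bias 3FFFh, and a 64-bit significand whose top bit is the explicit integer bit). Both function names are made up for this example.

```python
import struct

def double_hex(x: float) -> str:
    """Hex of the 64-bit double-precision encoding, most significant byte first."""
    return struct.pack(">d", x).hex().upper()

def extended_hex(sign: int, unbiased_exp: int, significand64: int) -> str:
    """Assemble an 80-bit extended pattern by hand (assumed x87 layout)."""
    sign_exp = (sign << 15) | (unbiased_exp + 16383)   # bias is 3FFFh = 16383
    return f"{sign_exp:04X}{significand64:016X}"

print(double_hex(0.0625))             # 3FB0000000000000
print(extended_hex(0, -4, 1 << 63))   # 3FFB8000000000000000
```

In the 80-bit format the leading 1 is stored explicitly, which is why the significand starts with 8 instead of being all zeros.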
OK, well, I have 2 more questions, then I will let you be :)
1.) For the number I was originally trying to obtain in exponential form, the mantissa is all 0 (besides the implied 1). Does this mean it's impossible to normalize this number?
2.) What you are refusing to understand is that the IEEE format for representing floating points is based on interpreting bit fields within a given set of bytes. That set of bytes has no relation to its equivalent integer values.
^
Why is this form even used if we cannot get the value we want?!
Does this mean it's impossible to normalize this number?
If you had continued reading the given link, there is a section describing denormalized REAL numbers. Those are the ones where the exponent field is effectively 0 (i.e. ALL its bits are 0). If any of those bits is a 1, the number is NOT denormalized even if all the bits of the "mantissa" are 0's. It would already be a normal REAL number equal to 1.0000...*2^{exp}, whatever the exponent field would yield.
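As a rough illustration of that rule, the sketch below classifies a 32-bit pattern by its exponent field alone. The function name classify_float32 is hypothetical. The all-ones exponent case (infinity and NaN) is not discussed above; it is included only so the function covers every pattern.

```python
def classify_float32(bits: int) -> str:
    """Classify a 32-bit IEEE single-precision pattern by its exponent field."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        # ALL exponent bits are 0: zero or a denormalized number.
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        # ALL exponent bits are 1: reserved for infinity and NaN.
        return "infinity" if fraction == 0 else "NaN"
    # Any other exponent: a normal number, even if the fraction is all zeros.
    return "normal"

print(classify_float32(0x3D800000))   # normal
print(classify_float32(0x00000001))   # denormalized
```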
why is this form even used if we cannot get the value were wanting ?!
The integer form you are familiar with does not lend itself to all those small fractional numbers between 0 and 1, nor to all those intermediate fractional values between integers. And, even with 32-bit integers, you are limited to signed values barely exceeding +/-2,000,000,000.
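A quick way to see the difference in range is the Python sketch below. It assumes nothing beyond the standard struct module; roundtrip_float32 is a made-up helper that stores a value as a 32-bit float and reads it back.

```python
import struct

INT32_MAX = 2**31 - 1
print(INT32_MAX)   # 2147483647, the largest signed 32-bit integer

def roundtrip_float32(x: float) -> float:
    """Store x as a 32-bit IEEE float and read it back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(roundtrip_float32(0.0625))   # 0.0625, a fraction no integer format can hold
print(roundtrip_float32(3.0e9))    # 3000000000.0, already beyond INT32_MAX
```

The price is precision: a 32-bit float only carries 24 significant bits, so not every large value survives the round trip exactly the way these two happen to.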
In the early days of computing, companies attempted to circumvent that by each creating their own internal data formats which were all different. Eventually, the IEEE developed a standard format which became universally adopted. Once you learn the format, you can get the value you want.
IEEE Floating point by example, convert human readable (base 10) number to HEX(binary) (computer readable) IEEE and back
I'll take your number: 0.0625
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF - sign, exponent, fraction
1.00000000000000000000000 (***) - implied bit
we will remove floating/virtual point out of our way:
100000000000000000000000. * 2^-23
100000000000000000000000. <- virtual point is here
0.0625 = 625 * 10^-4 = 1001110001. * 10^-4
We want to transform our number to be as close as possible to 2^25 (and without decimal exponent)
shift out the leading zeros and decrease binary exponent:
100111000100000000000000. * 10^-4 * 2^-14
100111000100000000000000. * 10^-4 * 2^-14 // divide by 10, increase decimal exponent by 1
000011111010000000000000. * 10^-3 * 2^-14 // shift to 24 bits - left by 4, decrease binary exponent by 4
111110100000000000000000. * 10^-3 * 2^-18 // divide by 10
000110010000000000000000. * 10^-2 * 2^-18 // << 3, decrease binary exponent by 3
110010000000000000000000. * 10^-2 * 2^-21 // divide by 10
000101000000000000000000. * 10^-1 * 2^-21 // << 3
101000000000000000000000. * 10^-1 * 2^-24 // divide by 10
000100000000000000000000. * {10^0} * 2^-24 // << 3
100000000000000000000000. * {10^0} * 2^-27
at any point you decide to multiply these numbers you will get 0.0625
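The shift-and-divide loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name decimal_to_scaled_binary is invented, it handles only positive digit strings with a decimal exponent of zero or less, and the division by 10 truncates, so for inputs other than this one it may lose low-order bits.

```python
def decimal_to_scaled_binary(digits: int, dec_exp: int, width: int = 24):
    """Return (m, bin_exp) with digits * 10**dec_exp ~= m * 2**bin_exp,
    where m is normalized to exactly `width` bits."""
    if digits <= 0 or dec_exp > 0 or digits.bit_length() > width:
        raise ValueError("sketch only handles 0 < digits < 2**width and dec_exp <= 0")
    m, bin_exp = digits, 0
    while True:
        # Shift out the leading zeros and decrease the binary exponent.
        shift = width - m.bit_length()
        m <<= shift
        bin_exp -= shift
        if dec_exp == 0:
            return m, bin_exp
        # Divide by 10 (truncating) and increase the decimal exponent by 1.
        m //= 10
        dec_exp += 1

m, bin_exp = decimal_to_scaled_binary(625, -4)
print(f"{m:024b}. * 2^{bin_exp}")   # 100000000000000000000000. * 2^-27
```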
so we have:
100000000000000000000000. * {10^0} * 2^-27
remember where virtual point is in IEEE? lets shift it there (***)
1.00000000000000000000000 * 2^-4 // add 23 to binary exponent
next we cut off the implied bit and store 23 bits [00000000000000000000000]
next we add 127 (bias) to -4 and store the exponent: -4 + 127 = 123 = [01111011]
was the number negative? no - store {0}
finally we have
{0}[01111011][00000000000000000000000]
00111101100000000000000000000000 = 3D800000
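The packing steps (move the point, drop the implied bit, add the bias, store the sign) look like this as a Python sketch. The function name pack_float32 is hypothetical, and it assumes the mantissa has already been normalized to exactly 24 bits and that the result is a normal number.

```python
def pack_float32(negative: bool, m: int, bin_exp: int) -> int:
    """Pack (m * 2**bin_exp), with m a 24-bit normalized integer, into float32 bits."""
    if m.bit_length() != 24:
        raise ValueError("mantissa must be normalized to exactly 24 bits")
    unbiased = bin_exp + 23       # put the virtual point right after the leading 1
    stored_exp = unbiased + 127   # add the bias
    if not 1 <= stored_exp <= 254:
        raise ValueError("sketch only handles normal numbers")
    fraction = m & 0x7FFFFF       # cut off the implied bit, keep 23 bits
    return (int(negative) << 31) | (stored_exp << 23) | fraction

bits = pack_float32(False, 1 << 23, -27)
print(f"{bits:032b} = {bits:08X}")   # 00111101100000000000000000000000 = 3D800000
```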
--- lets go back ---
00111101100000000000000000000000
{0}[01111011][00000000000000000000000]
E=123, subtract 127 = -4
00000000000000000000000, add implied 1
1.00000000000000000000000 * 2^-4
shift the virtual point to right 23
100000000000000000000000. * 2^-27 // make room for multiplication by 10. 10 has 4 bits, so shift right by 4
Why? RULE: if X has N bits and Y has M bits then result of X*Y will have at most N+M bits
We always want our number as close as possible to 2^25 to preserve as many bits as possible.
so:
000010000000000000000000. * 2^-23 * 10^0 // multiply by 10
010100000000000000000000. * 2^-23 * 10^-1 // >> 3
000010100000000000000000. * 2^-20 * 10^-1 // * 10
011001000000000000000000. * 2^-20 * 10^-2 // >> 3
000011001000000000000000. * 2^-17 * 10^-2 // * 10
011111010000000000000000. * 2^-17 * 10^-3 // >> 3
000011111010000000000000. * 2^-14 * 10^-3 // * 10
100111000100000000000000. * 2^-14 * 10^-4
you of course do first shift out the zero bits but i left it to be similar to the first part
so shift out the zero bits:
1001110001[00000000000000]. * 2^-14 * 10^-4
1001110001. * 10^-4
no more binary exponent, we are done.
no more binary exponent, we are done.
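For comparison, here is the way back as a Python sketch. It is not a literal copy of the steps above: Python integers grow as needed, so the "make room" shift is unnecessary and the function shifts out the zero bits first, then multiplies by 10 until the binary exponent is gone. The name scaled_binary_to_decimal is made up, and only positive values with a binary exponent of zero or less are handled.

```python
def scaled_binary_to_decimal(m: int, bin_exp: int):
    """Return (digits, dec_exp) with m * 2**bin_exp == digits * 10**dec_exp exactly."""
    if m <= 0 or bin_exp > 0:
        raise ValueError("sketch only handles m > 0 and bin_exp <= 0")
    dec_exp = 0
    while True:
        # Shift out the zero bits, raising the binary exponent toward 0.
        while bin_exp < 0 and m % 2 == 0:
            m >>= 1
            bin_exp += 1
        if bin_exp == 0:
            return m, dec_exp   # no more binary exponent, we are done
        # Multiply by 10 and decrease the decimal exponent to compensate.
        m *= 10
        dec_exp -= 1

digits, dec_exp = scaled_binary_to_decimal(1 << 23, -27)
print(f"{digits} * 10^{dec_exp}")   # 625 * 10^-4
```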
HTH,
drizz