I've just started using SSE but quickly ran into a problem... How can I "mov" the value of a 128-bit register to a string?
FireToCelc proc x:DWORD
; formula: (5*(x-32))/9
; or: (0.5555555555555556)*(x-32)
movss xmm0, x
movss xmm1, _32
subss xmm0, xmm1 ; xmm0 = x - 32
movss xmm1, _5o9
mulss xmm0, xmm1 ; xmm0 = (0.5555555555555556)*(x-32)
ret
FireToCelc endp
Is there some kind of TwordToAscii function?
C library (msvcrt.dll) has the sprintf function. Calling it with "%f" (or "%+.1f" if you want the temperature) format is the fastest method. Alternatively, you can write your own "float2ascii".
Using:
invoke wsprintf, addr buffer, addr lpFmt, xmm0
Gives me an error.... Do I need to first convert the 128bit register into a 32bit one?
not user32.dll's wsprintf, but msvcrt.dll's sprintf. wsprintf doesn't support floating point values.
sprintf is called exactly the same way as wsprintf, but you CAN'T supply an XMM register as the operand, because it's both 128-bit and packed. Store the result somewhere in memory and supply that value as the third parameter.
something like:
movss , xmm0
invoke sprintf, addr buffer, addr lpFmt, result
and make sure that the buffer is large enough to hold the string. Otherwise you may get 'buffer overrun' which may lead to Denial of Service. I always make the buffers 256-bytes long ^^"
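A minimal sketch of the store-then-call idea, assuming a MASM32 setup whose msvcrt.inc exposes the C runtime's sprintf as crt_sprintf (that is an assumption about the installed package, so check your include folder). One detail worth keeping in mind: "%f" reads a 64-bit double from the stack, so the sketch widens the 32-bit result to a REAL8 through the FPU before the call. The names FormatXmm0, tmp and result are made up for illustration.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

include    \masm32\include\msvcrt.inc    ; assumed to declare crt_sprintf
includelib \masm32\lib\msvcrt.lib

.data
lpFmt   db "%+.1f", 0
.data?
buffer  db 256 dup(?)                    ; generous output buffer
tmp     REAL4 ?                          ; 32-bit scratch copy of xmm0
result  REAL8 ?                          ; 64-bit value handed to "%f"

.code
; Assumes xmm0 already holds a scalar single-precision value
; (for example the result of FireToCelc).
FormatXmm0 proc
    movss   tmp, xmm0                    ; store the low 32 bits of xmm0
    fld     tmp                          ; load the single onto the FPU stack
    fstp    result                       ; store it back as a double
    invoke  crt_sprintf, addr buffer, addr lpFmt, result
    mov     eax, offset buffer           ; return a pointer to the text
    ret
FormatXmm0 endp
end
```

Going through the FPU for the widening keeps the sketch free of SSE2 instructions; on an SSE2 machine cvtss2sd would do the same job.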
Oh :P Still I don't know how to link to msvcrt.dll ... I searched through masm32's include and lib folders but couldn't find anything....
the include file isn't really necessary (at least in TASM ^^"). As for the lib: use something like "implib" to make LIBs from DLLs. implib produces LIBs for TASM.
That's why I use(d) TASM: much less trouble :P ;)
This might be useful if you opt to write your own float2ascii: > * <
Would it be something like this then?
FloatToAscii proc float:QWORD, lpOut:DWORD
LOCAL temp:DWORD, temp2:DWORD
.data
Milion dd 1000000
.code
; turn to truncation mode?
finit
fld float
fist temp
fsub temp
; turn to round-to-nearest-integer mode?
fmul Milion
fistp temp2
; now represent it as "temp . temp2"
ret
FloatToAscii endp
Store million as 1000000.0 (floating point value). finit initializes the FPU, so you should do finit first and change the rounding mode AFTER it. A good habit is to do one "finit" at the beginning of your program if you plan to use the FPU. No more finits are required (unless you're doing very funny things ^^").
1. enable truncation:
fstcw
or , 0300h
fldcw
2. enable rounding to nearest integer:
fstcw
and , NOT 0300h ; 1111110011111111 (0FCFFh)
fldcw
controlword is a 16-bit (word) variable
You can also write this proc using the SSE.
After this proc you have 2 integers, so use user32.dll's 'wsprintf' function like this:
invoke wsprintf, addr buffer, addr format, temp, temp2
1. "buffer" must be large enough to store the string
2. "format" must be "%d.%d" (note the decimal point between each %d. you can use comma instead of dot)
after all of this you can simply do
invoke MessageBox, 0, addr buffer, 0, 0
:)
Thanks a lot for your help :) Still I'm getting some weird results...
Here for example:
.data
_5o9 REAL4 0.5555555555555556f
.code
invoke FloatToAscii, _5o9, addr buffer
invoke MessageBox, 0, eax, 0, MB_OK ; gives me 1.555556 , almost right
But when I use it with my procedure....
FireToCelc proc x:DWORD
; formula: (0.5555555555555556)*(x-32)
movss xmm0, x
movss xmm1, _32
subss xmm0, xmm1 ; xmm0 = x - 32
movss xmm1, _5o9
mulss xmm0, xmm1 ; xmm0 = (0.5555555555555556)*(x-32)
movss result, xmm0
invoke FloatToAscii, result, addr buffer
ret
FireToCelc endp
FloatToAscii proc float:DWORD, lpOut:DWORD
LOCAL temp:DWORD, temp2:DWORD, cWord:WORD
.data
format db "%d.%d",0
Milion REAL4 1000000.0
.code
; turn to truncation mode
fstcw cWord
or cWord, 0300h
fldcw cWord
fld float
fist temp
fsub temp
; turn to round-to-nearest-integer mode
fstcw cWord
and cWord, not 0300h
fldcw cWord
fmul Milion
fistp temp2
invoke wsprintf, lpOut, addr format, temp, temp2
mov eax, lpOut
ret
FloatToAscii endp
No matter what value I enter, I always get -18.-2147483648....
oops ^^" it's not 0300h but 0C00h (and "NOT 0C00h" instead of "NOT 0300h") :) "0300h" and "NOT 0300h" switch between 64-bit precision and 32-bit precision, respectively.
and do "fisub temp" instead of "fsub temp".
now you should get 0.555556 from 0.555555555555555555555.
remember NOT to multiply by more than 1'000'000'000, otherwise the result won't fit in a 32-bit variable (fist/fistp store a signed integer, so the limit is 0x7FFFFFFF). if you want more than 9 places, then repeat the steps (fist->fisub->fmul) to obtain them.
as for the second bug: it must be somewhere inside the SSE function. confirm that the result is correct before calling FloatToAscii.
/edit
the SSE function is correct. 'FloatToAscii' should not end with mov eax, lpOut :) add "mov , temp" and "mov , temp2" or whatever. you're not returning/storing the results, so they get lost.
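To make the control-word fix concrete, here is a small sketch under stated assumptions: the rounding-control field of the FPU control word sits in bits 10 and 11 (mask 0C00h), and setting both selects truncation. The proc name TruncToInt and its locals are hypothetical, not taken from the thread. It saves the caller's control word and restores it afterwards, instead of assuming round-to-nearest was active.

```asm
.686
.model flat, stdcall
option casemap:none

.code
; Takes the bit pattern of a REAL4 in 'value' and returns its integral
; part (truncated toward zero) in eax.
TruncToInt proc value:DWORD
    LOCAL oldCW:WORD, newCW:WORD, temp:DWORD
    fstcw   oldCW               ; save the current control word
    mov     ax, oldCW
    or      ax, 0C00h           ; rounding control = 11b: truncate
    mov     newCW, ax
    fldcw   newCW
    fld     value               ; load the single-precision value
    fistp   temp                ; store as integer using the current mode
    fldcw   oldCW               ; put the original rounding mode back
    mov     eax, temp
    ret
TruncToInt endp
end
```

With this shape, -17.78 comes back as -17 and 0.5555556 as 0, which is what the "%d.%d" approach expects for the integral part.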
I don't see why FloatToAscii shouldn't return a pointer to the buffer it altered... I was checking functions like itoa and they do return a pointer to a buffer.... Since after the proc is called the buffer contains the final result I don't see the need to return the two dwords obtained through the process... Still I could be totally wrong and in that case please correct me ;)
I did some minor changes in the code... Now FloatToAscii seems to be working but I can't make it work with FireToCelc... Now it's always returning -17.777779 ...
.data
_32 REAL4 32.0f
_5o9 REAL4 0.5555555555555556f
.code
(...)
invoke GetDlgItemInt, hWnd, IDC_FIREN, NULL, FALSE
invoke FireToCelc, eax
invoke FloatToAscii, eax, addr buffer
invoke SetDlgItemText, hWnd, IDC_CELSIUS, addr buffer
(...)
FireToCelc proc x:DWORD
.data
result dd ?
.code
; formula: (0.5555555555555556)*(x-32)
movss xmm0, x
movss xmm1, _32
subss xmm0, xmm1 ; xmm0 = x - 32
movss xmm1, _5o9
mulss xmm0, xmm1 ; xmm0 = (0.5555555555555556)*(x-32)
movss result, xmm0
mov eax, result
ret
FireToCelc endp
FloatToAscii proc float:DWORD, lpOut:DWORD
LOCAL temp:DWORD, temp2:DWORD, cWord:WORD
.data
format db "%d.%d",0
Milion REAL4 1000000.0
.code
; turn to truncation mode
fstcw cWord
or cWord, 0C00h
fldcw cWord
fld float
fist temp
fisub temp
; turn to round-to-nearest-integer mode
fstcw cWord
and cWord, not 0C00h
fldcw cWord
fmul Milion
fabs ; to avoid having numbers like -1.-486
fistp temp2
invoke wsprintf, lpOut, addr format, temp, temp2
mov eax, lpOut
ret
FloatToAscii endp
end start
Sorry, I missed the "wsprintf" line. It was late and I was sleepy :P
As for the FireToCelc: You pass an integer value, while you should pass a single-precision floating point value. There IS a way to make this function work with integers inside the SSE code itself (load scalar integer -> convert to single-precision float; that's CVTSI2SS, which is already part of the original SSE set, so SSE2 isn't needed for it). The simplest (but definitely NOT the fastest) way to convert integer to float is the "fild -> fstp" pair with truncation enabled.
FireToCelc proc x:DWORD
.data
result dd ?
.code
; formula: (0.5555555555555556)*(x-32)
fld x
fstp result
movss xmm0, result
movss xmm1, _32
subss xmm0, xmm1 ; xmm0 = x - 32
movss xmm1, _5o9
mulss xmm0, xmm1 ; xmm0 = (0.5555555555555556)*(x-32)
movss result, xmm0
mov eax, result
ret
FireToCelc endp
Keep getting -17.777779 ... I wonder if the error is in "movss result, xmm0", is result large enough to hold xmm0?
fild x, not fld x. you're loading an integer. The SSE part is fine - I've tested it.
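The difference is easy to see in isolation. The following is only a sketch (the proc name IntToFloat is made up): fild interprets the memory operand as a signed integer, while fld would treat the same 32 bits as an already-encoded float, which for small integers is a tiny denormal close to zero. That would explain the constant -17.777779, i.e. (0 - 32) * 5/9.

```asm
.686
.model flat, stdcall
option casemap:none

.code
; Takes a signed 32-bit integer and returns the bit pattern of the
; equivalent REAL4 in eax.
IntToFloat proc x:DWORD
    LOCAL result:DWORD
    fild    x                   ; load x as an INTEGER and convert it
    fstp    result              ; store it as single precision
    mov     eax, result
    ret
IntToFloat endp
end
```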
movss result, xmm0 moves the 32-bit floating point value located at the beginning (bits: 0-31) of the xmm register, NOT the whole xmm register. scalar operations operate on bits 0-31 of SSE registers. they produce a single result from single operands. it is 1/4 of SSE's power. Packed SSE is the true power: it produces 4 results from 8 operands (multiple data) with a single instruction (hence the name: SIMD). scalar sse is a good way to avoid using the FPU. all the things in this topic can be written using scalar sse only.
the packed version is: movups or movaps. it moves 128 bits (4 single precision floating point values [16 bytes]) between an xmm register and a memory location (either Aligned or Unaligned, hence the names: movaps and movups). the aligned version requires that the memory operand is 16-byte aligned.
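As an illustration of the packed form, here is a hedged sketch that converts four Fahrenheit values to Celsius at once. Everything in it (ConvertFour, fahr, k32, k5o9, celsius) is invented for the example. It uses movups through a register pointer so it does not depend on how the data happens to be aligned; movaps would work as well if the arrays were 16-byte aligned.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

.data
fahr    REAL4 32.0, 50.0, 212.0, -40.0          ; four inputs
k32     REAL4 32.0, 32.0, 32.0, 32.0            ; 32.0 in every lane
k5o9    REAL4 0.5555556, 0.5555556, 0.5555556, 0.5555556
.data?
celsius REAL4 4 dup(?)                          ; four outputs

.code
ConvertFour proc
    mov     eax, offset fahr
    movups  xmm0, [eax]         ; xmm0 = four Fahrenheit values
    mov     eax, offset k32
    movups  xmm1, [eax]
    subps   xmm0, xmm1          ; four subtractions in one instruction
    mov     eax, offset k5o9
    movups  xmm1, [eax]
    mulps   xmm0, xmm1          ; four multiplications in one instruction
    mov     eax, offset celsius
    movups  [eax], xmm0         ; store all four results
    ret
ConvertFour endp
end
```

The instruction sequence mirrors the scalar FireToCelc one for one; only the "ss" suffixes become "ps".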
Oh thanks a lot for the explanation :) May I bother you a little more? :P Now everything is running fine but I would like to know how I would rewrite the FloatToAscii function using SSE or SSE2... I believe there's no need to use packed SSE here, is there? I would just like to know if there are opcodes to operate on integers in SSE, and if so is there some kind of reference with its opcodes?
To do integer arithmetic inside XMM registers you need SSE2 (the scalar integer <-> float conversions are already in SSE, though).
MMX instructions operate on MMX registers (which are alias to FPU registers) and are integer instructions. They're great to write audio or video codecs, for example (hence the name: Multimedia Extensions).
SSE instructions operate on 32-bit (single precision) floating point values when using XMM registers, and on integer values when using MMX registers. They're designed for 3d functions where they can aid video cards. They were supposed to be Intel's response to AMD's 3dnow! instructions. 3dnow! instructions use 3dnow! registers (which are aliases of the FPU registers, like MMX). 3dnow! registers are 64-bit and they hold 2 single-precision (32 bit) floating-point values each.
SSE2 instructions operate on both MMX registers and XMM registers and are both floating-point and integer instructions. Additionally they add more 'integer <-> floating point' conversions and can operate on 64-bit (double precision) floating-point values (and they support the appropriate conversion methods). They're designed to aid advanced maths applications, like speech recognition, etc.
Of course everyone can find their own application to these instructions :)
After this short introduction ( :P ) we see that we need either 3dnow! or the SSE family :) Let's stick to SSE/SSE2 (mainly because I'm not familiar enough with AMD's architecture ^^" ).
SSE, SSE2 and MMX instructions can be freely mixed inside the application. You have to remember 2 things, though: some instructions (MMX ones and a few SSE/2) require an MMX register as an operand (these are 64-bit operations) and some instructions require an XMM register as an operand (these are 128-bit instructions). The second thing is that when Intel added SSE2 they also increased the functionality of some already existing SSE and MMX instructions. So if you have an SSE2-capable processor, you may -for example- use an MMX integer instruction with an XMM register, but someone with a CPU supporting SSE, but not SSE2, will get an Undefined Instruction (#UD) Exception, regardless of the fact that it's an MMX instruction. So you have to remember that SSE2 is not only new instructions, but also an 'upgrade' of the SSE and MMX ones. Don't forget about it if you want your app to work on a wide variety of CPUs. ...Well, to be honest - you don't have to care, because nowadays almost everyone has an Athlon XP or Pentium 4 :P But it's nice to check whether the CPU supports MMX/SSE/SSE2 or not, before using an MMX/SSE/SSE2 function :) Windows XP requires a Pentium-class CPU, so it's guaranteed that a win32 app may use the CPUID instruction (you don't have to check everything from 8086 through 286, 386, 486 to pentium :) ).
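The feature check mentioned above can be sketched roughly as follows. CPUID function 1 reports SSE in bit 25 and SSE2 in bit 26 of EDX; the proc name GetSseLevel and its return convention are assumptions made for this example only.

```asm
.686
.model flat, stdcall
option casemap:none

.code
; Returns in eax: 0 = no SSE, 1 = SSE only, 2 = SSE and SSE2.
GetSseLevel proc uses ebx       ; cpuid overwrites ebx, so preserve it
    mov     eax, 1
    cpuid                       ; feature flags come back in edx
    xor     eax, eax
    bt      edx, 25             ; bit 25 = SSE
    jnc     done
    inc     eax
    bt      edx, 26             ; bit 26 = SSE2
    jnc     done
    inc     eax
done:
    ret
GetSseLevel endp
end
```

A program could call this once at startup and fall back to the FPU versions of the procs when it returns 0.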
...But let's get to the point already ^^"
As I've said earlier: Scalar instructions keep their operands and results in low-order dword (bits: 0-31) of XMM registers. Packed instructions use whole XMM register to produce multiple data from multiple operands in one instruction.
To load an integer into a XMM register you use MOVD instruction. MOVD instruction is a MMX instruction, but requires SSE2 to operate on XMM registers. CVTSI2SS instruction converts a scalar integer and stores it in a XMM register. This conversion instruction actually LOADS and converts the value, so we don't need the MOVD here (I mentioned it just for you to know ;) ).
So the only thing you must add is:
CVTSI2SS xmm0, x
(and pray for your assembler to support SSE :P )
CVTSI2SS means: Convert Scalar Integer to Scalar Single-precision value.
That's it. After this instruction you have a ready-to-work floating-point value :)
You must admit that it's a heck of a long introduction for such a short thing to say :P
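Put together, the integer-taking FireToCelc could look roughly like this. It is a sketch rather than the thread's final code: the constants are redefined locally so the module stands on its own, and the result is returned as the REAL4 bit pattern in eax, as in the earlier versions.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

.data
_32     REAL4 32.0
_5o9    REAL4 0.5555556         ; 5/9 rounded to single precision

.code
; x is a signed 32-bit integer (degrees Fahrenheit).
FireToCelc proc x:DWORD
    LOCAL result:DWORD
    cvtsi2ss xmm0, x            ; load the integer and convert it to float
    subss   xmm0, _32           ; xmm0 = x - 32
    mulss   xmm0, _5o9          ; xmm0 = (5/9) * (x - 32)
    movss   result, xmm0
    mov     eax, result         ; hand the float bits back in eax
    ret
FireToCelc endp
end
```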
Now the FloatToAscii function:
The FPU and SSE each have their own control register which controls how certain operations are handled, especially precision and rounding mode. With SSE/SSE2 the precision part is simple: there are separate instructions for single-precision operations (SSE) and separate ones for double-precision operations (SSE2). With the FPU we control precision by properly setting bits 8 and 9 of the control word. So it simply means that when we switch to SSE we have one thing less to worry about :) Now the rounding mode: SSE operations depend on the MXCSR register. This register works much like the FPU control word, except that this one is 32-bit (FPU's control word is... well.. word :P ). In order to alter it, we need to store it somewhere (just like in case of FPU's CW), set the flags as we want them, and load it back into the CPU. We store the MXCSR with the STMXCSR instruction ("Store MXCSR") and load it back with the LDMXCSR instruction ("Load MXCSR"). After we have stored it, we alter bits 13 and 14 which control the rounding mode (we need to set them both if we need truncation and clear them both if we need round-to-nearest-integer).
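For reference, the store/modify/load sequence described here could be sketched like this. Bits 13 and 14 correspond to the mask 6000h; the proc name TruncViaMxcsr and its locals are hypothetical. It uses CVTSS2SI (single "T"), which follows the MXCSR rounding mode, so the effect of the change is visible.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

.code
; Takes the bit pattern of a REAL4 in 'value' and returns it converted
; to an integer in eax, with MXCSR temporarily switched to truncation.
TruncViaMxcsr proc value:DWORD
    LOCAL oldCsr:DWORD, newCsr:DWORD
    stmxcsr oldCsr              ; save the current MXCSR
    mov     eax, oldCsr
    or      eax, 6000h          ; bits 13-14 = 11b: truncate
    mov     newCsr, eax
    ldmxcsr newCsr
    cvtss2si eax, value         ; conversion obeys the MXCSR rounding mode
    ldmxcsr oldCsr              ; restore the previous rounding mode
    ret
TruncViaMxcsr endp
end
```

Clearing the same two bits (and eax, not 6000h) would select round-to-nearest instead.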
The algo goes as follows:
cvttss2si xmm0, float ;xmm0 = integral part of 'float' stored as integer
movss xmm1, float ;xmm1 = float
cvtsi2ss xmm0, xmm0 ;xmm0 = integral part of float stored as 'float'
subss xmm1, xmm0 ;xmm1 = fractional part of 'float'
movss xmm2, million ;xmm2 = multiplier
cvttss2si temp, xmm0
mulss xmm1, xmm2 ;xmm1 = multiplied fractional part
cvttss2si temp2, xmm1
invoke wsprintf.... blah blah blah
If I didn't make any mistake, then this should do it.
one word about the conversion instructions: note the double "T" in 3 of 4 conversions used above. this additional "T" means that the conversion should be made with truncation REGARDLESS of the MXCSR register. These are very nice instructions, because they allow us to perform many different conversions without the need to alter the MXCSR in any way :)
The above code has a single small dependency (compared to 8 using the FPU code), the code is free of any redundant instructions (like setting/resetting the control word), we don't need the additional variable to store the control word or MXCSR, and ..hey! it's SSE! It's cool to code with :P :)
I hope the above works properly.
You may ask why we need 2 conversions, and how the whole algorithm works. Here comes the explanation:
cvttss2si xmm0, float
This first instruction loads the 'float' variable, converts it to integer using truncation (regardless of MXCSR) and stores it as scalar integer in bits 0-31 inside the XMM0. after this we have an integer (which is the integral part of "float", because the fractional part gets truncated) in the low-order dword of xmm0.
movss xmm1, float
This one is straightforward: it simply moves the "float" into bits 0-31 of xmm1 (scalar single precision value)
cvtsi2ss xmm0, xmm0
this one converts the contents of xmm0. you see: we have stored an INTEGER inside the xmm, but we need a FLOAT to subtract it later. That's why we must convert this integer to float.
subss xmm1, xmm0
now we subtract one scalar single-precision from another.
movss xmm2, million
now we load the multiplier (1000000.0)
cvttss2si temp, xmm0
the value of xmm0 (which is scalar single-precision float) gets converted into an integer and stored in a memory operand ("temp")
mulss xmm1, xmm2
now the fractional part gets multiplied
cvttss2si temp2, xmm1
the same is done as it was to the xmm0
Example:
'float' = 321.123
cvttss2si xmm0, float ;xmm0 = 321
movss xmm1, float ;xmm1 = 321.123
cvtsi2ss xmm0, xmm0 ;xmm0 = 321.0
subss xmm1, xmm0 ;xmm1 = 0.123
movss xmm2, million ;xmm2 = 1000000.0
cvttss2si temp, xmm0 ;store 321
mulss xmm1, xmm2 ;xmm1 = 123000.0
cvttss2si temp2, xmm1 ;store 123000
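when you display it, you'll get "321.123000" in theory. in practice single precision stores 321.123 as roughly 321.122986, so expect something like "321.122985" after truncation.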
I think that's all.
Thanks a lot for the wonderful explanation :) It'll be a reference to me from now on...
Your code didn't compile, so I had to change it a little...
FloatToAscii proc float:DWORD, lpOut:DWORD
LOCAL temp:DWORD, temp2:DWORD
.data
format db "%d.%d",0
million REAL4 1000000.0
.code
cvttss2si eax, float ;eax = integral part of 'float' stored as integer
movss xmm0, float ;xmm0 = float
cvtsi2ss xmm1, eax ;xmm1 = integral part of float stored as 'float'
subss xmm0, xmm1 ;xmm0 = fractional part of 'float'
movss xmm2, million ;xmm2 = multiplier
mov temp, eax
mulss xmm0, xmm2
cvttss2si eax, xmm0
mov temp2, eax
invoke wsprintf, lpOut, addr format, temp, temp2
mov eax, lpOut
ret
FloatToAscii endp
Thank you once again :)
Oh yes - I forgot that the conversion instructions must reference the GP registers as their destinations 

The source operand can be an XMM register or a 32-bit memory location. The destination operand is a general-purpose register.
Hehe I can't sleep so I'm trying to improve the algo... I'm quite pleased with the function, but there are some results that are annoying me... They are having zeros at the end of decimal places ie: 1.990000 and negative numbers having two negative signs ie: -7.-149 ....
I have an idea on the first one: Get how many decimal digits are there, like there are 4 decimal digits in 0.1234, then use that number to select a member of an array that has x zeros.... Look:
array REAL4 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0
num = 0.1234
len = getDecimaldigits(num) ; len = 4
mulss xmm0, array[(len-1)*4] ; xmm0 = xmm0*10000.0
cvttss2si ecx, xmm0 ; ecx = 1234
(etc...)
The problem here is how to get the number of decimal digits... The only way I can think of is to convert it into a string and then strlen it... But that would be terribly inefficient...
I don't know how I would solve the second one though, do you know which bits store the sign of the number?
Here's the code so far...
FloatToAscii proc float:DWORD, lpOut:DWORD
.data
format db "%d.%d",0
million REAL4 1000000.0
.code
cvttss2si eax, float ;eax = integral part of 'float' stored as integer
movss xmm0, float ;xmm0 = float
cvtsi2ss xmm1, eax ;xmm1 = integral part of float stored as 'float'
subss xmm0, xmm1 ;xmm0 = fractional part of 'float'
mulss xmm0, million
cvttss2si ecx, xmm0
invoke wsprintf, lpOut, addr format, eax, ecx
mov eax, lpOut
ret
FloatToAscii endp
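One possible way to handle both annoyances, offered as a sketch rather than a tested drop-in: the sign of a single-precision value lives in bit 31, so it can be tested and cleared with plain integer instructions, and the fraction can be printed with "%06d" so that leading zeros survive (0.05 would otherwise print as "0.50000"). Trailing zeros are then stripped from the finished string. The label names and the exact format string are choices made for this example, and values outside the signed 32-bit range are not handled.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

include    \masm32\include\windows.inc
include    \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

.data
format  db "%d.%06d", 0         ; six fraction digits, zero padded
million REAL4 1000000.0

.code
FloatToAscii proc uses esi float:DWORD, lpOut:DWORD
    mov     esi, lpOut
    mov     eax, float
    test    eax, 80000000h      ; bit 31 is the sign bit
    jz      positive
    mov     byte ptr [esi], '-' ; emit the sign once, by hand
    inc     esi
    and     eax, 7FFFFFFFh      ; clear the sign: float = |float|
    mov     float, eax
positive:
    cvttss2si eax, float        ; eax = integral part
    movss   xmm0, float
    cvtsi2ss xmm1, eax
    subss   xmm0, xmm1          ; xmm0 = fractional part, 0 <= f < 1
    mulss   xmm0, million
    cvttss2si ecx, xmm0         ; ecx = fraction as an integer
    invoke  wsprintf, esi, addr format, eax, ecx
    lea     esi, [esi+eax-1]    ; wsprintf returned the length; esi -> last char
strip:
    cmp     byte ptr [esi], '0'
    jne     done
    cmp     byte ptr [esi-1], '.'
    je      done                ; keep one digit after the point
    mov     byte ptr [esi], 0   ; chop the trailing zero
    dec     esi
    jmp     strip
done:
    mov     eax, lpOut
    ret
FloatToAscii endp
end
```

With this shape, 1.99 should come out near "1.99" (subject to single-precision rounding) and a negative input gets exactly one minus sign, even when its integral part is 0.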
ti_mo_n,
Why don't you save what you have written into our x86 book? ;)
Regards,
Victor