Can someone confirm this for me:

- When using FPU instructions such as 'fild', can you load st(0) either with an immediate value or from a register, like this:

fild 0FFh
fild esi



- When popping values back off the FPU stack, can you pop them directly into a register, like this:

fistp eax


My tests show that neither of the above is possible, but there is always the chance that I have missed a trick...
Posted on 2002-07-11 04:42:20 by sluggy
You can't use immediate values with the coprocessor, only operands in memory or its own registers (st(*)).

Since the coprocessor is logically separate from the main processor, it can't "see" the main processor's registers.
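
A minimal sketch of what that means in practice, assuming MASM syntax. The labels const255 and tmp are hypothetical names made up for this example; the point is only that both the "immediate" and the register value have to pass through memory:

```asm
.data
const255 dd 0FFh            ; hypothetical constant, stands in for the immediate
tmp      dd ?               ; hypothetical scratch dword

.code
    fild  const255          ; st(0) = 255, loaded from memory
    mov   tmp, esi          ; copy the register to memory first
    fild  tmp               ; st(0) = esi (as a signed integer), st(1) = 255
    faddp st(1), st(0)      ; example operation: st(0) = 255 + esi
    fistp tmp               ; round to integer, store to memory, pop
    mov   eax, tmp          ; the result is now in eax
```

Note that fild treats the dword as a signed integer, so a value in esi with the top bit set loads as a negative number.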
Posted on 2002-07-11 04:51:20 by MArtial_Code
Yeah, I thought something like that was the case. I was just doing some quick integer math, and found it annoying that I had to declare a couple of temporary dword variables just to transfer values between the FPU and the normal registers.
Posted on 2002-07-11 05:03:10 by sluggy
Sluggy, this kind of code takes an integer in eax, does some fpu, and gets the result back in eax with relative ease:

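push eax                ; put the integer on the stack
fild dword ptr [esp]    ; load it into st(0)

; more fpu ops

fistp dword ptr [esp]   ; store the result over the same stack slot
pop eax                 ; result is now in eax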
Posted on 2002-07-11 05:44:22 by eGo
you might be interested in my FPU tutorial, it sits at
antipasta.topcities.com/fputut.txt
Feedback appreciated
Posted on 2002-07-11 14:43:35 by AntiPasta
There *is* one instruction which allows access to the CPU registers: FSTSW. You can write the status word to AX. It is intended to be used with SAHF, IIRC, so you can use the jxx series (which are normally for integer comparisons) on float comparisons.
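
A rough sketch of that pattern, assuming MASM syntax. The variables a and b and the jump targets are hypothetical and would have to be defined elsewhere:

```asm
.data
a   real4 1.5               ; hypothetical operands
b   real4 2.5

.code
    fld   a                 ; st(0) = a
    fcomp b                 ; compare st(0) with b, then pop
    fstsw ax                ; copy the FPU status word into ax
    sahf                    ; ah -> flags: C0 -> CF, C2 -> PF, C3 -> ZF
    jp    unordered         ; PF set: at least one operand was a NaN
    ja    a_greater         ; CF = 0 and ZF = 0: a > b
    je    a_equal           ; ZF set: a == b
    ; fall through: CF set, so a < b
```

Because the condition bits land in CF and ZF, the unsigned-style jumps (ja, jb, je) are the ones to use here, not jg or jl.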
Posted on 2002-07-12 20:15:51 by AmkG

you might be interested in my FPU tutorial, it sits at
antipasta.topcities.com/fputut.txt
Feedback appreciated


Good tut AntiPasta. I'll take a look at it myself. I'm no veteran for FPU coding but I can do it and this will certainly enhance my skills. Just one question, what do you mean by the FXCH (FPU exchange instruction) using ZERO clock cycles when paired correctly?
Thanks!
Posted on 2002-07-13 20:26:57 by x86asm
what do you mean by the FXCH (FPU exchange instruction) using ZERO clock cycles when paired correctly?


I'm not the author of the tutorial, but... :)

That means fxch is implemented as register renaming at the lowest level. It still takes decoding time, but it incurs no execution time. AFAIK, fxch on P5 is pairable with most FPU instructions. On P6, it costs nothing beyond the decoding time.

Be careful not to abuse this feature. Decoding time may be longer than you expect. In my experience, loading a bunch of values and using fxch to avoid latency can be slower than sequential processing. Another example of this is the drastic performance loss in C code compiled with gcc 3.x compared to the code generated by previous versions of gcc.
Posted on 2002-07-13 20:47:31 by Starless
Thanks, Starless!! Really needed the help :D
So what instructions should I avoid pairing FXCH with? Should I avoid pairing it with instructions that use both operands of the FXCH instruction?
Posted on 2002-07-13 21:04:54 by x86asm
So what instructions should I avoid pairing FXCH with?


I was wrong to say that fxch pairs with most FPU instructions. Darn, my memory decay parameter is so large! :( Checking Agner's note gives me the following list of instructions pairable with fxch:
fld, fadd, fsub, fsubr, fmul, fdiv, fdivr, fchs, fabs, fcom, fucom, and
their associated 'pop stack' versions.
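
As an illustration only (assuming MASM syntax and hypothetical real4 variables a, b, c, d, r1 and r2), here is a sketch of where an fxch would sit so that it can pair with the preceding fmul on P5 while two independent products are computed:

```asm
    fld   a                 ; st(0) = a
    fmul  b                 ; st(0) = a*b
    fld   c                 ; st(0) = c,   st(1) = a*b
    fmul  d                 ; st(0) = c*d, st(1) = a*b
    fxch  st(1)             ; st(0) = a*b, st(1) = c*d (can pair with the fmul above)
    fstp  r1                ; store a*b, pop; st(0) = c*d
    fstp  r2                ; store c*d, pop
```

The fxch brings the older product, which has had more time to finish, to the top of the stack, so the first store does not have to wait on the multiply that was just issued.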

Remember, this is for P5. P6 does not have the concept of 'pairability'. If your target is not P5, don't worry about pairability.

And, here is a fishing rod: Get Agner's optimization note. You will find yourself reading it over and over again soon. :)
Posted on 2002-07-13 21:26:35 by Starless
x86asm, I think Starless means:
fld xxx

fld xxy
fld xxz
...
fxch ?
...
fxch ?
...
fxch ?
versus
fld xxx

...
fld xxy
...
fld xxz
...
...
In the last paragraph he is saying the second method uses less decode bandwidth, and hence performs better where decode bandwidth is the bottleneck.
Posted on 2002-07-13 21:28:53 by bitRAKE
Exactly, bitRAKE. :)
Posted on 2002-07-13 21:40:38 by Starless