I was just looking over some MMX and SSE stuff when a question about the value of these instruction sets came to mind. What instructions does the video card actually handle? If it does the math for the transforms, is there any reason to use MMX or SSE, since those are Intel instructions? Do you guys recommend learning them? I'm just curious because I don't want to invest the time into learning those sets (at the moment) if they are not going to help the speed of my games. I make games for fun, and since I have HW acceleration, I'm not worried about making something that runs without it...well, maybe someday when I'm a decent asm programmer and I want to see what I can do :grin:
Posted on 2001-10-30 20:57:56 by AlexEiffel
MMX is Intel's, but supported on all modern processors. It's for working
on packed sets of data in 64-bit registers. It's not only used for
game stuff; "serious" image processing and fast implementations
of some crypto algorithms use it as well.
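
A tiny example of what "packed" means (MASM-ish syntax, untested,
and the data labels are just made up by me):

    .data
    src1 QWORD 0001000200030004h   ; four packed 16-bit words
    src2 QWORD 000A000B000C000Dh
    dest QWORD ?
    .code
    movq  mm0, src1   ; load all four words into one 64-bit MMX register
    paddw mm0, src2   ; four 16-bit additions in a single instruction
    movq  dest, mm0   ; store the four sums at once
    emms              ; reset the FPU state before doing float math again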

SSE is sorta the same, just for floats. Contrary to AMD's 3DNow!,
it retains full precision (afaik), so you can use it for "serious" stuff
as well.
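
Same idea in SSE, just with four single-precision floats at a time
(again a rough sketch; assumes an assembler that knows SSE and
16-byte aligned data, and the labels are mine):

    .data
    align 16
    vec1 REAL4 1.0, 2.0, 3.0, 4.0
    vec2 REAL4 5.0, 6.0, 7.0, 8.0
    sums REAL4 4 dup (?)
    .code
    movaps xmm0, xmmword ptr vec1   ; aligned load of four floats
    addps  xmm0, xmmword ptr vec2   ; four float additions in parallel
    movaps xmmword ptr sums, xmm0   ; store the four results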

Sure, today the gfx cards do the transform and lighting. Which means
you limit yourself to gfx card speed, not processor speed. Right
now this is probably sensible :), but perhaps not in the future?
And some 3d stuff might be a bit hard to do with T&L? Dunno...
haven't dabbled much in 3d yet.
Posted on 2001-10-31 05:03:18 by f0dder
Hey f0dder, thanks for the information :) So if I coded something like a 4x4 matrix multiply with MMX, would I see the speed increase? Or does the GFX card handle the matrix stuff and ignore the MMX code? I'm getting familiar with how the MMX instructions work (the Intel interactive tutorials aren't actually that bad), but I think my problem is that I don't understand how hardware acceleration works.
Posted on 2001-10-31 05:18:04 by AlexEiffel
Hardware acceleration depends on the API you use. If you just
use DirectDraw, you'll get no 3d acceleration. If you use OpenGL
or D3D, you will. Now, the idea behind hardware T&L is that you
throw some vertex buffers at your gfx card. You want to do this
as seldom as possible, as it takes some bandwidth. And then you
can throw deformation matrices at the board, which will then do
the transformations in hardware.
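
For the curious: "the transformation" per vertex is just a 4x4 matrix
times (x, y, z, 1). Done in software with SSE it would look roughly
like this (my sketch, untested; the matrix is stored as four columns,
and all the labels are made up):

    .data
    align 16
    col0   REAL4 4 dup (?)            ; the four columns of the 4x4 matrix
    col1   REAL4 4 dup (?)
    col2   REAL4 4 dup (?)
    col3   REAL4 4 dup (?)
    vertex REAL4 1.0, 2.0, 3.0, 1.0   ; x, y, z, w
    outv   REAL4 4 dup (?)
    .code
    movss  xmm0, vertex               ; load x
    shufps xmm0, xmm0, 0              ; broadcast x into all four slots
    mulps  xmm0, xmmword ptr col0     ; x * column 0
    movss  xmm1, vertex+4             ; load y
    shufps xmm1, xmm1, 0
    mulps  xmm1, xmmword ptr col1     ; y * column 1
    addps  xmm0, xmm1
    movss  xmm1, vertex+8             ; load z
    shufps xmm1, xmm1, 0
    mulps  xmm1, xmmword ptr col2     ; z * column 2
    addps  xmm0, xmm1
    addps  xmm0, xmmword ptr col3     ; + column 3 (since w is 1)
    movaps xmmword ptr outv, xmm0     ; the transformed vertex

With hardware T&L, the card does the equivalent of that for every
vertex in the buffer, so the CPU never touches them.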

Once the data is on the board, it's the board (GPU) handling the
data, not the CPU (you). Which means that you'll be bound by GPU
speed (but this is often satisfactory, of course). Imagine you
put a GeForce3 card into a *massively evil* computer. You'd be able
to get more fps if you did software T&L. I dunno how massive this
computer would have to be, though :). Also, the lighting part of
T&L seems to be a bit slow, at least on GeForce2? I guess it depends
on how it's implemented.

Now, this is all just ramblings, based on how I have interpreted
clever people telling me about this :). I'd love to know any reasons
why T&L might not *always* be preferable (apart from the "massively
evil CPU")... and stuff.
Posted on 2001-10-31 05:31:16 by f0dder
I can see only one time that hardware transformation could be slower: when you need the same transformed data set for collision detection and physics, since the card will not let you read back the transformed vertices. (Even if it could, reading from a card is always much slower than writing to it.)

One might think that letting the hardware do a matrix multiplication and then getting the result back would be faster. But the steps involved are:

1. Send the first matrix to the card.
2. Send the second matrix
3. Do the multiplication on the card.
4. Get back the result from the card.

Each of these steps passes data through the AGP bus, and the last one requires reading from it (slower), all while the CPU waits for the card to finish (since it needs the matrix to transform the vertices).

So I think that 99% of the time hardware transformation will be faster. But there is that 1% where it isn't.

As for hardware lighting: it's faster, unless you have one light more than the hardware can take. Then precanned light sources are faster.
Posted on 2001-10-31 11:15:46 by dxantos
You should learn them because compilers can't generate good MMX and SSE code.
Posted on 2001-11-01 15:19:17 by Dr. Manhattan
Hey Dr. Manhattan, I think I will probably learn them, but I see one weakness to your point about the compiler not being able to generate good MMX and SSE code....I can't either ;) Heh, I can't generate good normal code for that matter. That will come though. By the way, I know that a good C program with a good compiler will produce better code than I am writing, but how much better is bad C code than bad (newbie) asm code?
Posted on 2001-11-01 23:25:16 by AlexEiffel
This question should be asked in the "Holy Wars" forum! By "good" MMX or SSE code, I was thinking about code that exploits the parallelism of these instruction sets. Something compilers can't do, because they are essentially sequential.
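
A small example of what I mean (a sketch, labels made up): MMX has
saturating arithmetic that plain C loops don't map to, e.g. brightening
eight pixels at once without wrapping past 255:

    .data
    pixels   QWORD 1020304050607080h   ; eight packed unsigned bytes
    brighten QWORD 4040404040404040h   ; add 64 to each byte
    .code
    movq    mm0, pixels
    paddusb mm0, brighten   ; eight byte adds that clamp at 255
    movq    pixels, mm0     ; store the eight brightened pixels
    emms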
Posted on 2001-11-02 02:23:00 by Dr. Manhattan
AlexEiffel, don't worry about that - it still requires a good programmer to use a good compiler to produce good code. :)
Posted on 2001-11-02 04:13:38 by bitRAKE
AlexEiffel, you shouldn't code your games using SSE, because AMD processors don't support it.
Posted on 2001-11-02 17:46:57 by Aquila
Hi Aquila.
Hmmm...I can see how that would limit things. But like I said before, I'm making games for myself really, and I'm trying to do them in asm mostly for the learning experience of it (not that I don't understand the other benefits of doing it in asm)....well, perhaps the bragging rights too :grin: So I think I will still learn the SSE stuff, though I see your point about compatibility. If I write something I want to sell, I will definitely take your advice though. Thanks! :alright:
Posted on 2001-11-03 01:01:42 by AlexEiffel