I was wondering if it was possible to catch 'illegal instruction' exceptions, emulate the instruction, and return to normal program flow. That way, I would be able to execute SSE1/2/3 instructions on processors that don't support them.

Any ideas? Thanks!
Posted on 2004-06-04 14:27:44 by C0D1F1ED
Posted on 2004-06-04 16:28:24 by ThoughtCriminal
...and it will be extremely slow :)
Posted on 2004-06-04 17:55:02 by f0dder

...and it will be extremely slow :)

haha, you telling me? I disabled my hardware FPU on Win98 on my P133, after that I quickly scrambled to turn it back on!
Posted on 2004-06-04 18:06:59 by x86asm
on a tangent, have you ever tried disabling your cache? a pentium-mmx-200 is suddenly unable to play MP3s... or pretty much do anything at all ;)
Posted on 2004-06-04 18:10:40 by f0dder

on a tangent, have you ever tried disabling your cache? a pentium-mmx-200 is suddenly unable to play MP3s... or pretty much do anything at all ;)

lol, yeah, I did that as well. I did it to make Zone66 playable on my 133 (that mini 3D scene in the credits ran a lot slower too!). Ah, it was quite funny to watch a high-end PC of the time become a useless piece of crap. Ahh, the good ol' days!
Posted on 2004-06-04 18:12:05 by x86asm

Thanks! But...

Do you have any technical information on how to do it? I mean, is it possible to write the exception handler in C++, or does it have to be an interrupt handler at ring0 level or something? When the exception is caught, how do I get a pointer to the invalid instruction?
Posted on 2004-06-05 03:09:19 by C0D1F1ED


haha, you telling me? I disabled my hardware FPU on Win98 on my P133, after that I quickly scrambled to turn it back on!

In my case, I still assume the presence of an FPU. I've emulated SSE instructions using the FPU before. Actually it's an automatic feature of SoftWire. It's slow, but still workable.

The only reason I can see why it would be slow is because of the exception handling mechanism. But I don't care too much...
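
To make the emulation step itself concrete, here is a minimal sketch in C. It assumes the handler already has a pointer to the bytes of the faulting instruction (under Windows SEH the exception record carries the faulting address; that part is not shown). `XmmReg`, `SseState` and `emulate_sse` are made-up names for illustration, not SoftWire's, and only the register-to-register form of ADDPS (0F 58 /r) is decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software copy of the eight XMM registers, four float lanes each.
   Hypothetical layout, chosen only for this sketch. */
typedef struct {
    float lane[4];
} XmmReg;

typedef struct {
    XmmReg xmm[8];
} SseState;

/* Try to emulate the instruction at `code` (at most `avail` bytes readable).
   Only ADDPS xmm, xmm (0F 58 /r with ModRM.mod == 3) is recognized.
   Returns the instruction length in bytes, so the caller can advance the
   instruction pointer past it, or 0 if the bytes are not handled here. */
size_t emulate_sse(const uint8_t *code, size_t avail, SseState *st)
{
    assert(code != NULL && st != NULL);

    if (avail < 3 || code[0] != 0x0F || code[1] != 0x58)
        return 0;

    uint8_t modrm = code[2];
    if ((modrm >> 6) != 3)
        return 0;                      /* memory operands are left out */

    unsigned dst = (modrm >> 3) & 7;   /* ModRM.reg: destination register */
    unsigned src = modrm & 7;          /* ModRM.rm:  source register */

    /* Plain scalar adds; the compiler is free to use x87 for these. */
    for (int i = 0; i < 4; ++i)
        st->xmm[dst].lane[i] += st->xmm[src].lane[i];

    return 3;                          /* 0F 58 modrm */
}
```

For the bytes 0F 58 C1 (addps xmm0, xmm1) this adds the four lanes of the emulated xmm1 into xmm0 and returns 3; a real handler would then add that length to the saved EIP before resuming. A return value of 0 means the exception should be passed on as a genuine illegal instruction.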
Posted on 2004-06-05 03:13:46 by C0D1F1ED
As far as I know, the Windows SEH should be able to handle it. So you cannot use C++'s try/catch, but you can use the __try/__except extensions in C/C++.
But yes, prepare to take a hit of many cycles (1000ish?) per instruction. You do NOT, I repeat NOT want to do this.
For high performance you want to recompile the code, which should be reasonably straightforward with Softwire.
Posted on 2004-06-05 05:54:58 by Scali
It should definitely be possible with SEH - dunno how to do this with the MSVC __try/__except extensions, but setting up a SEH frame in assembly is trivial.

I guess it would be interesting to see what kind of speed you get... one thing is the emulation of the instructions themselves, that _could_ be reasonably acceptable I guess, but the overhead of exception handling should be pretty awful - at least if you don't implement some sort of dynamic code recompilation... x86->x86 JIT'ing, ick ;)
Posted on 2004-06-05 06:09:16 by f0dder

As far as I know, the Windows SEH should be able to handle it. So you cannot use C++'s try/catch, but you can use the __try/__except extensions in C/C++.
But yes, prepare to take a hit of many cycles (1000ish?) per instruction. You do NOT, I repeat NOT want to do this.
For high performance you want to recompile the code, which should be reasonably straightforward with Softwire.

Thanks for the information!

High performance is not the priority now. It's compatibility, even with the oldest processors.
Posted on 2004-06-05 06:40:05 by C0D1F1ED
Obviously C/C++ is even more trivial than asm :)
It's documented well in MSDN.

Here's my tip: Tell people who don't have SSE support to upgrade asap and don't care :P
SSE is at least 5 years old? About time they supported it :)
And why would you want to emulate SSE2 or SSE3?
SSE2 is more of an x87 emulator than the other way around. Write x87 code?
As for SSE3, it's not like there's a lot of support for that yet. Ignore it completely for the time being? :P
Posted on 2004-06-05 06:41:22 by Scali

- at least if you don't implement some sort of dynamic code recompilation... x86->x86 JIT'ing, ick ;)

That's a very cool idea!

But how could I insert new code in a running executable? The shortest SSE instruction is three bytes or so, so I can't place anything useful in between. Unless... I also 'erase' some of the instructions that follow below it, and replace the SSE instruction plus these instructions with a call to a function that performs the same actions.

Would that work or am I overlooking some nasty problems? Replacing jumps would be pretty hard I think...
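
A rough sketch of the overwrite step in C, working on a plain byte buffer rather than on live code. It assumes 32-bit code and a near `call rel32` (opcode E8 followed by a 32-bit displacement measured from the end of the call), which needs five bytes, so the span being replaced has to cover at least that much. `patch_call` and its parameters are hypothetical names. It does not check whether any jump elsewhere lands inside the overwritten bytes, which is exactly the hard part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CALL_REL32_LEN = 5 };   /* E8 + 32-bit displacement */

/* Overwrite `span` bytes at buf[offset] with a CALL rel32 to `target_addr`,
   padding the remainder of the span with NOPs (0x90).
   `base_addr` is the address at which buf[0] will execute.
   Returns 0 on success, -1 if the span is too short or out of bounds. */
int patch_call(uint8_t *buf, size_t buf_len, size_t offset, size_t span,
               uint32_t base_addr, uint32_t target_addr)
{
    assert(buf != NULL);

    if (span < CALL_REL32_LEN || offset > buf_len || span > buf_len - offset)
        return -1;

    /* The displacement is relative to the instruction after the call.
       Unsigned arithmetic wraps modulo 2^32, matching a backward call. */
    uint32_t next = base_addr + (uint32_t)offset + (uint32_t)CALL_REL32_LEN;
    uint32_t rel  = target_addr - next;

    buf[offset]     = 0xE8;
    buf[offset + 1] = (uint8_t)(rel & 0xFF);          /* little-endian */
    buf[offset + 2] = (uint8_t)((rel >> 8) & 0xFF);
    buf[offset + 3] = (uint8_t)((rel >> 16) & 0xFF);
    buf[offset + 4] = (uint8_t)((rel >> 24) & 0xFF);

    memset(buf + offset + CALL_REL32_LEN, 0x90, span - CALL_REL32_LEN);
    return 0;
}
```

The function being called would have to perform the SSE instruction plus every 'erased' instruction that followed it in the span, and then return normally. The caller is also responsible for making `span` end on an instruction boundary.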
Posted on 2004-06-05 06:45:47 by C0D1F1ED
Well, if it's for softwire, couldn't you make it output the x86/x87 code sequences to handle the SSE/2 instructions, rather than depending on #UD ?
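
As a sketch of what that could look like at code-generation time (plain C, not SoftWire's actual API): assume the emulated SSE registers live in memory as four consecutive floats at known absolute addresses, and that the target is 32-bit code. Then an ADDPS can be emitted as four fld/fadd/fstp triples using the disp32 addressing form. `emit_mem32` and `emit_addps_x87` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emit one x87 instruction with an absolute 32-bit memory operand:
   opcode, ModRM (mod=00, rm=101 selects disp32 in 32-bit code), address. */
static size_t emit_mem32(uint8_t *out, uint8_t opcode, uint8_t modrm,
                         uint32_t addr)
{
    out[0] = opcode;
    out[1] = modrm;
    out[2] = (uint8_t)(addr & 0xFF);            /* little-endian disp32 */
    out[3] = (uint8_t)((addr >> 8) & 0xFF);
    out[4] = (uint8_t)((addr >> 16) & 0xFF);
    out[5] = (uint8_t)((addr >> 24) & 0xFF);
    return 6;
}

/* Emit x87 code computing dst[i] += src[i] for i = 0..3, where dst_addr and
   src_addr are the absolute addresses of two in-memory 4-float registers.
   Returns the number of bytes written, or 0 if `cap` is too small. */
size_t emit_addps_x87(uint8_t *out, size_t cap,
                      uint32_t dst_addr, uint32_t src_addr)
{
    enum { TOTAL = 4 * 3 * 6 };   /* 4 lanes, 3 instructions, 6 bytes each */
    assert(out != NULL);
    if (cap < TOTAL)
        return 0;

    size_t n = 0;
    for (uint32_t lane = 0; lane < 4; ++lane) {
        uint32_t d = dst_addr + 4 * lane;
        uint32_t s = src_addr + 4 * lane;
        n += emit_mem32(out + n, 0xD9, 0x05, d);   /* fld  dword [d] */
        n += emit_mem32(out + n, 0xD8, 0x05, s);   /* fadd dword [s] */
        n += emit_mem32(out + n, 0xD9, 0x1D, d);   /* fstp dword [d] */
    }
    assert(n == TOTAL);
    return n;
}
```

The generator would call this instead of emitting 0F 58 /r whenever the CPU check says SSE is missing, so no #UD is ever raised. The cost is 72 bytes of code per ADDPS in this naive form.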
Posted on 2004-06-05 06:48:11 by f0dder
But how could I insert new code in a running executable?


You should do it the way any JIT compiler does it... Work on a per-function basis. First you analyze the code so you know where every function starts and ends, and where each one is called.
Then when you get an exception, you figure out from the EIP value what function that must have been. Then you recompile the function as a whole, in a new part of memory, and update all references to that function in the code.
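
The lookup from EIP to function could be sketched like this in C, assuming the analysis pass has already produced a table of non-overlapping address ranges sorted by start address. `FuncRange` and `find_function` are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One analyzed function: [start, end) in the original code. */
typedef struct {
    uint32_t start;   /* address of the first byte */
    uint32_t end;     /* address one past the last byte */
} FuncRange;

/* Binary search over a table sorted by `start` with non-overlapping ranges.
   Returns the index of the function containing `eip`, or -1 if the address
   falls outside every known function. */
int find_function(const FuncRange *table, size_t count, uint32_t eip)
{
    assert(table != NULL || count == 0);

    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (eip < table[mid].start)
            hi = mid;                 /* look in the lower half */
        else if (eip >= table[mid].end)
            lo = mid + 1;             /* look in the upper half */
        else
            return (int)mid;          /* start <= eip < end */
    }
    return -1;
}
```

The returned index would select the function to recompile into new memory. A result of -1 means the fault came from code the analysis never saw, so it should probably be treated as a real illegal instruction.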
Posted on 2004-06-05 06:56:24 by Scali
You could combine a run-trace (ie, catching #UD and emulating, taking note of faulting addresses) with static code translation - something along the lines of z0mbie's mistfall (ie, disassemble the entire executable, translate the offending instructions, fix references, reassemble).

Or you could go with the more traditional approach, as scali described. Remember to cache information on disk, to speed up next run of the executable.

Have a look at how They did x86->Alpha JIT'ing... can't remember where I found it, but google might help :)
Posted on 2004-06-05 07:13:46 by f0dder

As far as I know, the Windows SEH should be able to handle it. So you cannot use C++'s try/catch, but you can use the __try/__except extensions in C/C++.
But yes, prepare to take a hit of many cycles (1000ish?) per instruction. You do NOT, I repeat NOT want to do this.
For high performance you want to recompile the code, which should be reasonably straightforward with Softwire.


Doesn't SSE code usually appear in blocks?

Is there a way for you to know the start and the end of the block?

If so you could trip the exception only once when the first instruction is reached, then use other means to process the rest of the block.
Posted on 2004-06-05 09:31:34 by ThoughtCriminal
I think it's pointless discussing this anyway, since you will make slower systems slower.
I would rather create optimized x87 code that all CPUs can run, than convert fast code for already fast CPUs to extremely slow code on slow CPUs.
And if the x87 code cannot be made fast enough on even the fastest CPUs, so SSE+ is actually required, then I wouldn't bother supporting other CPUs anyway. It's like porting a DVD player back to a 486... It will never work anyway, so who cares if the 486 can actually run the code or not?
Posted on 2004-06-05 10:13:20 by Scali

I think it's pointless discussing this anyway, since you will make slower systems slower.

But compatible.

Besides, I'm currently mostly concerned about SSE2 and SSE3. Both can be emulated fairly efficiently on a Pentium III. But anyway, making them run is far more important now than making them run fast. I want to use the instructions, not their speed. I don't want to buy a Prescott to be able to write Prescott code.
Posted on 2004-06-06 08:52:07 by C0D1F1ED
Use assemble/code-generation-time macros, preprocessing, whatever? It should be easier to write than an emulation system, and the speed should definitely be better, too.
Posted on 2004-06-06 09:01:02 by f0dder