I'm working on a project that requires a lot of registers. The problem is that I'm porting it to the Core 2 Duo, which doesn't have as many. I've read around, and some sources make it sound like you can read registers on another core. This would be ideal, since it would allow me to spread almost all of the registers across both cores. The Intel instruction manual makes references to this type of activity; however, I could only find information on passing interrupts. Is there a way to do it using XCHG? (Sounds very messy.) Is what I'm asking even possible?
Just to clarify, I'd really like to establish two core-specific threads (at a high level, possibly through the Windows API), each holding part of the registers and performing only operations specific (or as close as possible) to those registers. Obviously some operations will require more registers than are available on one core alone, and using a unified register file in system memory would make my program much slower... thus leading to my original question of whether a core can snoop into another's registers. As a last resort, being able to cache my register file on the CPU would be enough, but I'm also a little clueless as to how that's done. Any links, books, or info would be greatly appreciated :)
Posted on 2008-03-11 13:42:33 by elokide
Afaik, no, you can't simply "share registers" across cores. At the very lowest level, you can send "inter-processor interrupts" (IPIs), but I haven't gotten around to learning exactly how that works yet... under an OS, you work at a very different level anyway.

XCHG with memory is atomic and involves a bus lock, which is relatively costly. You probably won't get around using some memory... Core 2 does cache sharing, but you need to probe the CPU topology if you want to take advantage of that, and you probably don't want to depend on that feature anyway.

Can you elaborate a bit on what you want to achieve? :)
Posted on 2008-03-11 19:39:14 by f0dder
In my opinion, using more than one core in a multicore CPU should be treated as operating separate threads and you need to communicate between the different threads through memory. For instance, if you have two cores, each should be working full time on different parts of your application and all registers should be available for the code at hand. Even if you could do what you describe, you wouldn't really want to freeze some of the registers in one thread only to make them available to the other thread.
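As a rough illustration of "communicating through memory", here is a minimal sketch of a one-slot mailbox that two threads could share. The names `Mailbox`, `mailbox_post`, and `mailbox_take` are made up for this example, and the value 0 is assumed to mean "empty". It uses C11 `atomic_exchange`, which x86 compilers typically lower to an XCHG with a memory operand.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical one-slot mailbox shared between two threads.
 * Assumption for this sketch: 0 means "empty", so only nonzero
 * values can be passed through it. */
#define MAILBOX_EMPTY 0u

typedef struct {
    _Atomic uint32_t slot;
} Mailbox;

/* Producer side: store a value and return whatever was in the slot
 * before (MAILBOX_EMPTY if the previous value had been consumed). */
uint32_t mailbox_post(Mailbox *mb, uint32_t value)
{
    return atomic_exchange(&mb->slot, value);
}

/* Consumer side: fetch the current value and leave the slot empty,
 * all in one atomic step. Returns MAILBOX_EMPTY if nothing was posted. */
uint32_t mailbox_take(Mailbox *mb)
{
    return atomic_exchange(&mb->slot, MAILBOX_EMPTY);
}
```

One thread would call `mailbox_post` and the other would poll `mailbox_take`. Every call is a locked memory access, so this is fine for occasional hand-offs but far too slow to stand in for a register.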
Posted on 2008-03-11 20:25:14 by Raymond
Thanks to both f0dder and Raymond for your replies. I guess Raymond's right about the freezing of certain registers. It might eliminate the point of having threads :).
In reply to f0dder, I'm working on a CPU interpreter as a hobby to learn about CPU architectures. Since the Core 2 has so few GPRs (16 per core) compared to the CPU I'm trying to interpret (PowerPC: 32 GPRs), I was thinking of taking advantage of the other core's registers. The other idea I had, before learning more about the Intel architecture, was to store two 32-bit PowerPC registers in one 64-bit Intel one, but from what I've read there is no way to address the upper 32 bits of a 64-bit register directly, separately from the lower ones. Worst case would be to keep only the most-used registers on the CPU and the rest in memory. Fun ideas anyway :).
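For reference, the upper half of a 64-bit register has no name of its own, but it can still be reached with shifts and masks, at the cost of a few extra instructions per access. A minimal sketch of the packing idea in C follows; the `pair_*` helper names are invented for this example.

```c
#include <stdint.h>

/* Pack two 32-bit guest registers into one 64-bit host value:
 * 'lo' lives in bits 0..31, 'hi' in bits 32..63. */
uint64_t pair_pack(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Reading the low half is just a truncation. */
uint32_t pair_lo(uint64_t p)
{
    return (uint32_t)p;
}

/* Reading the high half needs a shift. */
uint32_t pair_hi(uint64_t p)
{
    return (uint32_t)(p >> 32);
}

/* Writing one half has to preserve the other half. */
uint64_t pair_set_lo(uint64_t p, uint32_t lo)
{
    return (p & ~0xFFFFFFFFull) | lo;
}

uint64_t pair_set_hi(uint64_t p, uint32_t hi)
{
    return (p & 0xFFFFFFFFull) | ((uint64_t)hi << 32);
}
```

Whether the extra shift and mask work beats simply keeping the less-used guest registers in memory is something that would have to be measured.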
Posted on 2008-03-11 20:46:30 by elokide
Dynamic recompilation is the only fast solution to what you want to do afaik.
Posted on 2008-03-11 21:41:18 by Ultrano
I guess I really shouldn't try to make a "fast" interpreter anyway, given that it's no match for dynamic recompilation. Thanks for the tip, I'll most likely delve into that after I get a feel for the architecture. Thanks all for the help  :D
Posted on 2008-03-12 03:23:23 by elokide
It's probably smartest to start by doing a "slow-but-working" interpreter, then study up on recompilation if you want to make things fast :)
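To give an idea of what "slow-but-working" can look like, here is a minimal sketch that keeps all 32 guest GPRs in a plain array in memory and handles just two PowerPC instructions, `addi` (primary opcode 14) and `add` (primary opcode 31, extended opcode 266). The `GuestCpu` and `guest_step` names are made up for this example; instruction fetch, memory access, and every other instruction are left out.

```c
#include <stdint.h>

/* Guest state: the 32 PowerPC GPRs live in an ordinary array.
 * The host compiler decides which values end up in host registers. */
typedef struct {
    uint32_t gpr[32];
} GuestCpu;

/* Execute one 32-bit instruction word.
 * Returns 0 on success, -1 if the instruction is not handled. */
int guest_step(GuestCpu *cpu, uint32_t insn)
{
    uint32_t opcode = insn >> 26;        /* primary opcode, top 6 bits */
    uint32_t rd = (insn >> 21) & 31;     /* destination register */
    uint32_t ra = (insn >> 16) & 31;     /* first source register */

    switch (opcode) {
    case 14: {
        /* addi rD, rA, SIMM: rA == 0 means the literal value 0. */
        uint32_t simm = (uint32_t)(int32_t)(int16_t)(insn & 0xFFFF);
        cpu->gpr[rd] = (ra ? cpu->gpr[ra] : 0u) + simm;
        return 0;
    }
    case 31: {
        /* Extended opcodes. Only plain add is handled here:
         * OE = 0, XO = 266, Rc = 0 in the low 11 bits. */
        uint32_t rb = (insn >> 11) & 31;
        if ((insn & 0x7FF) == (266u << 1)) {
            cpu->gpr[rd] = cpu->gpr[ra] + cpu->gpr[rb];
            return 0;
        }
        return -1;
    }
    default:
        return -1;
    }
}
```

For example, the word `(14u << 26) | (3u << 21) | 5u` encodes `addi r3, 0, 5` and leaves 5 in `gpr[3]`. A full interpreter would wrap this in a fetch loop and grow the `switch` one instruction at a time.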
Posted on 2008-03-12 08:29:18 by f0dder