Mr. Steve Bush hasn't posted the stuff here, so I'll do it for him. This is the website for the WIZ processor, a simple and interesting piece of equipment.
The WIZ
This is the thread in the MASM Forum about the WIZ.
MASM Forum -> Soap Box -> The WIZ
I'll let Steve know I did this.
Posted on 2004-04-13 20:00:22 by NoDot
OK, hello to all! I don't know much about this forum, what kind of people are here, etc. I'll look around...

In the meantime, let me introduce my invention, the WIZ processor. It's at:
http://www.steve.bush.org/WizdomR&D/index.html

It is a technical specification for a processor chip of a very different design. It isn't compatible with any x86-type processors. It isn't even similar!

If you're interested in such things, please check it out! I'm looking for some intelligent conversation about it. Comments, thoughts, ideas. Not too many flames I hope, but critical ideas are always welcome...

Please beware: my documentation is terse and difficult to read out of order. The WIZ processor is very different from anything you're probably used to, and a single page read out of context will be confusing.

The best introduction is NOT the "Introduction" section, but under "The WIZ Processor" in the three pages of the "Overview" there!

Thanks for listening...

Steve Bush.
Posted on 2004-04-15 02:08:54 by Steve Bush
Steve, I had a look at the stuff and must say I'm a bit sceptical. I'm not much into the electronics side of things, so I can't comment on that - but it seems to me that while you're saying "KISS", you are in reality just moving the complexity to separate components. It might (or might not) be easier to test each component individually, but to have actual work done you still need some "total complexity".

Also, your idea of running 'asynchronously', ho humm. You still have to wait for a 'register' to signal 'ready' before you can move on, right? What good is it that the main core can run at 10 GHz if it has to wait for 'register ready' all the time - surely things like adders and memory access will run at lower rates. I guess you could alleviate this somewhat by waiting only on dependent registers, but that increases the complexity of the core, and you will still be stalling very often, as I see it.

Furthermore, it seems like programming the WIZ won't be all too joyful - at least without some form of abstraction (like a macro assembler). The design (over?)simplification puts a lot of burden on the shoulders of the programmer.

Of course, all this has to be seen in context of what you want to achieve - cheap and efficient microcontrollers, not an x86 replacement. I'm not familiar with microcontrollers, but I do realize they have a lot of requirements on size, power consumption, and program size...
Posted on 2004-04-15 03:32:55 by f0dder
f0dder
Posted on 2004-04-20 14:12:28 by mrgone
First of all I would think waiting for a reply would take longer than the assumption that goes with clock synchronicity.
Second, if there is no external interrupt triggering mechanism then how does this keyboard work if only internal registers can trigger software interrupts? Is it a polling method?
Is this all theory or do you have some working prototypes where some of this has been proven?
Also in the stack, can you point past the bottom or is it just push/pop?
Posted on 2004-04-20 14:31:55 by mrgone
mrgone, thanks for your questions.

As to waiting, two points. First, the most common case is where both registers are "ready" even before the instruction begins. In this case, the ready bits are high even as we begin driving the bus, so in reality there is no wait for anything, the bus just "goes".

Secondly, a device like an adder or multiplier "knows" when it is done and can generate a ready signal. Waiting for this signal isn't like waiting for some far off buffered device transmitting a signal down a cable -- we are talking about a single transistor driving a short line from one spot to another within a chip. When the adder goes ready, it pulls that line high, and a single AND gate says both ready's are high and the frontend moves to the next instruction. This is literally picoseconds, the time of a single gate delay and a short transmission delay.

You said "waiting for a reply would take longer than the assumption that goes with clock synchronicity." The assumption that goes with clock synchronicity is that the clock has to be calibrated to the worst case add. An adder specced at, say, 5 ns can probably vary anywhere from 2 to 5 ns, with 5, the spec, being its worst case (adding a zero is a lot faster than adding all ones). But it would have to be clocked with a 5 ns clock always. The time spent "waiting" for a reply is picoseconds. Surely it is worth the extra few picoseconds to save nanoseconds.
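
To put rough numbers on that argument, here is a minimal sketch in Python. Only the 2 ns and 5 ns figures come from the post; the 50 ps handshake cost, the uniform spread of add times, and the function names are assumptions made for illustration, not measurements of any WIZ design.

```python
import random

SPEC_NS = 5.0        # worst-case adder delay quoted in the post
BEST_NS = 2.0        # fastest add quoted in the post
HANDSHAKE_NS = 0.05  # ASSUMED: 50 ps for the ready line plus one AND gate


def synchronous_time(num_adds):
    """Clocked design: every add is charged the worst-case period."""
    return num_adds * SPEC_NS


def asynchronous_time(add_delays_ns):
    """Ready-signal design: each add costs its real delay plus the handshake."""
    return sum(delay + HANDSHAKE_NS for delay in add_delays_ns)


rng = random.Random(0)  # fixed seed so the run is repeatable
# ASSUMED: add times spread uniformly between the best and worst case.
delays = [rng.uniform(BEST_NS, SPEC_NS) for _ in range(1000)]

sync_total = synchronous_time(len(delays))
async_total = asynchronous_time(delays)
print(f"synchronous:  {sync_total:.1f} ns")
print(f"asynchronous: {async_total:.1f} ns")
```

Under these assumptions a single worst-case add is slightly slower with the handshake (5.05 ns against 5 ns), but the total over many adds comes out lower because most adds finish well before the worst case.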

Now, to your question: "if there is no external interrupt triggering mechanism then how does this keyboard work if only internal registers can trigger software interrupts?"

Well, the keyboard is connected to an internal register, which in turn causes the interrupt. There would probably be a keyboard-status register, a keyboard-lastKeyPressed register, and maybe a keyboard-interruptAddress register. If the status register said we are enabled, and a key was pressed, we'd latch it into the lastKeyPressed register and flag an interrupt.
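
The register behaviour described above could be modelled roughly as follows. This is only a sketch in Python: the class, its attribute names and the `acknowledge` step are hypothetical, chosen to mirror the three registers named in the post rather than any actual WIZ specification.

```python
class KeyboardRegisters:
    """Hypothetical backend logic for the keyboard registers in the post."""

    def __init__(self, interrupt_address=0):
        self.status_enabled = False                 # keyboard-status: enable bit
        self.last_key_pressed = None                # keyboard-lastKeyPressed
        self.interrupt_address = interrupt_address  # keyboard-interruptAddress
        self.interrupt_pending = False              # flag the frontend would see

    def key_event(self, key_code):
        """External key line: latch and flag only when the status bit is set."""
        if self.status_enabled:
            self.last_key_pressed = key_code
            self.interrupt_pending = True

    def acknowledge(self):
        """Frontend takes the interrupt: clear the flag, return the handler address."""
        self.interrupt_pending = False
        return self.interrupt_address


kbd = KeyboardRegisters(interrupt_address=0x40)
kbd.key_event(ord("a"))        # ignored: keyboard not enabled yet
kbd.status_enabled = True
kbd.key_event(ord("b"))        # latched, interrupt flagged
print(kbd.last_key_pressed, kbd.interrupt_pending)
```

The point of the sketch is that the external key line never reaches the frontend directly; it only changes register state, and the register's own logic decides whether to flag an interrupt.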

I am still working on the "Interrupts" page, so I should be clearer about it (hey, the page does say "This page very much under construction"). Of course there can be external interrupt lines. I'm just saying they don't connect directly to the frontend interrupt mechanism, they go through some logic. On almost all processors this is also true, but the logic is thought of as being part of the machine rather than separated into a frontend and a backend in registers. Minimally, you want to AND the interrupt with an "enable" status bit, so you can control it. Most processors have some sort of "interrupt controller" which minimally checks enable status, and also handles some simple priority schemes, etc. On the WIZ, this is the backend logic of the "Interrupt-status" register, if you want one. That's all I meant by the comment that a register causes an interrupt, not an external line. There IS an external line, it just goes "through" a register on its way to the front-end trigger.

On a truly minimal WIZ implementation, you could have a single interrupt line which goes straight to the frontend interrupt latch. It would be a non-maskable interrupt, and you'd also have no interrupt-address register, so you'd have to default to something like location 0 for the handler. That would all be ok, and actually, since I do intend the WIZ for a niche of minimal applications (WIZ in a shoe to count steps), it might even be common. I will change my "Interrupt" section to note this -- thanks for the question...

Now, on to "in the stack, can you point past the bottom or is it just push/pop?". No, it is just push/pop. You don't point to it at all. There is no stack pointer, the "stack" is a single register. You write to it to push, read from it to pop. It handles the rest independently.

Now this is a hardware supplied stack, the advantage of which is that it can be blazingly fast (it runs at internal flip-flop speed instead of memory speed). The STACK is another register which would typically be "ready" even before the instruction to load it started, and so would run at full WIZ speed. (Although, to be fair, a deep set of pushes which overflowed the hardware buffer size will slow it down to memory speed again, but we hope that won't happen much.)
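
A rough behavioural model of such a stack register, written in Python, might look like the sketch below. The buffer depth of 4, the spill-to-memory policy and all names are assumptions for illustration; the post only says that writes push, reads pop, and that overflowing the hardware buffer falls back to memory speed.

```python
class StackRegister:
    """Sketch of a write-to-push, read-to-pop register with a small fast buffer."""

    def __init__(self, hw_depth=4):
        self.hw_depth = hw_depth   # ASSUMED size of the on-chip buffer
        self.fast = []             # newest entries, held at flip-flop speed
        self.spilled = []          # older entries pushed out to memory
        self.memory_accesses = 0   # counts the slow operations

    def write(self, value):
        """Push: writing to the register adds a new top of stack."""
        if len(self.fast) == self.hw_depth:
            # Buffer full: the oldest fast entry is spilled to memory.
            self.spilled.append(self.fast.pop(0))
            self.memory_accesses += 1
        self.fast.append(value)

    def read(self):
        """Pop: reading the register removes and returns the top of stack."""
        if not self.fast:
            raise IndexError("read from empty stack register")
        value = self.fast.pop()
        if self.spilled:
            # Refill the bottom of the fast buffer from memory.
            self.fast.insert(0, self.spilled.pop())
            self.memory_accesses += 1
        return value


stack = StackRegister(hw_depth=4)
for n in range(1, 7):          # six pushes overflow the four-entry buffer
    stack.write(n)
popped = [stack.read() for _ in range(6)]
print(popped, stack.memory_accesses)   # [6, 5, 4, 3, 2, 1] 4
```

Shallow push/pop sequences never touch `spilled`, which is the fast case the post is counting on; only the two overflowing pushes and their matching refills cost memory accesses here.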

You can ALSO create a "regular" stack if you want: you use a register as a pointer, point to a section of memory, you increment/decrement the pointer, etc. That works too. I just wanted to ALSO provide a hardware stack because straight push/pop operations are common and could benefit greatly from a speedup.

And finally, it would be possible to implement a STACK hardware register that worked more like "traditional" ones, which would be a bit faster and easier than doing it all yourself, but still basically limited to memory speeds.

Accessing the n'th item down the stack, in a direct manner, is often used to retrieve arguments in many argument passing conventions. While you could still do that, I would put forth that because the hardware stack is so much faster than a memory based stack, if you adopted a slightly different protocol in subroutines you would still benefit greatly. That would be, push stuff on the stack as always, but in the subroutine header, instead of accessing the stack directly and then decrementing the stack pointer by N before returning, you could just do "T1 <= STACK; T2 <= STACK; T3 <= STACK; etc; TN <= STACK;" where TN = N'th temp register and N = number of arguments. If N isn't too large, this is still going to be way faster than memory accesses, I think.
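
The calling protocol above can be sketched in Python as follows, with a plain list standing in for the STACK register. The helper name and the example values are hypothetical; the one thing the sketch makes explicit is the ordering, since T1 receives whatever was pushed last.

```python
def call_with_stack_args(stack, num_args, body):
    """Subroutine header: pop num_args values into temps T1..TN, then run body."""
    # Equivalent of "T1 <= STACK; T2 <= STACK; ...; TN <= STACK;"
    temps = [stack.pop() for _ in range(num_args)]
    return body(*temps)


stack = []                     # a plain list stands in for the STACK register
for arg in (10, 20, 30):       # caller pushes its arguments in order
    stack.append(arg)

result = call_with_stack_args(stack, 3, lambda t1, t2, t3: (t1, t2, t3))
print(result)                  # (30, 20, 10): T1 holds the last value pushed
print(stack)                   # []: nothing left to clean up before returning
```

Because the pops themselves consume the arguments, there is no stack pointer to adjust before returning, which matches the protocol described in the post.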

Finally, yes, I have no bananas. This is all theory. Well, I have written an assembler and emulator, and successfully executed code, but that emulator is at register level, not at gate level, and embodies no circuitry or timing. All I have used it for is to test basic programming concepts (like whether you can write a sort algorithm with this register set, etc.). I'm looking for funding to build a prototype. If you know anyone with a spare couple of hundred thousand dollars, please let me know!!!!

Steve.
Posted on 2004-04-20 20:10:28 by Steve Bush
Well, it sure sounds like you've got your ducks in a row. This may be a ridiculous question, but do you have these concepts patented? If all works out in theory then I would think you're on to something. I may have some contacts for you. I wish I were involved in this actually. It sounds so interesting. I've always wanted to get my hands on the Xsistor schematics of say Intel's 4004. There are some circuits I'm not sure of like the Z80's micro-program controller and its rival counterparts, you know, other processors. I believe it's just an internal ROM but what does the program do? We want source code...lol.

P.S.
I like that name "temp register", it sounds better than the accumulator or ich..the W register.
Posted on 2004-04-21 14:35:09 by mrgone
As I have posted lots of details of the WIZ on the www for several years, it cannot be patented. It is freely available for anyone to steal! I'm not too worried; (1) it's hard enough to get anybody to be as interested in it as I am, and (2) it's not cheap to manufacture a processor, only a bigco (Intel, IBM, ARM) has the capital to really do it right, and they are so locked in to their own hell, er I mean, way of doing things, that they wouldn't be interested. So I just post it all!

I have found a business partner and market niche (a small, radio-linked, self-powered WIZ processor embedded in primitive devices like switches, sensors, motors, etc, which would then be OEM'd to refrigerator manufacturers and the like), and we are looking for funding to do it ourselves. Actually, I think I could get it manufactured in slow and minimal form on an FPGA or some such thing, for something like $20 thousand. But I also need a salary for myself for a year! And then we have to market it, that means another person's salary. Probably $250,000 is the minimum to "startup", for the first year, and then you don't even get a product until the second year, etc, etc.

So right now, it remains a spare time "hobby". If you are interested, I'd be happy to talk more. But as Tom Cruise put it, "Show me the money!" is the bottom line!

Another option is for me to walk into a bigco, especially ARM, and say, "hire me and let me develop this product for you". I'm not looking to make millions, I just want a good job and a steady salary. And maybe a piece of the profits in the deal... That is kinda my backup plan.

PS: Our very rough draft "business proposal" is here:
http://www.steve.bush.org/WizdomR&D/WIZObjectProp04012004.doc
Posted on 2004-04-21 17:04:39 by Steve Bush
Yeah, unfortunately the people I know won't budge without a patent. I'm pursuing another patent right now. Also if you need help with anything let me know. I'm an old hand with RF. One of my patents is an RF product. I don't see why you couldn't begin a patent because you know AMD copies everything Intel makes. Do a search on new processor concepts and if yours hasn't been submitted then start it. Your investors will be wanting something unique with some protection to go along with it.
Posted on 2004-04-22 11:03:05 by mrgone
I've talked to some patent lawyers about this. The thing is, I have fully disclosed everything on the internet for several years. This makes it un-patentable, as I have legally shown my intent to "give it all away". Besides, with a weird sort of notion of honor, I don't want a patent. I think science should be freely shared. Note the "copyleft" symbol at the bottom of all my pages. You can click on it to read the definition of what that means; but it basically means "Please copy - copy all you want! (just don't take it out of context and always continue to give me credit)".

Anyway, thanks for your support. Check out my website from time to time and I'll keep you posted on the latest developments!

Steve.
Posted on 2004-04-22 15:07:24 by Steve Bush
The MASM Forum thread may be done, but at least this one isn't.
Posted on 2004-04-22 19:07:44 by NoDot
I have an idea to start a new thread... But I'm at my day job now, it'll have to wait until tonight...

New thread coming! News at 11...
Posted on 2004-04-22 20:08:46 by Steve Bush
Sorry, no time... Answered the thread in the MASM forum instead.

I was thinking of starting a new thread more like:
Dear friends: I have this processor, I call it the WIZ, it's not x86 compatible, and it has this funny assembler language. I want to test the language and see if it is hard or easy to express various snippets of code. I want to see if I left anything out, like if there's something that can't be done or is hard to do.

So what I'm looking for you to do to help me is to submit to me some little snippet of asm code, and I'll try to re-code it in WIZ assembler language. Just to see if I can, if it takes too many instructions, if it is hard, etc. And please give me a higher level description (like C equivalent) too, so I know what it's *supposed* to do! I'm not an x86 asm programmer, but I can follow it ok. But this isn't a contest to see if I can figure out your ingenious code - please make it some *simple* stuff!

I'll start by posting this bubble sort algorithm from the Art of Assembly Language. Here is my version:

... cut/paste here ...

and here is the asm version:

... cut/paste here ...

OK, now it's your turn. Anybody got a (preferably very small and simple) snippet of code for me to try?

What do you think, would that make a good post? What section of the forum should I post it in?

Steve.
Posted on 2004-04-23 04:23:07 by Steve Bush
Hi

This is my code :
movaps ,xmm0

What it does is a (aligned) 16-byte copy to memory, from an XMM register (128 bits).
I do not remember the number of bits you intend to implement. 256 would be great, for ASCII chars for example, and for a full 1-byte handling.

Another one :
pmovmskb eax,mm0. IIRC, AMD64 made the same with xmm0. Here mm0 is 8 bytes.
What it does :
it takes the highest bit of each byte of mm0 (8 bytes).
It puts these into al (the lower 8 bits of 32-bit eax) and zeroes the other 24 bits.

Another one :
pminub mm0,mm1
It does 8 (unsigned byte) min() operations on each byte of mmX registers.
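
For reference, the last two instructions can be modelled in software as below. This is a Python sketch of the documented x86 behaviour, not WIZ assembler (whose syntax isn't given in this thread); the function names simply reuse the mnemonics, and an MMX register is represented as 8 bytes with index 0 as the least significant byte.

```python
def pmovmskb(mm):
    """Collect the top bit of each of the 8 bytes into the low 8 bits of a result."""
    if len(mm) != 8:
        raise ValueError("an MMX register holds exactly 8 bytes")
    mask = 0
    for index, byte in enumerate(mm):
        mask |= (byte >> 7) << index   # bit `index` = most significant bit of that byte
    return mask                        # fits in al; the upper 24 bits of eax are zero


def pminub(a, b):
    """Byte-wise unsigned minimum of two 8-byte registers."""
    if len(a) != 8 or len(b) != 8:
        raise ValueError("an MMX register holds exactly 8 bytes")
    return bytes(min(x, y) for x, y in zip(a, b))


mm0 = bytes([0x80, 0x00, 0xFF, 0x7F, 0x00, 0x00, 0x00, 0x81])
mm1 = bytes([0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80])
print(hex(pmovmskb(mm0)))      # 0x85: bytes 0, 2 and 7 have their top bit set
print(pminub(mm0, mm1).hex())  # 1000303f00000080
```

Each function is a loop over 8 independent bytes, which is why a processor without packed-byte operations would need a loop or an unrolled sequence to reproduce a single one of these instructions.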


I fear that for MMX/XMM emulations, your asm takes dozens of instructions :(

Regards
Posted on 2004-04-23 05:36:03 by valy
The Heap is for group discussions on nothing specific or on your own subject. Everyone checks it from time to time. Give it a whirl.
Posted on 2004-04-23 09:39:48 by mrgone
What if I want to connect a synchronous device to yours?

Hmm okay maybe put some kind of sync that the async logic treats as a flag.

Pure async design is rarely found in digital design, because it's very difficult.

By the way I think I designed a theoretical ALU with a few similarities to your ISC, except that I have no registers at all (I control data paths through the ALU, selecting where each portion of the ALU connects to which subcomponent). It wouldn't be a full processor.

However how about this suggestion, that we separate the backend and frontend, and put all the backend into such a reconfigurable ALU? We have a set of registers, then we have a reconfigurable ALU? Each portion of the reconfigurable ALU has an "ENABLE IN" input which goes through a bunch of inverters for delay, with at least one delay further than the expected delay from that circuit? This "ENABLE OUT" output connects to the "ENABLE IN" of the next portion of the ALU (since the ALU is reconfigurable) which would then signal to the next stage to begin. When the ENABLE OUT signal finally reaches the last ALU portion, we know the ALU has completed and we can then load the destination register. Then the instruction simply defines the configuration of the ALU, and perhaps which of the registers the destination goes to.
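
The timing of that enable chain can be sketched numerically. The Python below is only an illustration: the 20 ps inverter delay, the per-stage circuit delays and the function names are made up, and the model counts delay only (it ignores signal polarity and race conditions).

```python
import math

INVERTER_DELAY_PS = 20  # ASSUMED delay of a single inverter


def enable_delay_ps(circuit_delay_ps):
    """Delay of a stage's inverter chain: enough inverters to cover the
    expected circuit delay, plus one extra inverter of margin."""
    inverters = math.ceil(circuit_delay_ps / INVERTER_DELAY_PS) + 1
    return inverters * INVERTER_DELAY_PS


def completion_time_ps(stage_delays_ps):
    """ENABLE OUT of each stage feeds ENABLE IN of the next, so the delays add."""
    return sum(enable_delay_ps(delay) for delay in stage_delays_ps)


# Hypothetical configuration: three ALU portions chained for one instruction.
configured_path_ps = [100, 90, 35]
for delay in configured_path_ps:
    print(delay, "->", enable_delay_ps(delay))
print("destination register can load after", completion_time_ps(configured_path_ps), "ps")
```

With these numbers the three stages get enable delays of 120, 120 and 60 ps, so the destination register would be loaded 300 ps after the first ENABLE IN. Unlike a true ready signal, the chain always waits for the expected delay of each stage, whatever the data.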

The major problem would be the registers racing, which is the main problem with async design. The design becomes more and more like analog design (with registers becoming amplifiers with feedback) when you use async design, because the voltages cannot settle quite as decently as when we have well-defined starts and stops.
Posted on 2004-10-07 04:17:15 by AmkG
steve,
Just out of curiosity but how is your processor thing going?
Are you a millionaire already? :)
Posted on 2004-12-14 17:59:37 by clippy