Hello friends.

I was reading a book lately, and the wise guy there said one
can group two asm instructions in such a way that they will be executed simultaneously
by different pipelines or something, called U and V (in the Pentium, as far as I understood).

I would like to hear more about this feature of the Pentium CPU.

Especially about how to apply this in code. What are the rules for coding that way? etc.
Posted on 2010-03-22 11:05:47 by Turnip
Avoid reading from a register/memory location immediately after writing to it.

This is it in a nutshell :P
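
To make that concrete, here is a minimal sketch in Python. It is a toy model, not real assembly: the tuple layout (mnemonic, destination, sources) and the name reads_after_write are made up for illustration, and memory operands are reduced to the register that holds the address.

```python
# Toy model: an instruction is (mnemonic, destination, sources).
# Register names are plain strings; nothing here runs as real assembly.

def reads_after_write(first, second):
    """Return True if `second` reads the register that `first` writes."""
    _, dest, _ = first
    _, _, sources = second
    return dest in sources

# mov eax, [esi]  followed by  add ebx, eax: the add needs the fresh eax.
load = ("mov", "eax", ("esi",))
use = ("add", "ebx", ("ebx", "eax"))

# An unrelated instruction that could be moved in between instead.
other = ("add", "ecx", ("ecx", "edx"))

print(reads_after_write(load, use))    # True: avoid placing these back to back
print(reads_after_write(load, other))  # False: safe neighbour for the load
```

The idea is simply to put something independent (like `other`) right after the write, and push the dependent read further down.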
Posted on 2010-03-22 11:15:42 by ti_mo_n
Some info here:

http://www.asdf.org/~fatphil/x86/pentopt/10.html

Pretty much obsolete, I think...

Best,
Frank

Posted on 2010-03-22 11:38:01 by fbkotler
Thanks timon & fbkotler. :P


why obsolete?
Posted on 2010-03-22 12:10:28 by Turnip

> why obsolete?


Since the original Pentium, we have seen the introduction of things such as instruction re-ordering, SSE, hyperthreading and multiple cores. I would be willing to bet that there are more than just "two pipelines" these days.
Posted on 2010-03-22 12:29:07 by SpooK
Correct. The Pentium was the first x86 to have a superscalar architecture: it had two parallel instruction pipelines, named U and V, capable of executing two instructions in parallel.
This is very specific to the Pentium architecture. The architecture was in-order, which means that instructions were executed in exactly the order in which they appeared in the code. In other words, the assembly programmer or compiler was responsible for ordering instructions so that adjacent ones could be executed in U and V in parallel (there were various rules: you cannot execute an instruction in V if it depends on the result of the instruction in U, V did not support all instructions, etc).
In practice this meant that most code wasn't all that efficient.
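
As a rough illustration of why ordering mattered, here is a toy Python model of in-order pairing. It is a sketch under heavy assumptions, not the real Pentium rule set: instructions are (mnemonic, destination, sources) tuples, every issue slot costs one cycle, flags and memory are ignored, and the U_ONLY set is just an illustrative sample of instructions that could not go into V. All names are hypothetical.

```python
# Assumed sample of mnemonics that cannot issue in the V pipe (illustrative only).
U_ONLY = {"shl", "shr", "adc", "sbb"}

def can_pair(u, v):
    """Simplified check: may `v` issue in V alongside `u` in U?"""
    _, u_dest, _ = u
    v_name, v_dest, v_sources = v
    if v_name in U_ONLY:
        return False          # V does not support this instruction
    if u_dest in v_sources:
        return False          # V would read what U is still writing
    if u_dest == v_dest:
        return False          # both would write the same register
    return True

def count_issue_cycles(program):
    """Walk the program in order, pairing neighbours whenever allowed."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            i += 2            # both pipes used this cycle
        else:
            i += 1            # only U used this cycle
        cycles += 1
    return cycles

# Same four instructions, two different orderings.
back_to_back = [
    ("mov", "eax", ("esi",)),
    ("add", "eax", ("eax", "ebx")),   # needs eax from the line above
    ("mov", "ecx", ("edi",)),
    ("add", "ecx", ("ecx", "edx")),   # needs ecx from the line above
]
interleaved = [
    ("mov", "eax", ("esi",)),
    ("mov", "ecx", ("edi",)),
    ("add", "eax", ("eax", "ebx")),
    ("add", "ecx", ("ecx", "edx")),
]

print(count_issue_cycles(back_to_back))  # 3
print(count_issue_cycles(interleaved))   # 2
```

In this model the interleaved ordering does the same work in fewer issue slots, purely because each neighbouring pair is independent.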

The Pentium Pro was Intel's first out-of-order x86 architecture, which meant that it would decode a window of upcoming instructions, and then it would attempt to re-order them to make the best possible use of its superscalar architecture. This means that you no longer rely on the assembly programmer or compiler for the actual scheduling of instructions.
Instead of pipelines, Pentium Pro speaks of 'execution ports'. It's slightly different. There are a number of different ports, each one capable of processing a subset of instructions: a set of 'mini pipelines' which together form 'the pipeline'.
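
A toy Python sketch of the out-of-order idea, again under stated assumptions: instructions are (mnemonic, destination, sources) tuples, every instruction takes one cycle, the whole program fits in the re-order window, and `width=2` is an arbitrary issue width chosen for illustration rather than the real Pentium Pro figure. The function names are hypothetical.

```python
def true_dependencies(program):
    """For each instruction, the indices of earlier instructions it must wait for."""
    last_writer = {}
    deps = []
    for index, (_, dest, sources) in enumerate(program):
        deps.append({last_writer[r] for r in sources if r in last_writer})
        last_writer[dest] = index
    return deps

def count_ooo_cycles(program, width=2):
    """Each cycle, issue up to `width` instructions whose inputs are ready,
    regardless of where they sit in program order."""
    deps = true_dependencies(program)
    done = set()
    cycles = 0
    while len(done) < len(program):
        ready = [i for i in range(len(program))
                 if i not in done and deps[i] <= done]
        done.update(ready[:width])   # oldest ready instructions first
        cycles += 1
    return cycles

# Same four instructions, in a "bad" and a "good" order for an in-order CPU.
back_to_back = [
    ("mov", "eax", ("esi",)),
    ("add", "eax", ("eax", "ebx")),
    ("mov", "ecx", ("edi",)),
    ("add", "ecx", ("ecx", "edx")),
]
interleaved = [
    ("mov", "eax", ("esi",)),
    ("mov", "ecx", ("edi",)),
    ("add", "eax", ("eax", "ebx")),
    ("add", "ecx", ("ecx", "edx")),
]

print(count_ooo_cycles(back_to_back))  # 2
print(count_ooo_cycles(interleaved))   # 2
```

In this model both orderings finish in the same number of cycles, because the scheduler picks up the independent instructions on its own. Only a genuine dependency chain still limits throughput.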

Both AMD and Intel CPUs today are still very close in concept to the original Pentium Pro.
This means that an assembly programmer or compiler mainly has to worry about optimizing the code for the decoder (and picking the fastest instructions for the job, not legacy ones). Instruction scheduling and parallelism are extracted by the CPU on-the-fly, through its out-of-order mechanism.

Having said that, it's not *entirely* obsolete, since Intel's Atom processor is an in-order architecture, just like the Pentium. I haven't studied the architecture in detail, but I wouldn't be surprised if it works very similarly to the U and V pipes in the Pentium.
Posted on 2010-03-22 13:33:02 by Scali
thanks for explaining, guys.
Posted on 2010-03-28 04:57:38 by Turnip