Very interesting article for us:
http://www.gotw.ca/publications/concurrency-ddj.htm
an interesting quote from it:
"Applications will increasingly need to be concurrent if they want to fully exploit continuing exponential CPU throughput gains
Efficiency and performance optimization will get more, not less, important"
[...]
"Those languages that already lend themselves to heavy optimization will find new life; those that don?t will need to find ways to compete and become more efficient and optimizable"
hooray! :)
Thing is, assembly doesn't lend itself to heavy optimization - it requires heavy optimization.
Interesting article, though a little too over-dramatic if you ask me.
The article seems to state the obvious: processor speeds are reaching their limits with current technology. So naturally the next objective should be to increase how much we can do with that speed.
The author touches on some important factors: rising clock speeds, shrinking instruction cycle times, and the growth of the on-chip cache. He also looks ahead to true multi-core processors that utilize shared resources.
What is missing from this article is an in-depth look at what it would take to bridge the gap between the structure of current applications and applications that would better utilize multi-processing. The article's statement that "Now, not all applications (or, more precisely, important operations of an application) are amenable to parallelization." should be broken down further into practical terms.
The quote generally refers to processes or threads that each have a unique role. This is in contrast to the old days of DOS and the like, where execution was expected to be single-threaded. Today's operating systems and applications are already generally designed to take advantage of multi-threading.
What the author should have done is simply state that with the introduction of hyperthreading and multi-core processors, the program flow in applications needs to be refined and become more and more specific to particular tasks. With that in mind, programmers will be able to modify their applications in whatever programming language they wish in order to take advantage of future multi-core multi-threading.
The needed conversion is not as dire as the author proposes. Quick-and-sloppy programmers will just have to start following good programming practices, simple as that.
The one thing mentioned in this article that I truly despise is the use of reordering. If you change the order of execution at run-time, the behavior of the program can change dramatically. Reordering and restructuring should be the job of the assembler/compiler's optimization passes, not the processor.
Many processors already reorder instructions on the fly, but this is limited by the instruction set - how much can be done without execution resources overlapping? There are limitations within the processor, but these have been refined against the instruction set through simulations running current software. Within the software lie the limitations of the programmer/compiler. Intel has experience designing compilers for diverse CPUs, and much research has been done in this direction. I'm assuming they have concluded that sufficient gains are possible through other types of granularity.
Reordering can work at a higher level within the processor if separate threads are known to exist within the same memory space. Of course, ignoring any protection between the threads will gain the most speed. Maybe even design a lopsided processor where resources are not equal across cores - hopefully the OS/programmer schedules the threads accordingly. Reordering will be where many gains are achieved in the multi-core arena, imho.
AMD has increased the opportunities for reorder optimization at the instruction level by adding more registers - basically exposing the temporary registers within the reorder buffers. This has the effect of widening the base of the design. Looking forward, this also increases the types of optimization available at the core level. Rather than continuing to deepen the instruction pipe, I hope it gets wider and cross-core designed.
Before someone screams RISC - that is already taking place within the x86 designs - "pure" RISC CPUs seem to have just as many problems when it comes to general performance. Sure, they can execute very fast for specific tasks - like a GPU - but that is comparing apples to oranges. General code just branches too dynamically for deep instruction pipes with a bazillion execution units.
Programmers are already thinking and programming from the multi-threaded perspective, but the benefits of that are not being realized by JoeComputerUser. Not to mention the marketing benefits of multi-cores! People I talk with don't have a clue what is in their machine - let's get some more flashy names or numbers to throw around. :)
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2343&p=1