If a computer has a single CPU (with only one core), consecutive instructions are fetched into the pipeline.

Suppose the CPU is decoding an instruction when an interrupt suddenly happens. Does the CPU complete the instruction cycle of the current instruction before loading the interrupt service routine's instructions into the pipeline? Or does it populate the pipeline while finishing execution of the current instruction?

My second question: when new instructions are transferred into the pipeline, what happens to its previous contents? Are they transferred to another cache?

Posted on 2010-05-18 01:51:30 by logicman112

The interrupt line is only polled by the CPU at the start of the next clock cycle. Pretty much every chip on the motherboard is clocked at once: all the inputs to all those chips are read when the clock signals them to do so, and most chips have a !LOAD or !CENABLE pin attached to the clock signal line (regardless of whether the clock signal is internal to the CPU or not). In fact, it usually only takes a 'rising edge' of the clock signal to trigger an update/increment.
So, basically, the interrupt cannot be triggered until the clock signal goes from low to high, which won't happen until the end of the current clock cycle.

When the interrupt occurs, the CPU will save the current value of EIP and load a new EIP value from the zero page of memory, where the interrupt vector table resides. These entries are pointers, usually to routines provided by the BIOS.

Execution continues at the interrupt handler's code (the pointer in the table) until it completes with IRET.
Then the old EIP is restored by the CPU (as if we had called a proc) and we're back to normal.

Posted on 2010-05-18 03:45:40 by Homer
logicman112, almost all your questions from your last few threads are thoroughly explained in the "Intel 64 and IA32 Software Developer's Manual". I strongly recommend that you read this book.

In this particular question you are confusing the CPU itself with a CPU core. A CPU consists of MANY logical, pretty much autonomous, units. As Homer said, clocks dictate what can and can't be done at any given moment. All Intel/AMD CPUs' clocks are explained in their respective manuals. As for the core: how a core operates (and, actually, how ANY part of a CPU operates) is strictly defined by that CPU's architecture (in other words, it varies from one architecture to another). Read the manuals and you will know what is done, how, and where in pretty much all major x86-based CPUs ^^
Posted on 2010-05-18 15:35:57 by ti_mo_n
Thanks, ti_mo_n, for the reply. My purpose is to discuss the Intel documents further and to gain a clear understanding of the CPU and other chips, so that I can write strong and efficient assembly programs.

A drawback of the Intel manuals is that they mention things without explaining them in place. That is OK, because it is the nature of references. One way to overcome this is to keep questions in mind and continue reading until they are answered. The result is a brain populated with lots of questions that are answered so late that most of the time the person forgets them, or even becomes reluctant to seek the answers.

This is contrary to mathematics books, where the book proves some theorems and then concludes others based on the proved ones. The bad thing about engineering texts is that they talk about many components without defining their attributes first, forwarding the reader to other books, while those books forward the reader to yet other books, and sometimes the last book returns you to the first one!

ti_mo_n: logicman112, almost all your questions from your last few threads are thoroughly explained in the "Intel 64 and IA32 Software Developer's Manual"

ti_mo_n: All Intel/AMD CPUs' clocks are explained in their respective manuals.


As far as I know there are no timing diagrams, CPU clock details, or thorough discussion of the micro-architecture in the Intel 64 and IA32 Software Developer's Manual, nor even in the processor data sheets.
They only give a logical software view, which is not a complete picture of what is happening, and that affects an assembly programmer's work negatively. Imagine we have a single-task OS and I am running an application: should I use the LOCK prefix on an instruction when I am sure that the next instructions do not reference memory? Is it possible that an interrupt happens and references memory? Does the processor finish the current instruction cycle before dealing with the interrupt? (I guess the answer is yes, so there is no need for LOCK here.)

ti_mo_n: In this particular question you mistake the CPU itself with CPU core.


I did know about hyper-threading (logical CPUs), multi-core, and multi-processor technologies. My question was about the general policy of Intel micro-architectures with only one CPU: when an interrupt comes, does the CPU finish the current instruction cycle, or does it try to decode and execute it along with the instructions of the interrupt service routine?

In my opinion I asked a fine question whose answer cannot easily be found by reading the manuals. Besides, these types of questions help clarify the exact functioning of Intel processors, ultimately help in understanding assembly language better, and, I hope, do not waste people's time. They are a kind of practice for responders and help low-level programming.



Posted on 2010-05-19 00:17:45 by logicman112
You are still confusing a CPU (the complete package) with its core (one subcomponent). A core fetches instructions from the L1 instruction cache, decodes them, executes them, and retires them (reading from / writing to the L1 data cache in the meantime and also changing values in its registers). It has some internal registers (general-purpose registers, temporary registers, etc.). And that's pretty much it. The core doesn't know what an interrupt is. All it knows is how to read the instruction pointed to by its RIP/EIP/IP from the L1 instruction cache, read data from the L1 data cache, and write data to the L1 data cache. Everything else is the job assigned to other components. The L1 caches know how to read data from the L2 cache and write data back to it, and they can copy any data from the L2 cache (this is what is actually called 'caching'). The L2 cache can order reads/writes from/to logical addresses. The MMU assigns addresses, actually performs the reads/writes, etc. Every component has its role and they are all independent, yet synchronized. "Synchronized" means that they have precisely defined states in which they do precisely defined work and expect precisely defined data.
The unit responsible for receiving, decoding, queueing, signaling, prioritizing, sending, and synchronizing any and all interrupts is called the LAPIC. It IS explained how and when it does its job. I even remember it saying which instructions an interrupt can be triggered between and which it can't. Timing diagrams are necessary only if you want to match two devices so they can communicate. Once they are matched and proper protocols are established, the only thing you should be concerned about is their STATEs (usually symbolized by the device's registers, especially "flags" of any sort).
To sum things up:
- CPU's core is happily executing instructions.
- The I/O APIC (or some external LAPIC, possibly from another CPU) sends an interrupt packet to our CPU.
- Our CPU's LAPIC processes it accordingly and signals a flag (I don't remember now which one it was, possibly the "I" flag and a few others). The core doesn't care right now; it's still happily executing instructions.
- As soon as the CORE is in a precisely defined state, its execution is halted, the instruction queue is flushed, and its stack pointer, instruction pointer, and a few more registers are assigned new, previously defined (usually by the OS's interrupt-handling code) values. In other words, the CPU's core performs a far jump to a previously defined interrupt-handling routine.
- At ANY time a second (or even more) interrupt may be received by the CPU's LAPIC. What happens then is explained in Intel's/AMD's respective manuals.

No need for any timings: you just read what states are available for a given subcomponent, how it enters/leaves these states, and what can and can't be done in them.

And on the lowest level, as Homer said, everything is synchronized by CPU's internal clocks, usually derived from FSB clock.

I hope this clears more than it shrouds ^^' Feel free to ask any more specific questions!
Posted on 2010-05-19 14:49:04 by ti_mo_n
Well done ti_mo_n and thank you for your good explanation.

ti_mo_n: You are still mistaking a CPU (complete package) with its core (one subcomponent).


How did you find that out? My information about the micro-architecture is not complete or dependable. I only know that each CPU and physical package has one or several pipelines, and that instructions are decoded and executed in these pipelines.

You talked about some states: when the CORE is in a defined state, other logic halts it and new instructions may enter the pipeline. I want to know what this state is. It seems that once the execution cycle of an instruction has started, it continues to the end, except for some FPU floating-point instructions.

ti_mo_n: As soon as the CORE is in a precisely defined state..., its execution is halted, the instruction queue flushed,...


Is the instruction queue flushed, or are the instructions transferred to the L2 cache or some other cache?


Posted on 2010-05-22 00:45:17 by logicman112

How did you find that out? My information about the micro-architecture is not complete or dependable. I only know that each CPU and physical package has one or several pipelines, and that instructions are decoded and executed in these pipelines.

It's in Intel's manuals. They call it something like "execution engine" IIRC.

You talked about some states: when the CORE is in a defined state, other logic halts it and new instructions may enter the pipeline. I want to know what this state is. It seems that once the execution cycle of an instruction has started, it continues to the end, except for some FPU floating-point instructions.

Modern CPUs perform incredible optimizations. Trying to know what EXACTLY a CPU is doing at a given moment is not only difficult but also unnecessary. There are a few golden rules of writing good assembly code, and that's pretty much all you can remember "generally". If you want to know whether a specific piece of code is faster or not, you have to test it. The complexity of modern CPUs is so high that it's very difficult, sometimes impossible, and always impractical to try to "guess" whether something is better or worse. You just write the code and see how it performs.
All the above sums up to this: some cores may halt on this, other cores may halt on that. Some architectures may halt at everything, others may halt at nothing. Even a newer revision of the same CPU may behave differently. In the times of the 486 you could spend time learning what exactly it was doing at precisely defined times. Today it's difficult and pointless.

Is the instruction queue flushed, or are the instructions transferred to the L2 cache or some other cache?

They are flushed from the execution queue. I was talking about the instructions fetched and decoded in advance. Modern cores fetch instructions in advance (while executing the current one), predict any branches, and then fetch even more instructions in advance. If a branch prediction fails, the core has to flush the decoded instructions and fetch new ones. This is very costly. That's why one of the golden rules of good coding says "avoid any unnecessary branches". The CMOV instruction is of big help here. And, of course, you should always try to process data in groups with SSE instructions.
Posted on 2010-05-22 06:33:47 by ti_mo_n