The Microprocessor
From ASM Book
The microprocessor is the most central part of a computer. Everything a computer can do is determined by the capabilities of the microprocessor inside it. A microprocessor, generally, reads data from memory, works on it, and writes the result back to memory. It also performs many additional operations including arithmetic, logic, and input-output.
Microprocessors have quite a history. The invention of the integrated circuit (IC), the containment of an entire CPU on a single chip, and the introduction of the IBM PC are particularly important events from the historical viewpoint. Building an entire CPU on a single chip for the first time was a great achievement for Intel Corporation (founded by Dr. Robert Noyce and Gordon Moore), and it is one of the main reasons why this CPU-on-a-chip was trade-named the microprocessor.
A microprocessor is manufactured by placing extremely tiny transistors on extremely small semiconductor integrated circuits. Older CPUs were built from vacuum tubes and, later, from separate transistors, which made the computers very large. The chips used in more recent microprocessors are silicon dies whose features are incredibly small, measured in nanometers (10^-9 m). Since silicon is refined from sand, your microprocessor starts out as beach sand!
Intel Corporation has had a major hand in the development of the microprocessor, and its efforts should be well applauded. The name "Intel" derives from Integrated Electronics, just in case you wanted to know. However, Intel is not the only company manufacturing microprocessors; other competent companies such as AMD (Advanced Micro Devices), Motorola, and Cyrix also manufacture them. Some of these companies also roll out Intel-compatible microprocessors.
There are many families and generations of microprocessors, but we are going to study only those from the 80x86 family. When we use the term "80x86," it refers to both Intel-manufactured and Intel-compatible 3rd-party manufactured microprocessors. Details about any particular company otherwise will be specifically noted.
Basic Architecture
The architecture that the 80x86 microprocessor-based computers use is based on a fundamental architecture first proposed by Dr. John von Neumann in 1946. This basic architecture is, therefore, known as the von Neumann architecture. Although based on this architecture, the 80x86 microprocessor architectures are highly enhanced over it. As a result of technological innovations and clever marketing, the Intel 80x86 architecture has become the de facto industry standard.
The von Neumann Machine
A von Neumann machine is a stored-program computer that uses a single store for both data and executable instructions. This store in our computers is mostly semiconductor-based memory. A von Neumann machine has 5 parts: arithmetic-logic unit (ALU), control unit (CU), memory, input-output, and a bus. The ALU, CU and the bus are generally considered to form the CPU. Since von Neumann computers spend a lot of time moving data between memory and the CPU (slowing down processing considerably), the bus is usually replaced by a bus unit (made of multiple separate busses).
Many computers even today are based on this architecture, but several have additional enhancements made to them. Digital computers based on the von Neumann architecture loosely follow this pattern of operation:
1. Fetch the instruction at the current code location (pointed to by the program counter).
2. Update the current code location (add the length of the fetched instruction to the program counter).
3. Calculate addresses, if any.
4. Fetch any operands.
5. Perform the requested operation.
6. Store any results.
7. Go back to step 1 to execute the next instruction.
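The cycle above can be sketched as a toy interpreter. This is only an illustration: the opcode names (LOAD, ADD, STORE, HALT), the single accumulator, and the one-cell instruction format are invented here and do not correspond to any real 80x86 encoding.

```python
# Toy von Neumann machine: one memory list holds both instructions and data.
# The opcodes and the single accumulator are hypothetical, not 80x86.

def run(memory):
    pc = 0          # program counter: current code location
    acc = 0         # accumulator register
    while True:
        opcode, operand = memory[pc]      # fetch the instruction
        pc += 1                           # update the code location
        if opcode == "HALT":
            return memory
        address = operand                 # calculate the address
        if opcode == "LOAD":
            acc = memory[address]         # fetch the operand
        elif opcode == "ADD":
            acc = acc + memory[address]   # fetch the operand and operate on it
        elif opcode == "STORE":
            memory[address] = acc         # store the result
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        # the loop now goes back to fetch the next instruction


program = [
    ("LOAD", 4),    # acc = memory[4]
    ("ADD", 5),     # acc = acc + memory[5]
    ("STORE", 6),   # memory[6] = acc
    ("HALT", 0),
    2,              # data at address 4
    3,              # data at address 5
    0,              # the result is stored at address 6
]

result = run(list(program))
print(result[6])
```

Note that the program and its data share one list, which is the "single store" that defines a von Neumann machine.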
Inside the Microprocessor
For reasons that will soon become clearer, we begin our discussion of the workings of a microprocessor by highlighting a simple analogy between us and microprocessors.
A simple analogy
For a rough analogy, consider yourself and compare your brain with the microprocessor. When you were a toddler, a symbol such as µ wouldn't have made much sense to you except that it was a picture. As you grew up to become a kindergarten kid, you started identifying things, and learning the alphabet and the digits. Pictures started coming to life.
As you grew older, you came to know about how these individual picture symbols were grouped together to form words of various sizes and different meanings. Later on, you learnt about simple sentences, and then complex ones. You may also have used your index finger to point to words to easily locate them while slowly reading sentences. With age you began reading and comprehending entire paragraphs, and all this while you only got quicker and quicker at doing it.
A microprocessor works in a similar manner. It contains a bus interface unit that enables it to communicate with external devices and an execution unit that executes the instructions fed to it.
Bus Interface Unit
The microprocessor has a part called the bus interface unit (BIU), which establishes the communication link between the microprocessor and external devices. "Bus" is a general computer term for a pathway consisting of a number of electronic signal lines through which data and signals are transferred. The number of path-lines in a bus determines its size. Each signal-line can carry only one of two voltage values (high or low) at a time, thus signaling either a logic-1 state or a logic-0 state (a sort of yes or no) to the microprocessor. The BIU is primarily made up of three busses: an address bus, a data bus, and a control bus.
Address Bus
The address bus is much like your index finger, which you can use to locate words while reading them. It tells the microprocessor where to fetch data from or where to send it to. The location can either be in memory or it can be an input/output port (connecting to an external device). The address bus in an 8086 microprocessor has 20 signal-lines, and therefore, can only hold an address of size 20 bits. Since each bit can have only one of two possible states and a group of n bits can have only 2^n total possible states, the number of different locations that the 8086 can address is 2^20 = 1,048,576 or 1 M (one Meg). The Pentium, on the other hand, has a 32-bit address bus, and can address up to 2^32 = 4096 M = 4 Gig locations.
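The arithmetic is easy to check with a few lines of code. The function name below is made up for this sketch; the bus widths are the ones given above.

```python
def addressable_locations(address_lines):
    """Number of distinct addresses that n address lines can express."""
    return 2 ** address_lines


for name, lines in [("8086", 20), ("Pentium", 32)]:
    count = addressable_locations(lines)
    # 2**20 locations make up one "M" in the sense used above
    print(f"{name}: {lines} lines -> {count:,} locations ({count // 2**20:,} M)")
```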
Data Bus
The data bus is responsible for getting data into and sending it outside the microprocessor. The size of the data bus decides how much data can be transferred through it at a time. The microprocessor has two types of data bus: an internal data bus and an external system data bus. The system data bus of the microprocessor communicates with the external devices and transfers data to the internal bus. The internal bus transfers data to and fro between the ALU, the registers, and the instruction decoder. A 16-bit microprocessor has an internal data bus width of 16 bits, a 32-bit microprocessor internal bus has a width of 32 bits, and a 64-bit microprocessor internal bus has a width of 64 bits.
Control Bus
This particular bus is the one that the bus interface unit uses to notify the memory of its intentions. For example, if the microprocessor wants to write to memory, a write line on the control bus is activated, letting the memory know that the microprocessor's intention is to write a value to main memory.
Execution Unit
To operate on data using instructions, a microprocessor contains an execution unit (EU). A square wave oscillator or clock circuit generates the timing signals with which the processor synchronizes all its activities. The clock also determines the speed at which instructions are fetched and executed. The more instructions executed per clock cycle, the faster the processor.
The binary instructions that a microprocessor understands as executable define its instruction set. You cannot feed just anything to the microprocessor and tell it to execute it. 80x86 microprocessors follow the CISC (complex instruction set computer) design approach and have large instruction sets. Later microprocessors in this family are compatible with earlier ones, because each newer instruction set overlays and extends the previous ones. This simply means that programs written for an 80386 will run comfortably on a Pentium-based computer, but may not run on an 80286-based one.
Registers
Within the execution unit are the registers we will use to program the x86 microprocessors.
Basic operation
The processor interface
The CPU (central processing unit) is the part of the computer system that contains the logic for fetching instructions, deciphering them, and executing them. Attached to the CPU is storage for code and data, also known as memory. Coprocessors and other units known as peripherals can also be attached to the CPU. (Although memory acts as a store for code and data, computer hardware terminology draws a clear distinction between storage and memory.) Data is transferred between these units via data paths. The number and configuration of units and data paths vary depending on who designs the system.
Pentium-class processors provide one data path for data transfers between the processor chip and all other system units. Off-chip units include the memory shared by data and executable code, and various device controllers. The processor chip will read data from memory or a device, and write data to memory or a device. The processor chip also provides an address to select which device register or memory location to write to or read from.
A Pentium processor (chip) writes data by placing an address on the set of signal lines known as the address bus, and the data on the set of signal lines known as the data bus. The processor reads data by placing the address on the address bus, and capturing the data appearing on the data bus. Timing signals control the data transfer.
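As a rough model of these transfers, the sketch below treats the address bus, data bus, and a single control signal as plain attributes. The class and attribute names are hypothetical, and real bus cycles involve timing signals that are not modeled here.

```python
class SystemBus:
    """Hypothetical model of the address, data, and control lines."""

    def __init__(self, size):
        self.memory = [0] * size   # stands in for off-chip memory
        self.address = 0           # value on the address bus
        self.data = 0              # value on the data bus
        self.control = "IDLE"      # simplified control signal

    def write(self, address, value):
        self.address = address     # processor places the address
        self.data = value          # processor places the data
        self.control = "WRITE"     # memory is told to store the data
        self.memory[self.address] = self.data
        self.control = "IDLE"

    def read(self, address):
        self.address = address     # processor places the address
        self.control = "READ"      # memory is told to supply the data
        self.data = self.memory[self.address]
        self.control = "IDLE"
        return self.data           # processor captures the data bus


bus = SystemBus(size=16)
bus.write(0x0A, 0x5C)
print(hex(bus.read(0x0A)))
```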
Memory management features
For computational purposes, the following memory management features are unnecessary. However, the use of these features explains why your program cannot easily alter or read the data of another program in multitasking systems such as Windows and Linux.
Protected mode and segmented memory (it ain't where you think it is)
Intel defined at least three operating modes for their 32-bit microprocessors: real, protected, and virtual-8086. We are primarily interested in protected mode because that is the mode our 32-bit programs in Windows and Linux operate under.
Under protected mode, we can define segments, blocks of contiguous memory that hold code and data. They are managed by segment descriptors. Two types of segments are defined: code and data. Segments are allowed to overlap.
Segment descriptors control read, write, and execute permissions. The following table shows all of the possible permission combinations. It shows that executable code must be in code segments, and writable data locations must be in data segments.
| Segment Type | Execute | Read | Write |
|---|---|---|---|
| code | Yes | - | - |
| code | Yes | Yes | - |
| data | - | Yes | - |
| data | - | Yes | Yes |
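The table can be read as a lookup from segment kind to the set of permitted accesses. The sketch below is a simplification with invented names: the boolean `extra` stands for whether the optional second permission in the table (read for code, write for data) is granted.

```python
# Each key is (segment type, extra permission granted?), matching the
# four rows of the table. The names are hypothetical.
SEGMENT_PERMISSIONS = {
    ("code", False): {"execute"},
    ("code", True): {"execute", "read"},
    ("data", False): {"read"},
    ("data", True): {"read", "write"},
}


def is_allowed(segment_type, extra, access):
    """Return True if the access kind is permitted for this segment."""
    return access in SEGMENT_PERMISSIONS[(segment_type, extra)]


print(is_allowed("code", True, "write"))    # code is never writable
print(is_allowed("data", True, "execute"))  # data is never executable
```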
We can designate whether each segment is 16-bit or 32-bit. If a code segment is 32-bit, by default, instructions in it use 32-bit addressing and 32-bit operands (when instructions need more than one byte). If a code or data segment is 32-bit, the segment can be as large as 4 GB (allowing full 32-bit addressing).
Segment descriptors also hold base addresses that will be added to the effective addresses to get linear addresses.
To access segments, you use a value called a selector. The selector contains an index into the descriptor table where segment descriptors are stored. When you load a segment register (CS, DS, ES, SS, FS, GS) with a selector, the indexed descriptor is loaded into a hidden register (effectively a cache) associated with the segment register.
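The path from a selector to a linear address can be sketched as follows. The bit layout of the selector (index in bits 15 to 3, table indicator in bit 2, requested privilege level in bits 1 to 0) is the IA-32 format. The descriptor table itself is a hypothetical stand-in that holds only base addresses, whereas real descriptors are packed 8-byte entries that also carry limits and permissions.

```python
def decode_selector(selector):
    index = selector >> 3            # bits 15..3: descriptor table index
    table = (selector >> 2) & 1      # bit 2: 0 = global table, 1 = local table
    rpl = selector & 0b11            # bits 1..0: requested privilege level
    return index, table, rpl


# Hypothetical descriptor table with made-up base addresses.
descriptor_table = [
    {"base": 0x00000000},   # entry 0 (the null descriptor on real hardware)
    {"base": 0x00000000},   # entry 1: a segment based at zero
    {"base": 0x00400000},   # entry 2: a segment with a nonzero base
]


def linear_address(selector, effective_address):
    index, _table, _rpl = decode_selector(selector)
    base = descriptor_table[index]["base"]
    return (base + effective_address) & 0xFFFFFFFF   # keep within 32 bits


print(hex(linear_address(0x10, 0x1234)))
```

On real hardware the table lookup happens only when the segment register is loaded; later accesses use the cached descriptor.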
As MS-DOS assembler programmers know, every memory access uses a segment register, whether you specify a register or not. Thus, every memory access, code or data, is tested for permissions, and every memory access is modified by a base address.
Windows does not take much advantage of segment registers. When your program runs, the segments associated with the four primary segment registers CS, DS, ES, and SS are set to the same base address. An effective address will be converted to the same linear address regardless of whether you are modifying it with CS (jumps), DS (most data accesses), ES (some string instructions), or SS (stack instructions). Except for execute and write privileges, the four segments are effectively the same single segment. This is the flat memory model.
Paging and virtual memory (it still ain't where you think it is)
In protected mode, the memory paging feature can be enabled. When discussing this feature, a "page" is no longer a 256 byte block of memory.
When paging is enabled, a set of page tables is used to change the address again. This is the last possible alteration of the address before it goes out onto the address bus. The most recent Pentiums can generate 36-bit physical addresses with this feature.
Whereas the segmentation feature gathers memory into segments of varying sizes, the paging feature breaks up memory into pages of fixed size (4096 bytes on a Win32 platform). Part of a linear address is treated as a page number, which is used to index into page tables to retrieve a page base address. The page base address is added to the rest of the linear address to create a physical address. The page base addresses allow the pages to be randomly distributed throughout physical (true, real, actual) memory, without crashing the code in them!
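A minimal sketch of this translation, assuming 4096-byte pages, is shown below. The single-level `page_table` dictionary and its contents are invented for illustration; real IA-32 paging walks a page directory and then a page table.

```python
PAGE_SIZE = 4096                 # bytes per page
OFFSET_BITS = 12                 # 2**12 == 4096

# Hypothetical table: page number -> physical page base address.
# The bases are deliberately out of order to show that pages can be
# scattered throughout physical memory.
page_table = {
    0x00000: 0x0007F000,
    0x00001: 0x00003000,
}


def physical_address(linear):
    page_number = linear >> OFFSET_BITS     # upper bits select the page
    offset = linear & (PAGE_SIZE - 1)       # lower 12 bits pass through
    return page_table[page_number] + offset


print(hex(physical_address(0x00001ABC)))
```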
Because software can update the page tables, we can make two programs occupy the "same memory" by making two sets of page tables. We use one set when executing one program, and we use the other set to execute the other program. This is why addresses in one program are normally invalid in another program -- the addresses map to different pages!
Each page table entry also has a "present" bit, indicating if the page is loaded with page data. This bit is maintained by software, which allows us to implement "page swapping", the heart of virtual memory.
When the processor attempts to access a "not present" page, it generates a page fault exception. The OS decides if the memory is allocated. If not, the OS signals a bad memory access. Otherwise, the OS finds a suitable place to reload the "swapped out" page. If there is no room, the OS chooses a page to "swap out" to the hard disk, and then replaces it with the desired page from the hard disk. Then the page is marked as "present".
Optimization note: The page table entry also has a dirty bit, which is set when a loaded page has been written to. A page that isn't dirty does not need to be written back to the hard disk when it is swapped out, because a copy of the page already exists there.
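The page fault sequence and the dirty-bit optimization can be simulated in a few lines. Everything here is a hypothetical sketch: the class names, the fixed frame count, and the oldest-first eviction policy are assumptions made for the example, not a description of how Windows or Linux choose a page to swap out.

```python
class Page:
    def __init__(self):
        self.present = False   # is the page loaded in physical memory?
        self.dirty = False     # has it been written since it was loaded?


class Pager:
    """Hypothetical OS-side pager with a fixed number of physical frames."""

    def __init__(self, allocated_pages, frame_count):
        self.pages = {n: Page() for n in allocated_pages}
        self.frame_count = frame_count
        self.loaded = []        # page numbers in load order (oldest first)
        self.disk_writes = 0    # pages written back to the hard disk

    def access(self, page_number, write=False):
        page = self.pages.get(page_number)
        if page is None or not page.present:
            self._handle_page_fault(page_number)   # raises if unallocated
            page = self.pages[page_number]
        if write:
            page.dirty = True

    def _handle_page_fault(self, page_number):
        if page_number not in self.pages:
            raise MemoryError(f"bad access to page {page_number}")
        if len(self.loaded) >= self.frame_count:
            victim = self.pages[self.loaded.pop(0)]   # evict the oldest page
            if victim.dirty:
                self.disk_writes += 1                 # only dirty pages are saved
            victim.present = False
            victim.dirty = False
        self.loaded.append(page_number)               # reload from the hard disk
        self.pages[page_number].present = True


pager = Pager(allocated_pages=[0, 1, 2], frame_count=2)
pager.access(0, write=True)   # fault: load page 0, then mark it dirty
pager.access(1)               # fault: load page 1
pager.access(2)               # fault: evict dirty page 0 (one disk write)
pager.access(0)               # fault: evict clean page 1 (no disk write)
print(pager.disk_writes)
```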
