Chapter 12 - Processor Structure and Function

CPU Structure
• Processor main functions:
Fetch instruction: The processor reads an instruction from memory
(register, cache, or main memory).
Interpret instruction: The instruction is decoded to determine what
action is required.
Fetch data: While an instruction is executed, data may need to be read
from memory or an I/O module.
Process data: An instruction may require some arithmetic or logical
operation to be performed on the data.
Write data: Data resulting from the execution of an instruction may
need to be stored in memory.
CPU With Systems Bus
CPU Internal Structure
• The ALU does the actual computation or processing of data.
• The control unit controls the movement of data and instructions
into and out of the processor and controls the operation of the ALU.
• The internal processor bus transfers data between the various
registers and the ALU, because the ALU operates only on data held in
internal processor memory.
Register Organizations
Computer systems employ a memory hierarchy. At higher levels of the
hierarchy, memory is faster, smaller, and more expensive per bit.
Within the processor there is a set of registers that function as a
level of memory above main memory and cache in the hierarchy.
Registers in the processor perform two roles:
• User-visible registers: Enable the machine- or assembly-language
programmer to minimize main memory references by optimizing the use
of registers.
• Control and status registers: Used by the control unit to control
the operation of the processor and by privileged operating system
programs to control the execution of programs.
User Visible Registers
• General purpose
• Data
• Address
• Condition codes
General Purpose Registers
• Can be assigned to a variety of functions by the programmer
• May be true general purpose
• May be restricted to a specific function
• May be used for data or addressing
• Addressing, e.g. segment pointers
Data Registers
The simplest type of register is the data register, which is used for
the temporary storage of data. In its simplest form, it consists of a
set of D flip-flops, all sharing a common clock. All of the bits of the
N-bit data word are connected to the data register by an N-line "data
bus". Data registers may be used only to hold data and cannot be used
in the calculation of an operand address.
Address Registers
Address registers may themselves be general purpose, or they may be
devoted to a particular addressing mode.
Examples of address registers:
• Segment pointers: In a machine with segmented addressing, a segment
register holds the address of the base of the segment.
• Index registers: Used for indexed addressing; may be autoindexed.
• Stack pointer: Points to the top of the stack. This allows implicit
addressing.
Condition Code Registers
Condition codes are at least partially visible to the user. They are
bits set by the processor hardware as a result of operations.
Condition code bits are collected into one or more registers.
• Sets of individual bits, e.g. result of last operation was zero
• Can be read (implicitly) by programs, e.g. jump if zero
• Cannot (usually) be set by programs
Design issues
1. Completely general-purpose registers vs. specialized use.
General-purpose registers increase flexibility and programmer
options, but they also increase instruction size and complexity.
With specialized registers, the register can be implied by the
opcode, so instructions are smaller and faster. However, there is
less flexibility.
2. Number of registers. Typically between 8 and 32. Fewer registers
mean more memory references. More registers may not significantly
reduce the number of memory references, and they take up processor
real estate.
3. Reduced Instruction Set Computers (RISC). An approach that
follows the "do less for best performance" idea (more registers),
vs. Complex Instruction Set Computers (CISC), which have long and
complex instructions that perform several actions (fewer
registers).
Design issues
Registers should be:
• Large enough to hold a full address (address registers).
• Large enough to hold a full word (data registers).
• Flexible enough that two data registers can be combined to hold a
double-length value.
C programming:
long long int a;
long int a;
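The point about combining two data registers can be illustrated with a minimal sketch. It assumes 32-bit registers and a (high, low) pair layout; the function names are hypothetical and not taken from any particular machine.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits


def split_into_pair(value):
    """Split a 64-bit value into (high, low) 32-bit register contents."""
    return (value >> 32) & MASK32, value & MASK32


def join_pair(high, low):
    """Reassemble a 64-bit value from a (high, low) register pair."""
    return ((high & MASK32) << 32) | (low & MASK32)


def add_pairs(a, b):
    """Add two register pairs, propagating the carry from low to high word."""
    low_sum = a[1] + b[1]
    carry = low_sum >> 32
    high_sum = (a[0] + b[0] + carry) & MASK32
    return high_sum, low_sum & MASK32
```

For example, adding the pairs (0, 0xFFFFFFFF) and (0, 1) carries out of the low register and yields (1, 0).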
Design Issues (CCR )
Advantages:
• Because condition codes are set by normal arithmetic and data
movement instructions, they should reduce the number of COMPARE and
TEST instructions needed.
• Conditional instructions, such as BRANCH, are simplified relative
to composite instructions, such as TEST AND BRANCH.
• Condition codes facilitate multiway branches. For example, a TEST
instruction can be followed by two branches, one on less than or
equal to zero and one on greater than zero.
Disadvantages:
• Condition codes add complexity, both to the hardware and to the
software. Condition code bits are often modified in different ways
by different instructions, making life more difficult for both the
microprogrammer and the compiler writer.
• Condition codes are irregular; they are typically not part of the
main data path, so they require extra hardware.
• Machines with condition codes often must add special
non-condition-code instructions for special situations, such as bit
checking, loop control, and atomic semaphore operations.
• In a pipelined implementation, condition codes require special
synchronization to avoid conflicts.
Control & Status Registers
There are a variety of processor registers that are employed to
control the operation of the processor. Some of them may be visible
to machine instructions executed in a control or operating system
mode.
Each machine will have a different register organization and use
different terminology.
Control & Status Registers
• Program counter (PC):
Contains the address of an instruction to be fetched.
• Instruction register (IR):
Contains the instruction most recently fetched.
• Memory address register (MAR):
Contains the address of a memory location.
• Memory buffer register (MBR):
Contains a word of data to be written to memory or
the word most recently read.
Program Status Word
Many processor designs include a register or a set of registers,
often known as the program status word (PSW), that contains status
information. The PSW typically holds the condition codes produced by
operations in the arithmetic logic unit, plus other status
information.
Program Status Word (2)
Some common fields or flags include the following:
• Zero: Set when the result is 0.
• Sign: Contains the sign bit of the result of the last arithmetic
operation.
• Carry: Set if an operation resulted in a carry or borrow. Used for
multiword arithmetic operations.
• Equal: Set if a logical compare result is equality.
• Overflow: Used to indicate arithmetic overflow.
• Interrupt enable/disable: Used to enable or disable interrupts.
• Supervisor: Indicates whether the processor is executing in
supervisor or user mode.
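How the Zero, Sign, Carry, and Overflow flags could be derived from an addition is sketched below. This is only an illustration under stated assumptions: 8-bit two's complement operands and a hypothetical helper name, not the flag logic of any specific processor.

```python
def add8_with_flags(a, b):
    """Add two 8-bit values and derive Zero, Sign, Carry, and Overflow."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "zero": result == 0,
        "sign": bool(result & 0x80),   # copy of the result's sign bit
        "carry": raw > 0xFF,           # carry out of bit 7
        # Signed overflow: both operands share a sign that the result lacks.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags
```

Adding 0x7F and 0x01 sets Sign and Overflow but not Carry, while adding 0xFF and 0x01 sets Zero and Carry but not Overflow.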
Supervisor Mode
• Protection ring zero
• Also known as kernel mode
• Allows privileged instructions to execute
• Used by the operating system
• Not available to user programs
Computer operating systems provide different levels of access to
resources. A protection ring is one of two or more hierarchical
levels or layers of privilege within the architecture of a computer
system.
Example Register Organizations
(Figure: example register organizations. Recoverable labels: MC68000,
9 address registers, a 32-bit PC and a 16-bit SR, 1-bit status and
control flags, compatibility with programs written on earlier
machines.)
Instruction Cycle
• In general, the instruction cycle has the following subcycles:
• Fetch: Read the next instruction from memory into the processor.
• Execute: Interpret the opcode and perform the indicated operation.
• Interrupt: If interrupts are enabled and an interrupt has
occurred, save the current process state and service the interrupt.
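The three subcycles can be sketched as a loop over a toy accumulator machine. Everything here is an assumption made for illustration: the opcodes (LOAD, ADD, IRET, HALT), the dictionary used as memory, and the single saved PC standing in for the saved process state.

```python
def run(memory, interrupt_at=None, handler_addr=100, max_steps=50):
    """Run a toy machine; memory maps addresses to (opcode, operand)."""
    pc, acc, saved_pc = 0, 0, None
    interrupts_enabled = True
    trace = []
    for step in range(max_steps):
        # Fetch: read the next instruction and advance the PC.
        opcode, operand = memory[pc]
        pc += 1
        # Execute: interpret the opcode and perform the operation.
        trace.append(opcode)
        if opcode == "LOAD":
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "IRET":
            pc, interrupts_enabled = saved_pc, True
        elif opcode == "HALT":
            break
        # Interrupt: save state and transfer control to the handler.
        if interrupts_enabled and interrupt_at == step:
            saved_pc, pc, interrupts_enabled = pc, handler_addr, False
    return acc, trace
```

With a three-instruction program at address 0 and a two-instruction handler at address 100, an interrupt raised after the first instruction runs the handler and then resumes the program where it left off.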
Indirect Cycle
• Can be thought of as an additional instruction subcycle
• May require memory access to fetch operands
• Indirect addressing requires more memory accesses
Instruction Cycle with Indirect
Instruction Cycle State Diagram
Data Flow (Instruction Fetch)
• Depends on CPU design
• In general, during the fetch cycle:
• PC contains address of next instruction
• Address moved to MAR
• Address placed on address bus
• Control unit requests memory read
• Result placed on data bus, copied to MBR, then to IR
• Meanwhile PC incremented by 1
Data Flow (Fetch Diagram)
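The fetch data flow can be written as a sequence of register transfers. The sketch below assumes a dictionary of registers and a dictionary as memory; the function name and the step strings are hypothetical.

```python
def fetch_cycle(regs, memory):
    """Perform one fetch cycle and return the register transfers made."""
    steps = []
    regs["MAR"] = regs["PC"]            # address moved to MAR
    steps.append("MAR <- PC")
    regs["MBR"] = memory[regs["MAR"]]   # memory read lands in MBR
    steps.append("MBR <- memory[MAR]")
    regs["PC"] += 1                     # PC advanced to the next instruction
    steps.append("PC <- PC + 1")
    regs["IR"] = regs["MBR"]            # instruction copied to IR
    steps.append("IR <- MBR")
    return steps
```

The PC increment is listed between the memory read and the copy to IR only for readability; in hardware it can overlap with the memory access.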
Data Flow (Data Fetch)
• After the fetch cycle is completed, IR is examined
• If indirect addressing is used, an indirect cycle is performed:
• Rightmost N bits of MBR transferred to MAR
• Control unit requests memory read
• Result (address of operand) moved to MBR
Data Flow (Indirect Diagram)
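A minimal sketch of the indirect cycle follows. It assumes that the address field occupies the rightmost 12 bits of the instruction word held in MBR; the width N = 12 and all names are illustrative assumptions.

```python
ADDRESS_BITS = 12  # assumed width N of the address field


def indirect_cycle(regs, memory):
    """Replace the address reference in MBR with the operand's address."""
    address_mask = (1 << ADDRESS_BITS) - 1
    regs["MAR"] = regs["MBR"] & address_mask  # rightmost N bits to MAR
    regs["MBR"] = memory[regs["MAR"]]         # memory read: operand address
    return regs["MBR"]
```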
Data Flow (Execute)
• May take many forms
• Depends on instruction being executed
• May include:
• Memory read/write
• Input/Output
• Register transfers
• ALU operations
Data Flow (Interrupt)
• Current PC saved to allow resumption after interrupt
• Contents of PC copied to MBR
• Special memory location (e.g. stack pointer) loaded to MAR
• MBR written to memory
• PC loaded with address of interrupt handling routine
• Next instruction (first of interrupt handler) can be fetched
Data Flow (Interrupt Diagram)
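The interrupt data flow can be sketched in the same register-transfer style. The sketch assumes that the save location comes from a stack pointer register SP and that the stack grows toward lower addresses; both are assumptions, as are the names.

```python
def interrupt_cycle(regs, memory, handler_address):
    """Save the current PC on the stack and jump to the interrupt handler."""
    regs["MBR"] = regs["PC"]           # contents of PC copied to MBR
    regs["SP"] -= 1                    # assumed: stack grows downward
    regs["MAR"] = regs["SP"]           # save location loaded to MAR
    memory[regs["MAR"]] = regs["MBR"]  # MBR written to memory
    regs["PC"] = handler_address       # next fetch starts the handler
```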
Instruction Pipelining description
• The idea
Similar to an assembly line.
New input is accepted before the previous input has finished.
Each stage of the pipeline performs its step on one instruction
without affecting the instructions in the other stages.
Works using a buffer at each instruction stage.
The CPU works on multiple instructions at the same time.
Two Stage Instruction Pipeline
Timing Diagram for six stage
Instruction Pipeline Operation
 Instruction Process
Fetch instruction (FI)
Decode instruction (DI)
Calculate operands (CO)
Fetch operands (FO)
Execute instruction (EI)
Write result (WR)
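The six-stage timing diagram can be reproduced with a short sketch. It assumes the ideal case: every stage takes one time unit, and there are no branches, stalls, or memory conflicts. The function name is hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WR"]


def timing_diagram(num_instructions):
    """Return one row per instruction; row[t] is the stage in time unit t + 1."""
    total_time = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_time          # empty string means idle
        for s, name in enumerate(STAGES):
            row[i + s] = name            # instruction i enters stage s at time i + s
        rows.append(row)
    return rows


if __name__ == "__main__":
    for number, row in enumerate(timing_diagram(9), start=1):
        print(f"I{number}: " + " ".join(f"{cell:2}" for cell in row))
```

Each instruction starts one time unit after the previous one, so the rows form the familiar staircase pattern.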
Performance of multi-stage pipeline
• Introduces a level of parallelism in instruction execution
• Increases efficiency of the CPU
• Increases overall speed: a six-stage pipeline reduces the time for
9 instructions from 54 time units to 14
• A two-stage pipeline does not double the speed, because:
Fetch is usually shorter than execution
Any jump or branch means that prefetched instructions are
not the required instructions
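The figures of 54 and 14 time units follow from a simple count. Without a pipeline, n instructions through k stages take n * k time units; with an ideal pipeline, the first instruction takes k units and each later one completes one unit after its predecessor, giving k + (n - 1). The sketch below assumes equal one-unit stages and no branches or stalls; the function names are hypothetical.

```python
def unpipelined_time(n, k):
    """Time units for n instructions when each passes through k stages alone."""
    return n * k


def pipelined_time(n, k):
    """Time units for n instructions in an ideal k-stage pipeline."""
    return k + (n - 1)


def speedup(n, k):
    """Ideal speedup factor of the pipeline over unpipelined execution."""
    return unpipelined_time(n, k) / pipelined_time(n, k)
```

For n = 9 and k = 6 this gives 54 and 14 time units, a speedup of about 3.86. As n grows, the ideal speedup approaches k.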
Speedup Factors with Instruction Pipelining
Setbacks
• Overhead of buffer-to-buffer transfers and preparation operations.
• The amount of logic required to handle additional stages and
memory increases enormously.
• Interrupt handling and branch instructions.
The Effect of a Conditional Branch on Instruction
Pipeline Operation
Branching/no branching comparison
Dealing With Branches
• Multiple Streams
• Prefetch Branch
• Loop buffer
• Branch prediction
• Delayed branching
Multiple Streams
• Have two pipelines
• Prefetch each branch into a separate pipeline
• Use appropriate pipeline
• Leads to bus & register contention
• Multiple branches lead to further pipelines being needed
Prefetch Branch Target
• Target of branch is prefetched in addition to
instructions following branch
• Keep target until branch is executed
• Used by IBM 360/91
Loop Buffer
• Very fast memory
• Maintained by fetch stage of pipeline
• Check buffer before fetching from memory
• Very good for small loops or jumps
• c.f. cache
• Used by CRAY-1
Branch Prediction (1)
• Predict never taken
Assume that jump will not happen
Always fetch next instruction
68020 & VAX 11/780
VAX will not prefetch after branch if a page fault would result
(O/S v CPU design)
• Predict always taken
Assume that jump will happen
Always fetch target instruction
Branch Prediction State Diagram
Branch Prediction (2)
• Predict by opcode
• Some instructions are more likely to result in a jump than others
• Can get up to 75% success
• Taken/not taken switch
• Based on previous history
• Good for loops
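A history-based taken/not taken switch can be sketched with a two-bit saturating counter. This is one common scheme, chosen here for illustration; it is not necessarily identical to the state diagram in the slides, and the class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 taken."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        """Move one step toward the observed outcome, saturating at 0 and 3."""
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes, predictor=None):
    """Fraction of branch outcomes predicted correctly, updating as we go."""
    predictor = predictor or TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)
```

For a loop branch that is taken nine times and then falls through, the predictor mispredicts only the exit once it has warmed up, because a single not-taken outcome does not flip the prediction. This is why the scheme suits loops.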
Branch Prediction (3)
• Delayed branch
• Do not take jump until you have to
• Rearrange instructions
Foreground Reading
• Processor examples
• Stallings Chapter 12
• Manufacturer web sites & specs
Review Questions
• What general roles are performed by CPU registers?
User-visible registers and control and status registers.
• What categories of data are commonly supported by user-visible registers?
Address, condition codes, and general data.
• What is the function of condition codes?
They are used in conditional branch operations to determine which branch to take.
• What is a program status word?
A register or set of registers that contains condition codes and other status information.
• Why is a two-stage instruction pipeline unlikely to cut the instruction cycle time in
half, compared with the use of no pipeline?
Because the execution stage is usually longer than the fetch stage, the fetch stage must wait until its
buffer is emptied. In addition, a jump or branch means that the prefetched instruction is not the required one.
• List and briefly explain various ways in which an instruction pipeline can deal with
conditional branch instructions.
Multiple streams, prefetch branch target, loop buffer, branch prediction, delayed branch.
• How are history bits used for branch prediction?
History bits record information about a branch instruction's past behavior, such as whether the branch
was taken the last time it was executed. This information is used to predict whether the branch will be
taken again.
