Reduced Instruction Set Computers
Major Advances in Computers
The family concept
IBM System/360 1964
DEC PDP-8
Separates architecture from implementation
Microprogrammed control unit
Idea by Wilkes 1951
Implemented in the IBM S/360, 1964
Cache memory
IBM S/360 model 85 1969
Solid State RAM
See memory notes
Microprocessors
Intel 4004 1971
Pipelining
Introduces parallelism into the fetch-execute cycle
Multiple processors
Reduced instruction set computer (RISC) architecture
Reduced Instruction Set Computer
A large number of general-purpose registers, and/or the use of compiler technology to optimize register usage
A limited and simple instruction set
An emphasis on optimizing the instruction pipeline
Instruction Execution Characteristics
Operations performed
These determine the functions to be performed by the processor and its interaction with memory.
Operands used
The types of operands and the frequency of their use determine the memory organization for storing them and the addressing modes for accessing them.
Execution sequencing
This determines the control and pipeline organization.
Operations
Assignments: movement of data
Conditional statements (IF, LOOP): sequence control
Procedure call-return is the most time-consuming
Some HLL statements lead to many machine-code operations
Operands
Mainly local scalar variables
Optimisation should concentrate on accessing local variables
Implications
Best support is given by optimizing the most used and most time-consuming features
Large number of registers
Careful design of pipelines
Simplified (reduced) instruction set
Procedure Calls
Very time consuming
Most programs do not do long runs of calls followed by long runs of returns, so the depth of call nesting stays within a narrow range
Most variables are local
Large Register File
Registers are the fastest memory elements: closest to, or even part of, the CPU
Software solution
Require compiler to allocate registers
Allocate registers to the most heavily used variables at any given time
Requires sophisticated program analysis
Hardware solution
Have more registers
Thus more variables will be in registers
Register Windows
Function calls typically use only a few parameters
The depth of call nesting varies over a limited range
Use multiple small sets (windows) of registers
The active window changes on each call and return
Calls switch to a different set of registers
Returns switch back to the previously used set of registers (sketched in C below)
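A minimal C sketch of the idea, assuming a circular file of 8 windows in which each window's "out" registers are physically the same cells as the next window's "in" registers; the window size, names, and the absence of overflow/underflow traps are simplifying assumptions, not any particular ISA:

    #include <stdio.h>

    #define NWINDOWS 8                 /* physical windows arranged in a circle */

    static int regfile[NWINDOWS * 8];  /* backing store; adjacent windows overlap by 8 registers */
    static int cwp = 0;                /* current window pointer */

    /* Register r of the current window: 0-7 are the "in" registers, 8-15 the "out"
       registers.  The modular index makes this window's "out" registers the same
       physical cells as the next window's "in" registers. */
    static int *reg(int r) { return &regfile[(cwp * 8 + r) % (NWINDOWS * 8)]; }

    static void call(void) { cwp = (cwp + 1) % NWINDOWS; }            /* advance to a fresh window */
    static void ret(void)  { cwp = (cwp + NWINDOWS - 1) % NWINDOWS; } /* restore the caller's window */

    int main(void) {
        *reg(8) = 42;                  /* caller puts a parameter in an "out" register */
        call();
        printf("%d\n", *reg(0));       /* callee reads it as an "in" register: prints 42 */
        ret();
        return 0;
    }

Parameters are passed without any memory traffic; a real design adds a trap that spills the oldest window to memory when the circular file overflows.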
Global Variables
Allocated by the compiler to memory so that every function can access them
Inefficient for frequently accessed variables
Have a set of registers for global variables
Compiler Based Register Optimization
Assume small number of registers (16-32)
Optimizing register use is up to the compiler
HLL programs have no explicit references to registers
For each program, list the variables that are candidates to reside in registers
Assign each candidate to a symbolic / virtual register
Map the (unlimited) symbolic registers to real registers
Symbolic registers whose usage does not overlap can share a real register
If you run out of real registers, some variables use memory
Optimization task ➔ determine which variables can be assigned to real registers (sketched below)
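A toy C sketch of the final mapping step, assuming the live range of each symbolic register is already known and only two real registers are allocatable; the ranges, counts, and greedy first-fit strategy are illustrative, not a production allocator:

    #include <stdio.h>

    #define NREAL 2   /* allocatable real registers */
    #define NSYM  5   /* symbolic registers produced by the compiler */

    typedef struct { int start, end; } Range;   /* live range, in instruction indices */

    static int overlap(Range a, Range b) { return a.start <= b.end && b.start <= a.end; }

    int main(void) {
        Range live[NSYM] = { {0, 3}, {1, 5}, {4, 8}, {6, 9}, {2, 7} };
        int real[NSYM];                          /* assigned real register, -1 = spilled */

        for (int s = 0; s < NSYM; s++) {
            int used[NREAL] = {0};
            for (int t = 0; t < s; t++)          /* real registers held by overlapping ranges */
                if (real[t] >= 0 && overlap(live[s], live[t]))
                    used[real[t]] = 1;
            real[s] = -1;
            for (int r = 0; r < NREAL; r++)      /* first free real register, if any */
                if (!used[r]) { real[s] = r; break; }
            if (real[s] >= 0) printf("sym%d -> R%d\n", s, real[s]);
            else              printf("sym%d -> memory (spilled)\n", s);
        }
        return 0;
    }

Production compilers usually phrase this as graph colouring of an interference graph; the greedy scan above only illustrates the share-or-spill decision.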
Reduced Instruction Set Architecture
Why CISC?
Compiler simplification
Smaller programs
Faster programs
It is far from clear that CISC is the appropriate solution
But we also cannot say for certain that RISC is far better than CISC
RISC Characteristics
One instruction per cycle
Register to register operations
Few, simple addressing modes
Few, simple instruction formats
Hardwired design (no microcode)
Fixed instruction format
More compile time/effort
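"Register to register operations" means memory is touched only through explicit loads and stores. A hedged C-level illustration; the pseudo-assembly in the comments is illustrative, not any specific ISA:

    int add_mem(const int *b, const int *c) {
        int a = *b + *c;   /* LOAD r1,(b); LOAD r2,(c); ADD r3,r1,r2 --
                              the ALU operates only on registers           */
        return a;          /* the result stays in a register; writing it to
                              memory would need an explicit STORE          */
    }

A CISC machine might instead provide a single instruction that adds two memory operands directly, which is exactly the complexity the characteristics above avoid.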
RISC v CISC
Not clear cut
e.g. PowerPC and Pentium II
Many designs borrow from both philosophies
RISC Pipelining
Most instructions are register to register
Two phases of execution
I: Instruction fetch
E: Execute
For load and store
I: Instruction fetch
E: Execute
D: Memory
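A worked timing sketch, assuming one cycle per phase and no resource conflicts between these three instructions (register names are illustrative):

    cycle:               1    2    3    4
    ADD  r1, r2, r3      I    E
    LOAD r4, (x)              I    E    D
    ADD  r5, r6, r7                I    E

Register-to-register instructions occupy two phases, while a load needs the extra D phase for its memory access; the delayed-load optimization below exists to hide that extra phase.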
Optimization of Pipelining
Delayed Load
The register that is the target of the load is locked by the processor
Execution of the instruction stream continues until that register is required
The processor then idles until the load completes
Re-arranging instructions can allow useful work whilst loading
Again, compiler dependent
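A C-level sketch of such a re-arrangement; the pseudo-assembly in the comments is illustrative only, not a specific ISA:

    int f(const int *p, int x, int y) {
        int a = *p;        /* LOAD r1,(p)   -- r1 is locked; it is not ready on the next cycle */
        int b = x + y;     /* ADD  r2,x,y   -- independent work fills the delay, no idle cycle */
        return a + b;      /* ADD  r3,r1,r2 -- by now the load has completed                   */
    }

If the two independent statements were in the opposite order, the final addition would follow the load immediately and stall until the value arrived; the compiler (or hardware scheduler) simply swaps them.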
Loop Unrolling
Iterate loop fewer times
Reduces loop overhead
Increases instruction parallelism
Improved register, data cache or TLB locality
Replicate body of loop a number of times
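A small C sketch of 4-way unrolling (the function name is hypothetical, and n is assumed to be a multiple of 4 for brevity; a real compiler emits a clean-up loop for the remainder):

    void add_arrays(float *dst, const float *a, const float *b, int n) {
        for (int i = 0; i < n; i += 4) {       /* loop test and branch run n/4 times */
            dst[i]     = a[i]     + b[i];
            dst[i + 1] = a[i + 1] + b[i + 1];  /* the four independent adds expose   */
            dst[i + 2] = a[i + 2] + b[i + 2];  /* instruction-level parallelism and  */
            dst[i + 3] = a[i + 3] + b[i + 3];  /* keep accesses to a, b, dst local   */
        }
    }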