Architecture
- general purpose computer based on von Neumann architecture
- functional units:
- processor = CPU
- memory
- I/O controllers and devices
- interconnected by buses: address bus, data bus
- typical CPU contains:
- ALU: function unit to perform arithmetic/logical operations
- control unit: manages instruction cycle
- instruction fetch (IF): retrieve instruction at PC into IR
- decode (ID): determine which operations are required of the
ALU, registers, and memory
- get operands (OP): load data values from registers or memory
- execute (EX): perform operation
- write back (WB): store the result to a register or memory, if required
- general purpose registers
- return value from procedure
- procedure parameters (several registers)
- frame pointer (used for addressing items in stack)
- stack pointer (used for additional procedure parameters)
- return address from procedure
- arithmetic registers (used by ALU)
- control registers
- program counter (PC)
- instruction register (IR)
- program status word (PSW = Z, C, ER, OVFL, SUP)
- base (added to all addresses in user mode)
- bound (address limit in user mode before addition of base)
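The instruction cycle listed above (IF, ID, OP, EX, WB) can be sketched as a toy interpreter. Everything specific here is hypothetical and chosen only for illustration: the 16-bit instruction encoding, the opcodes OP_HALT, OP_LOADI and OP_ADD, the four-register file, and the Cpu struct do not correspond to any real processor.

```c
#include <stdint.h>

/* Hypothetical encoding (an assumption, not a real ISA):
 *   bits 15-12 opcode, bits 11-10 destination register,
 *   bits 9-8 source register, bits 7-0 immediate value */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

typedef struct {
    uint16_t pc;      /* program counter */
    uint16_t ir;      /* instruction register */
    uint8_t  z;       /* Z (zero) flag of the PSW */
    uint16_t reg[4];  /* general purpose registers */
} Cpu;

/* Run until OP_HALT. Returns the number of instructions executed,
 * or -1 if an unknown opcode is decoded. */
int run(Cpu *cpu, const uint16_t *mem)
{
    int count = 0;
    for (;;) {
        /* IF: retrieve the instruction at PC into IR, then advance PC */
        cpu->ir = mem[cpu->pc++];

        /* ID: split IR into its fields */
        unsigned op  = (cpu->ir >> 12) & 0xFu;
        unsigned rd  = (cpu->ir >> 10) & 0x3u;
        unsigned rs  = (cpu->ir >> 8) & 0x3u;
        unsigned imm = cpu->ir & 0xFFu;
        count++;
        if (op == OP_HALT)
            return count;

        /* OP: load operand values from registers or the instruction */
        uint16_t a = cpu->reg[rd];
        uint16_t b = (op == OP_LOADI) ? (uint16_t)imm : cpu->reg[rs];

        /* EX: perform the operation */
        uint16_t result;
        switch (op) {
        case OP_LOADI: result = b; break;
        case OP_ADD:   result = (uint16_t)(a + b); break;
        default:       return -1;
        }

        /* WB: store the result and update the Z flag */
        cpu->reg[rd] = result;
        cpu->z = (result == 0);
    }
}
```

In this sketch each pass through the loop is one instruction cycle, and the five stages run strictly one after another.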
Operand addressing modes:
- implicit SUB B
- register MOV A,B
- immediate ADD 32h
- register indirect ADD (HL)
- variety of addressing modes
- increases programming flexibility
- increases complexity of CPU control logic
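A minimal sketch of how the addressing modes above differ when an operand is resolved. The Machine struct, the Mode enum, and the 256-byte memory are assumptions made for illustration. The accumulator A plays the role of the implicit operand, as in SUB B.

```c
#include <stdint.h>

typedef enum { MODE_REGISTER, MODE_IMMEDIATE, MODE_REG_INDIRECT } Mode;

typedef struct {
    uint8_t  a;         /* accumulator: the implicit operand and destination */
    uint8_t  b;         /* a general purpose register */
    uint16_t hl;        /* register pair used as a pointer */
    uint8_t  mem[256];  /* toy memory, so only the low byte of HL is used */
} Machine;

/* Resolve the source operand according to the addressing mode.
 * 'field' is the operand field of the instruction (used by immediate mode). */
uint8_t fetch_operand(const Machine *m, Mode mode, uint8_t field)
{
    switch (mode) {
    case MODE_REGISTER:     return m->b;                  /* value is in register B */
    case MODE_IMMEDIATE:    return field;                 /* value is in the instruction */
    case MODE_REG_INDIRECT: return m->mem[m->hl & 0xFF];  /* HL holds the address */
    }
    return 0;
}

/* ADD: the destination (accumulator A) is implicit. */
void add(Machine *m, Mode mode, uint8_t field)
{
    m->a = (uint8_t)(m->a + fetch_operand(m, mode, field));
}
```

Each extra mode adds a case to fetch_operand, which mirrors the trade-off noted above: more flexibility for the programmer, more control logic in the CPU.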
RISC vs. CISC
- example: DEC VAX 11/780 (CISC) = 256 instructions
DEC/MIPS 5000 (RISC) = 64 instructions
- observation: specialized instructions of CISC rarely used
- motivation for RISC:
- more room for on-CPU registers
- more room for on-CPU cache
- exploit pipelining
- drawbacks:
- larger programs (consisting of more instructions)
- hence, more memory required
- goal: instruction rate ~= clock rate
Overlapping fetch with execution: Pipelining
- basic principle: subdivide work of executing an instruction
into K steps, each requiring time T
- execute multiple instructions concurrently by allotting
each of the K steps to a different instruction in each time slice
- example: K=5 stage pipeline
| IF | ID | OP | EX | WB |
- time to execute 1 instruction = KT
- time to execute N instructions serially = NKT
- time to execute N instructions in pipeline (assuming full pipeline)
= KT + (N-1)T ~= NT for N>>K
- scheme works fine provided instructions are sequential
- difficulty arises with branches, particularly conditional branches
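The timing formulas above can be checked with a few lines of C. The function names are made up for this sketch. It assumes a full pipeline with no branches or stalls, and n >= 1.

```c
/* n = number of instructions, k = number of pipeline stages,
 * t = time per stage (any consistent unit). */

/* Serial execution: each instruction takes all k stages in turn. */
long serial_time(long n, long k, long t)
{
    return n * k * t;
}

/* Pipelined execution: the first instruction takes k*t, and each
 * later instruction completes one stage time after its predecessor. */
long pipelined_time(long n, long k, long t)
{
    if (n <= 0)
        return 0;
    return k * t + (n - 1) * t;
}

/* Speedup of pipelined over serial execution; t cancels out. */
double speedup(long n, long k)
{
    return (double)(n * k) / (double)(k + n - 1);
}
```

For K=5, T=1 and N=1000 this gives 5000 time units serially and 1004 pipelined, a speedup of about 4.98, which approaches K as N grows.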
Processor modes
- user mode: processor can only execute subset of instructions
- supervisor mode: processor can execute any instruction
- determined by supervisor (SUP) bit of PSW
- user programs can't be allowed to toggle this bit (e.g. a program could then disable interrupts indefinitely)
- instead, bit is toggled by traps/interrupts that invoke OS routines
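A rough sketch of how the mode bit could gate privileged instructions, and how the base and bound control registers apply only in user mode. The bit position PSW_SUP, the Cpu struct, and the function names are hypothetical. The sketch also assumes that supervisor-mode addresses bypass relocation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSW_SUP 0x1u  /* hypothetical position of the supervisor bit */

typedef struct {
    uint32_t psw;    /* program status word */
    uint32_t base;   /* added to all addresses in user mode */
    uint32_t bound;  /* address limit, checked before base is added */
} Cpu;

/* A privileged instruction may execute only in supervisor mode. */
bool may_execute(const Cpu *cpu, bool privileged)
{
    return !privileged || (cpu->psw & PSW_SUP) != 0;
}

/* Translate an address. Returns false on a bound violation. */
bool translate(const Cpu *cpu, uint32_t addr, uint32_t *phys)
{
    if (cpu->psw & PSW_SUP) {   /* supervisor mode: no relocation (assumption) */
        *phys = addr;
        return true;
    }
    if (addr >= cpu->bound)     /* user mode: check limit first */
        return false;
    *phys = addr + cpu->base;   /* then relocate */
    return true;
}

/* Only a trap or interrupt sets the bit; the OS routine clears it on return. */
void trap_enter(Cpu *cpu)  { cpu->psw |= PSW_SUP; }
void trap_return(Cpu *cpu) { cpu->psw &= ~PSW_SUP; }
```

Because user code has no instruction that writes the bit directly, the only way into supervisor mode in this sketch is through trap_enter, which always lands in an OS routine.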
Memory organization
- main bottleneck in program execution is memory access time
- use high speed cache to store program text/data in high demand
- need intelligent policies for keeping cache up to date
- coordinate access to memory by different programs during execution
(memory usage for system/function calls)
- memory interface coordinated by registers:
- MAR: memory address register
- MDR: memory data register
- CMD: read/write
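The register-based memory interface above might look like this in a simulator. This is a sketch only: the Memory struct, the 256-cell size, and the function names are assumptions.

```c
#include <stdint.h>

typedef enum { CMD_READ, CMD_WRITE } MemCmd;

typedef struct {
    uint8_t mar;         /* memory address register */
    uint8_t mdr;         /* memory data register */
    MemCmd  cmd;         /* requested operation: read or write */
    uint8_t cells[256];  /* toy memory, addressable by an 8-bit MAR */
} Memory;

/* One memory cycle: carry out the command held in CMD. */
void mem_cycle(Memory *m)
{
    if (m->cmd == CMD_READ)
        m->mdr = m->cells[m->mar];   /* memory -> MDR */
    else
        m->cells[m->mar] = m->mdr;   /* MDR -> memory */
}

/* CPU side of a load: set MAR and CMD, run a cycle, read MDR. */
uint8_t cpu_load(Memory *m, uint8_t addr)
{
    m->mar = addr;
    m->cmd = CMD_READ;
    mem_cycle(m);
    return m->mdr;
}

/* CPU side of a store: set MAR, MDR and CMD, then run a cycle. */
void cpu_store(Memory *m, uint8_t addr, uint8_t value)
{
    m->mar = addr;
    m->mdr = value;
    m->cmd = CMD_WRITE;
    mem_cycle(m);
}
```

The CPU never touches cells directly. Every access goes through MAR, MDR and CMD, which is the point of the interface.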
Layout of executing program in memory
/* in main.c */
void foo(int a);   /* defined in foo.c */

int main (int argc, char *argv[]) {
  int a = 1;
  foo(a);
  return 0;
}

/* in foo.c */
#include <stdio.h>
void foo(int a) {
  printf ("%d\n", a);
}

----------- break
   DATA  |
--------v--

--------^--
  STACK  |
-----------
   TEXT
----------- 0
Interrupts
- don't want program in CPU to keep checking busy/done flags
- this is known as "busy-wait" -- wastes processor cycles
- instead, use device interrupts to signal that operation is completed
- hardware includes interrupt request vector (equiv to 'done' flags)
- control unit checks IRQ during instruction cycle
- if set, processor clears IRQ and begins executing interrupt handler
- but what happens to program?
- need to save values of program registers and PC
- add the following control registers to our CPU:
- interrupt support registers
- IRQ (interrupt request; which interrupt?)
- IPC, IPSW (stores value of PC, PSW before interrupt)
- IVEC (address where interrupt vector table is stored)
- potential problems:
- two or more interrupts occur in same cycle
- an interrupt occurs for B while interrupt handler A is running
- solution: disable interrupts while interrupt handler is running
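The interrupt sequence above can be sketched as the check the control unit makes once per instruction cycle. Several details are assumptions made for this sketch: the vector table is modelled directly as an array rather than through an IVEC address, pending requests stay latched while interrupts are disabled, and simultaneous requests are resolved by fixed priority (lowest line number first). The notes only name these problems, so other policies are possible.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_IRQ 8  /* hypothetical number of interrupt lines */

typedef struct {
    uint16_t pc, psw;         /* state of the running program */
    uint16_t ipc, ipsw;       /* PC and PSW saved at interrupt time */
    uint8_t  irq;             /* pending requests, one bit per line */
    bool     int_enabled;     /* cleared while a handler is running */
    uint16_t ivec[NUM_IRQ];   /* vector table: handler address per line */
} Cpu;

/* A device signals completion by setting its request bit. */
void raise_irq(Cpu *c, int line)
{
    c->irq |= (uint8_t)(1u << line);
}

/* Called once per instruction cycle.
 * Returns the line being serviced, or -1 if none. */
int check_interrupts(Cpu *c)
{
    if (!c->int_enabled || c->irq == 0)
        return -1;
    for (int line = 0; line < NUM_IRQ; line++) {
        if (c->irq & (1u << line)) {
            c->irq &= (uint8_t)~(1u << line);  /* clear the request */
            c->ipc  = c->pc;                   /* save program state */
            c->ipsw = c->psw;
            c->int_enabled = false;            /* no nesting */
            c->pc = c->ivec[line];             /* jump to the handler */
            return line;
        }
    }
    return -1;
}

/* Executed at the end of a handler: restore state, re-enable interrupts. */
void return_from_interrupt(Cpu *c)
{
    c->pc  = c->ipc;
    c->psw = c->ipsw;
    c->int_enabled = true;
}
```

With a single IPC/IPSW pair, a nested interrupt would overwrite the saved state, which is one reason interrupts stay disabled until return_from_interrupt runs.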