Architecture


- general purpose computer based on von Neumann architecture
- functional units: 
	- processor = CPU 
	- memory
	- I/O controllers and devices
	- interconnected by buses: address bus, data bus

- typical CPU contains:
	- ALU: function unit to perform arithmetic/logical operations
	- control unit: manages instruction cycle
		- instruction fetch (IF): retrieve instruction at PC into IR
		- decode (ID): determine which operations are required of the
			ALU, registers, memory
		- get operands (OP): load data values from registers or memory
		- execute (EX): perform operation
		- write back (WB): store result to register or memory, if required
	- general purpose registers
		- return value from procedure
		- procedure parameters (several registers)
		- frame pointer (used for addressing items in stack)
		- stack pointer (used for additional procedure parameters)
		- return address from procedure
		- arithmetic registers (used by ALU)
	- control registers
		- program counter (PC)
		- instruction register (IR)
		- program status word (PSW = Z, C, ER, OVFL, SUP)
		- base (added to all addresses in user mode)
		- bound (address limit in user mode before addition of base)
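
The instruction cycle above can be sketched as a loop over a toy accumulator machine. This is a minimal illustration, not a real instruction set: the 16-bit encoding (high byte = opcode, low byte = memory address), the four opcodes, and the names `struct cpu` and `run` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy encoding: high byte = opcode, low byte = memory address. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

#define MEM_WORDS 256

struct cpu {
	uint16_t pc;	/* program counter */
	uint16_t ir;	/* instruction register */
	uint16_t acc;	/* single arithmetic register used by the ALU */
	uint16_t z;	/* zero flag, standing in for the PSW */
};

/* Run until HALT. Returns the number of instructions executed,
   or -1 if an unknown opcode is fetched. */
int run(struct cpu *cpu, uint16_t mem[MEM_WORDS])
{
	int count = 0;

	for (;;) {
		/* IF: fetch the instruction at PC into IR, then advance PC */
		assert(cpu->pc < MEM_WORDS);
		cpu->ir = mem[cpu->pc++];
		count++;

		/* ID: split the instruction into opcode and address fields */
		uint8_t op = (uint8_t)(cpu->ir >> 8);
		uint8_t addr = (uint8_t)(cpu->ir & 0xFF);

		/* OP: read the memory operand if the instruction needs one */
		uint16_t operand = 0;
		if (op == OP_LOAD || op == OP_ADD)
			operand = mem[addr];

		/* EX: perform the operation */
		uint16_t result = cpu->acc;
		switch (op) {
		case OP_HALT:  return count;
		case OP_LOAD:  result = operand; break;
		case OP_ADD:   result = (uint16_t)(cpu->acc + operand); break;
		case OP_STORE: break;		/* result is the accumulator */
		default:       return -1;
		}

		/* WB: store the result to memory or to the register */
		if (op == OP_STORE) {
			mem[addr] = result;
		} else {
			cpu->acc = result;
			cpu->z = (result == 0);
		}
	}
}
```

A four-instruction program (LOAD 16, ADD 17, STORE 18, HALT) placed at address 0 leaves the sum of words 16 and 17 in word 18.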
	
Operand addressing modes: 
	- implicit		SUB B
	- register		MOV A,B
	- immediate		ADD 32h
	- register indirect	ADD (HL)

- variety of addressing modes 
	- increases programming flexibility
	- increases complexity of CPU control logic
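
The four addressing modes listed above differ only in where the operand comes from, which a single operand-fetch routine can show. The sketch below assumes a hypothetical 8-register, 256-byte machine in which `reg[0]` plays the role of the accumulator; `fetch_operand` and the `MODE_*` names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum addr_mode {
	MODE_IMPLICIT,		/* operand is fixed by the opcode itself */
	MODE_REGISTER,		/* operand is the content of a register */
	MODE_IMMEDIATE,		/* operand is a constant inside the instruction */
	MODE_REG_INDIRECT	/* register holds the memory address of the operand */
};

#define NUM_REGS  8
#define MEM_BYTES 256

struct machine {
	uint8_t reg[NUM_REGS];	/* reg[0] is assumed to be the accumulator */
	uint8_t mem[MEM_BYTES];
};

/* Return the operand selected by the mode; 'field' is the register
   number or the immediate value carried in the instruction. */
uint8_t fetch_operand(const struct machine *m, enum addr_mode mode, uint8_t field)
{
	switch (mode) {
	case MODE_IMPLICIT:
		return m->reg[0];
	case MODE_REGISTER:
		assert(field < NUM_REGS);
		return m->reg[field];
	case MODE_IMMEDIATE:
		return field;
	case MODE_REG_INDIRECT:
		assert(field < NUM_REGS);
		return m->mem[m->reg[field]];	/* one extra memory access */
	}
	return 0;
}
```

Each extra case in the switch corresponds to extra decode and sequencing logic in a real control unit, which is the complexity cost noted above.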

RISC vs. CISC
	- example: DEC VAX 11/780 (CISC) = 256 instructions
		   DEC/MIPS 5000  (RISC) = 64 instructions
	- observation: specialized instructions of CISC rarely used
	- motivation for RISC:
		- more room for on-CPU registers
		- more room for on-CPU cache
		- exploit pipelining 
	- drawbacks:
		- larger programs (consisting of more instructions)
		- hence, more memory required
	- goal: instruction rate ~= clock rate

Overlapping fetch with execution: Pipelining
	- basic principle: subdivide work of executing an instruction
	  into K steps, each requiring time T
	- execute multiple instructions concurrently by allotting
	  each of the K steps to a different instruction at each time slice
	- example: K=5 stage pipeline

	| IF | ID | OP | EX | WB |

	- time to execute 1 instruction = KT
	- time to execute N instructions serially = NKT
	- time to execute N instructions in pipeline (assuming full pipeline) 	
		= KT + (N-1)T ~= NT for N>>K
	- scheme works fine provided instructions are sequential
	- difficulty arises with branches, particularly conditional branches
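
The timing formulas above can be checked numerically. The helper names below are hypothetical, and the pipelined figure assumes no stalls (no branches or other hazards), as in the formulas.

```c
#include <assert.h>

/* Time to run n instructions one after another: N*K*T. */
long serial_time(long n, long k, long t)
{
	return n * k * t;
}

/* Time with a K-stage pipeline and no stalls: K*T + (N-1)*T.
   The first instruction takes K*T; each later one completes T after
   the one before it. */
long pipelined_time(long n, long k, long t)
{
	assert(n >= 1);
	return k * t + (n - 1) * t;
}

/* Ratio of serial to pipelined time; approaches K as N grows. */
double speedup(long n, long k, long t)
{
	return (double)serial_time(n, k, t) / (double)pipelined_time(n, k, t);
}
```

With K = 5, T = 1 and N = 1000, the serial time is 5000 and the pipelined time is 1004, a speedup of about 4.98, close to the limit of K.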

Processor modes
	- user mode: processor can only execute subset of instructions
	- supervisor mode: processor can execute any instruction
	- determined by supervisor (SUP) bit of PSW
	- user programs can't be allowed to toggle the bit themselves
	  (e.g. a program could disable interrupts indefinitely)
	- instead, bit is toggled by traps/interrupts that invoke OS routines
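
A rough sketch of how the supervisor bit gates what a program may do, tied to the base and bound control registers listed earlier. The bit position `PSW_SUP`, the trap causes, and all function names are assumptions for illustration; real hardware does this in the control unit, not in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSW_SUP 0x01u	/* hypothetical position of the supervisor bit */

struct cpu {
	uint32_t psw;
	uint32_t base;	/* added to every user-mode address */
	uint32_t bound;	/* limit on user-mode addresses, checked before adding base */
};

enum trap_cause { TRAP_NONE, TRAP_PRIVILEGED, TRAP_BOUNDS };

static bool in_supervisor(const struct cpu *c)
{
	return (c->psw & PSW_SUP) != 0;
}

/* Taking a trap sets SUP; control then passes to an OS routine. */
void trap_enter(struct cpu *c)  { c->psw |= PSW_SUP; }

/* The OS routine returns to the user program by clearing SUP. */
void trap_return(struct cpu *c) { c->psw &= ~PSW_SUP; }

/* A privileged instruction (e.g. disabling interrupts) executes only in
   supervisor mode; in user mode it traps instead. */
enum trap_cause exec_privileged(struct cpu *c)
{
	if (!in_supervisor(c)) {
		trap_enter(c);
		return TRAP_PRIVILEGED;
	}
	return TRAP_NONE;
}

/* User-mode addresses are checked against bound, then offset by base.
   Supervisor-mode addresses are used unchanged. */
enum trap_cause translate(struct cpu *c, uint32_t addr, uint32_t *phys)
{
	if (in_supervisor(c)) {
		*phys = addr;
		return TRAP_NONE;
	}
	if (addr >= c->bound) {
		trap_enter(c);
		return TRAP_BOUNDS;
	}
	*phys = c->base + addr;
	return TRAP_NONE;
}
```

Note that no function here lets user-mode code set `PSW_SUP` directly; the only way in is through `trap_enter`, which models the hardware handing control to the OS.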

Memory organization
	- main bottleneck in program execution is memory access time
	- use high speed cache to store program text/data in high demand
	- need intelligent policies for keeping cache up to date
	- coordinate access to memory by different programs during execution
	  (memory usage for system/function calls)
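
One simple way to keep a cache consistent with memory is a direct-mapped cache with a write-through policy. The notes do not say which policy is intended, so this is only one possible choice; the sizes and the names `cached_read` and `cached_write` are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES 8	/* hypothetical size; one word per line */
#define MEM_WORDS   64

struct cache_line {
	bool valid;
	uint32_t tag;	/* which memory word currently occupies this line */
	uint32_t data;
};

struct memsys {
	struct cache_line line[CACHE_LINES];
	uint32_t mem[MEM_WORDS];
	unsigned hits, misses;
};

/* Read through the cache: a hit avoids the slow memory access,
   a miss fetches from memory and replaces whatever was in the line. */
uint32_t cached_read(struct memsys *s, uint32_t addr)
{
	assert(addr < MEM_WORDS);
	struct cache_line *l = &s->line[addr % CACHE_LINES];
	uint32_t tag = addr / CACHE_LINES;

	if (l->valid && l->tag == tag) {
		s->hits++;
		return l->data;
	}
	s->misses++;
	l->valid = true;
	l->tag = tag;
	l->data = s->mem[addr];
	return l->data;
}

/* Write-through: always update main memory, and also update the
   cached copy if this address is currently in the cache. */
void cached_write(struct memsys *s, uint32_t addr, uint32_t value)
{
	assert(addr < MEM_WORDS);
	struct cache_line *l = &s->line[addr % CACHE_LINES];

	s->mem[addr] = value;
	if (l->valid && l->tag == addr / CACHE_LINES)
		l->data = value;
}
```

Write-through keeps memory and cache identical at the cost of a memory access on every store; other policies trade that cost against more bookkeeping.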

	- memory interface coordinated by registers:
		- MAR: memory address register
		- MDR: memory data register
		- CMD: read/write
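
The register-based memory interface can be sketched as follows: the CPU fills in MAR, MDR and CMD, and the memory unit acts on them. The struct layout and function names are hypothetical, and a real interface would also involve bus timing that is ignored here.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 256

enum mem_cmd { CMD_READ, CMD_WRITE };

struct mem_interface {
	uint16_t mar;		/* memory address register */
	uint16_t mdr;		/* memory data register */
	enum mem_cmd cmd;	/* read or write */
};

/* One memory cycle: memory acts on whatever the CPU put in the registers. */
void memory_cycle(struct mem_interface *mi, uint16_t mem[MEM_WORDS])
{
	assert(mi->mar < MEM_WORDS);
	if (mi->cmd == CMD_READ)
		mi->mdr = mem[mi->mar];	/* data flows memory -> MDR */
	else
		mem[mi->mar] = mi->mdr;	/* data flows MDR -> memory */
}

/* CPU side of a load: set address and command, then collect MDR. */
uint16_t cpu_load(struct mem_interface *mi, uint16_t mem[MEM_WORDS], uint16_t addr)
{
	mi->mar = addr;
	mi->cmd = CMD_READ;
	memory_cycle(mi, mem);
	return mi->mdr;
}

/* CPU side of a store: set address, data and command. */
void cpu_store(struct mem_interface *mi, uint16_t mem[MEM_WORDS],
               uint16_t addr, uint16_t value)
{
	mi->mar = addr;
	mi->mdr = value;
	mi->cmd = CMD_WRITE;
	memory_cycle(mi, mem);
}
```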

Layout of executing program in memory

/* in main.c */
void foo(int a);	/* defined in foo.c */
int main (int argc, char *argv[]) {		----------- break
	int a = 1;				  DATA  |
	foo(a);					--------v--
	return 0;				  
}						--------^--
						  STACK |
						-----------
						   TEXT
						----------- 0
/* in foo.c */					
#include <stdio.h>
void foo(int a) {				
	printf ("%d\n", a);
}
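
To see where the regions of a running program actually sit, one can record an address from each. This is a sketch: `probe_layout` and `struct layout` are hypothetical names, converting a function pointer to an integer is implementation-defined, and the addresses and their relative order depend on the platform and vary between runs.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 1;	/* lives in the data region */

struct layout {
	uintptr_t text, data, heap, stack;
};

/* Record one address from each region of the running program. */
struct layout probe_layout(void)
{
	int local = 0;			/* lives in the current stack frame */
	void *block = malloc(16);	/* dynamically allocated data */
	struct layout l;

	assert(block != NULL);		/* sketch: no recovery from allocation failure */
	l.text  = (uintptr_t)&probe_layout;	/* implementation-defined conversion */
	l.data  = (uintptr_t)&initialized_global;
	l.heap  = (uintptr_t)block;
	l.stack = (uintptr_t)&local;
	free(block);
	return l;
}

/* Print the recorded addresses in hexadecimal. */
void print_layout(const struct layout *l)
{
	printf("text  0x%" PRIxPTR "\n", l->text);
	printf("data  0x%" PRIxPTR "\n", l->data);
	printf("heap  0x%" PRIxPTR "\n", l->heap);
	printf("stack 0x%" PRIxPTR "\n", l->stack);
}
```

Calling `probe_layout()` from `main` and passing the result to `print_layout` shows how far apart the regions are on a given system.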

Interrupts
	- don't want program in CPU to keep checking busy/done flags
	- this is known as "busy-wait" -- wastes processor cycles
	- instead, use device interrupts to signal that operation is completed
	- hardware includes interrupt request vector (equiv to 'done' flags)
	- control unit checks IRQ during instruction cycle
	- if set, processor clears IRQ and begins executing interrupt handler
	- but what happens to program?
	- need to save values of program registers and PC
	- add the following control registers to our CPU:
		- interrupt support registers
			- IRQ (interrupt request; which interrupt?)
			- IPC, IPSW (stores value of PC, PSW before interrupt)
			- IVEC (address where interrupt vector table is stored)
	- potential problems:
		- two or more interrupts occur in same cycle
		- an interrupt occurs for B while interrupt handler A is running
	- solution: disable interrupts while interrupt handler is running;
	  other requests stay pending in IRQ until interrupts are re-enabled
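
The interrupt check and the role of IRQ, IPC, IPSW and IVEC can be sketched as below. Several details are assumptions made for the example: the interrupt-enable bit `PSW_IE`, the fixed-priority rule (lowest number first) for simultaneous requests, and the function names. Saving the general purpose registers is left to the handler and is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQS 4
#define PSW_IE   0x02u	/* hypothetical interrupt-enable bit in the PSW */

struct cpu {
	uint32_t pc, psw;
	uint32_t irq;		/* one pending bit per interrupt source */
	uint32_t ipc, ipsw;	/* PC and PSW saved at interrupt entry */
	const uint32_t *ivec;	/* vector table: handler address per source */
};

/* Called by the control unit between instructions.
   Returns the interrupt number taken, or -1 if none was taken. */
int check_interrupts(struct cpu *c)
{
	assert((c->irq >> NUM_IRQS) == 0);
	if (!(c->psw & PSW_IE) || c->irq == 0)
		return -1;

	/* Several requests in the same cycle: take the lowest-numbered
	   one first (an assumed fixed-priority rule). */
	int n = 0;
	while (!(c->irq & (1u << n)))
		n++;

	c->irq &= ~(1u << n);	/* clear the request being serviced */
	c->ipc = c->pc;		/* save the interrupted program's PC and PSW */
	c->ipsw = c->psw;
	c->psw &= ~PSW_IE;	/* disable interrupts while the handler runs */
	c->pc = c->ivec[n];	/* continue at the handler's address */
	return n;
}

/* Return from interrupt: restoring the saved PSW also re-enables
   interrupts, so any request still pending is taken next. */
void return_from_interrupt(struct cpu *c)
{
	c->pc = c->ipc;
	c->psw = c->ipsw;
}
```

With a single IPC/IPSW pair, a second interrupt taken during a handler would overwrite the saved state, which is why interrupts stay disabled until the handler returns.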