Principles of Caching and Virtual Memory

Definitions:
- block: smallest unit of data that can be present (or absent) in the upper
  of two consecutive levels; usually fixed size
- hit, miss: memory access finds/doesn't find the desired data in upper level
- hit rate: relative frequency of hits (# hits / # memory accesses)
- miss rate: (1 - hit rate)
- miss penalty: time to replace an upper-level block with the corresponding
  lower-level one + time to deliver the block to the requesting device
  (e.g. CPU); calculated as: miss penalty = access time + transfer time


How it works


- primary memory partitioned into equal-sized frames
- address specified as: BLOCK-FRAME ADDR | BLOCK-OFFSET ADDR
- goal of OS design is to maximize performance
- high miss rate is obviously bad, but miss rate alone is not a sufficient
  measure of performance
- use average memory access time = hit time + miss rate * miss penalty
- how to determine ideal frame (block) size?
	- access time is independent of block size
	- transfer time grows linearly with block size
	- graph MISS PENALTY vs. BLOCK SIZE
	- larger blocks -> fewer blocks in memory (limited space)
		- exploits spatial locality: each miss brings in more
		  neighboring data, so misses drop at first
		- but with few blocks resident, accesses that alternate between
		  several blocks keep forcing replacements - poor support for
		  temporal locality
	-> miss rate drops with increasing block size, then gradually rises
	   again once blocks become too large
- CPU must have some mechanism to determine whether the needed info is in
  the upper level (hit) or must be fetched from the lower level (miss)
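
The two formulas above (miss penalty and average memory access time) can be
sketched in a few lines of Python. This is only an illustration: the function
names and all numbers (40-cycle access time, 0.5 cycles per byte, 5% miss rate)
are assumptions, not values from these notes.

```python
def miss_penalty(access_time, transfer_time_per_byte, block_size):
    """Miss penalty = access time (constant) + transfer time (linear in block size)."""
    return access_time + transfer_time_per_byte * block_size


def average_access_time(hit_time, miss_rate, penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * penalty


if __name__ == "__main__":
    # Hypothetical parameters, in clock cycles; chosen only for illustration.
    for block_size in (16, 32, 64, 128):
        penalty = miss_penalty(40, 0.5, block_size)
        print(block_size, penalty)
    # One sample point: 1-cycle hit time, 5% miss rate, 32-byte blocks.
    print(average_access_time(1, 0.05, miss_penalty(40, 0.5, 32)))
```

With these assumed numbers the miss penalty grows by the same amount for every
extra byte of block size, which is the straight line the MISS PENALTY vs.
BLOCK SIZE graph would show. The miss rate is passed in as a given here; the
sketch does not model how it changes with block size.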


The Big Questions


1. where is block placed? (placement)
2. how is a block found? (identification)
3. which block should be replaced on a miss? (replacement strategy)
4. what happens on a write? (write strategy)
	- for cache, write through to main memory?
	- for VM, when do dirty pages get written?

Block Placement:
	- direct mapped: block can be in only one place in cache:
		location = BFA mod #blocks-in-cache
		(BFA = block-frame address)
	- fully associative: block can be anywhere
	- set associative: block can be anywhere within one set of places
	  (2 or more places per set):
		set = BFA mod #sets-in-cache
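
The three placement schemes can be treated as one rule with a different number
of places per set. The sketch below is a minimal illustration; the function
name candidate_frames and the assumption that the places of one set are
numbered contiguously are mine, not part of these notes.

```python
def candidate_frames(bfa, num_blocks, ways):
    """Return the cache locations where the block with address `bfa` may go.

    ways == 1           -> direct mapped (one possible location)
    ways == num_blocks  -> fully associative (any location)
    otherwise           -> set associative (`ways` locations in one set)
    """
    if ways < 1 or num_blocks % ways != 0:
        raise ValueError("ways must be positive and divide num_blocks")
    num_sets = num_blocks // ways
    set_index = bfa % num_sets      # BFA mod #sets-in-cache
    # Assumption: set s occupies locations s*ways .. s*ways + ways - 1.
    first = set_index * ways
    return list(range(first, first + ways))


if __name__ == "__main__":
    # Block-frame address 12 in a hypothetical 8-block cache.
    print(candidate_frames(12, 8, 1))   # direct mapped
    print(candidate_frames(12, 8, 2))   # 2-way set associative
    print(candidate_frames(12, 8, 8))   # fully associative
```

With one place per set the rule reduces to BFA mod #blocks-in-cache, and with
a single set every location is a candidate.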

Identification:
	- consider simple direct mapped scheme
	- sample physical address consists of 32 bits
	- 16 low-order bits broken down into
		- 14 bit block location (which block # in cache)
		- 2 bit byte offset within block (block size = 4 bytes;
		  cache size = 2^14 blocks * 4 bytes = 65536 bytes)
	- 16 high-order bits used as identifying tag
		- if the physical address tag matches the tag associated
		  with this block in the cache, we have a hit, else miss
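
The identification steps above can be sketched as follows, using the same
16/14/2 bit split. The names split_address and DirectMappedCache are
hypothetical, and the cache tracks tags only (no data), which is enough to
tell hits from misses.

```python
OFFSET_BITS = 2    # byte offset within a 4-byte block
INDEX_BITS = 14    # which of the 2**14 block locations in the cache
TAG_BITS = 16      # high-order bits of the 32-bit physical address


def split_address(addr):
    """Split a 32-bit physical address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)
    return tag, index, offset


class DirectMappedCache:
    """Tag store of a direct-mapped cache; data storage is omitted."""

    def __init__(self):
        # None marks a block location that holds nothing yet.
        self.tags = [None] * (1 << INDEX_BITS)

    def access(self, addr):
        """Return True on a hit; on a miss, install the new tag and return False."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag   # replace whatever block was there
        return False


if __name__ == "__main__":
    cache = DirectMappedCache()
    for addr in (0x12345678, 0x1234567B, 0xABCD5678, 0x12345678):
        print(hex(addr), split_address(addr), cache.access(addr))
```

The second address differs from the first only in its byte offset, so it lands
in the same block; the third has the same block location but a different tag,
so it displaces the first block.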