Storage Basics P. J. Denning For CS471/571 2001, P. J. Denning
All computers have memory hierarchies computational store (e.g., RAM, cache) long-term store (e.g., disks, CDs, tapes) Why? Speed and cost tradeoffs Different types of storage media Removable and portable storage media Volatility of RAM, persistence of disk Multiprogramming for greater efficiency of resource usage 2001, P. J. Denning 2
Multiprogramming Partition main memory among active jobs Fixed or variable partition? Fragmentation? (Checkerboarding) Hardware address translation? Hardware memory protection? 2001, P. J. Denning 3
1 2 3 4 unused yes four blocks compacted together occupying a multiprogrammed memory; unused portion at end new block 1 2 3 4 no same blocks scattered; unused space is fragmented and cannot permit loading the new block. 2001, P. J. Denning 4
Implications for Programmers Overlay problem Divide program into blocks Divide program time into phases Identify blocks that need to be present in each phase (subject to memory limit) Schedule fetches and replacements 40-50% of programming time on this 2001, P. J. Denning 5
space (segments) LOCALITY DIAGRAM time (phases) 2001, P. J. Denning 6
Blocks needing to be present: locality set of a phase. Fetch: insert instruction to bring block from disk to RAM Pre-fetch? Demand fetch? Replace: insert instruction to copy block from RAM to disk Pre-replace? Demand replace? Overlay sequence depends on memory size limit 2001, P. J. Denning 7
Paging Invented 1959 on Atlas Computer at University of Manchester Divide programs and data into fixed size blocks called pages Divide memory into fixed size (same) blocks called frames 2001, P. J. Denning 8
Paging 2 No (external) fragmentation Fetch on demand (cheaper than error-prone pre-fetching) Map address space to memory space 2001, P. J. Denning 9
Paging 3 artificial contiguity logical partitioning pages frames address space 2001, P. J. Denning 10
24 8 page no i line no x page table PT frame no f line no x 8 8 2001, P. J. Denning 11
2 24 pages 2 24 8 8 bytes/page page no i line no x PT frame no f line no x 8 8 2 8 frames 2001, P. J. Denning 12
24 8 page no i line no x VIRTUAL ADDRESS PT MAP frame no f line no x REAL ADDRESS 8 8 2001, P. J. Denning 13
PT[d] d = domain number each process has a domain i = page number P = presence bit i P M U A f M = modified bit U = used bit A = access code f = frame number 2001, P. J. Denning 14
CPU RAM dom d PT[d] i addr (i,x) x f up down 2001, P. J. Denning 15
CPU generates address (i,x) Objective: map to (f,x) in memory PT for CPU s domain d in RAM Copy of entire address space in file on disk (the swap file or cache file ) 2001, P. J. Denning 16
CPU RAM dom d PT[d] i addr (i,x) MMU (f,x) page fault f x up down 2001, P. J. Denning 17
Memory Mapping Unit (in CPU) translates microcommands (RW, i,x) On (RW,i,x) do E = PT[d,i] if not (E.A allows RW) then ACCESS FAULT if E.P=0 then PAGE FAULT (E.U, E.M) = (1, if RW=r then 0 else 1) PT[d,i]=E generate (PT.f, x) 2001, P. J. Denning 18
CPU RAM dom d PT[d] i addr (i,x) TLB MMU (f,x) page fault f x up down (i,f) 2001, P. J. Denning 19
Translation Look Aside Buffer (TLB) is cache of most recent address paths Holds most recent (i,f) pairs (Actually (M,i,f) triplets.) Bypass a PT lookup if a hit occurs. Hit ratio h. Miss ratio (1-h). Average mapping time: T = h*tlb + (1-h)*RAM Easy to get below 3% with a few hundred TLB entries 2001, P. J. Denning 20
CPU RAM dom d PT[d] i addr (i,x) TLB MMU (f,x) page fault f x up down (i,f) PFH 2001, P. J. Denning 21
Page Fault Handler (PFH) resolves missing page exceptions (i,x) for CPU in domain d: Using replacement policy: select frame (f); if frame modified, copy its occupying page to its disk home -- a down move. Load missing page (i) into that frame (f) -- an up move. Update PT[d,i]=(1,0,0,A,f) Return to interrupted process; it retries address and now succeeds Structure PFH to eliminate as many of the swaps as possible. 2001, P. J. Denning 22
Paging Manager contains the PFH routine and two daemons: Replace Daemon identifies pages to replace, marks them P=0, and signals WriteBack Daemon for pages with M=1. Replace Daemon targets to keep pool at a given minimum size. WriteBack Daemon copies (P,M)=(0,1) pages to disk, then sets M=0. 2001, P. J. Denning 23
Because many processes can encounter page faults in parallel, the PFH gets pages from a semaphore-protected pool and notifies Replace Daemon for each withdrawal. Replace Daemon may replace multiple pages to replenish pool. When it replaces a modified page, it notifies WriteBack Daemon. WriteBack Daemon copies each page back to disk and then puts it in the pool. 2001, P. J. Denning 24
FRAMES SETS (0,0) h t (0,1) h t f d i link (1,1) (1,0) h t h t Paging Manager maintains a FRAMES table telling what page occupies each frame. Each frame linked to one of four sets according to their (P,M) values. A missing page can be reclaimed without swap if FRAMES[PT[d,i].F]=(d,i). 2001, P. J. Denning 25
FRAMES SETS DISKMAP (0,0) h t (0,1) h t f d i link (1,1) (1,0) h t h t (d,i) DA PM also maintains a DISKMAP that tells the disk address of every page. 2001, P. J. Denning 26
The POOL contains all frames that are POOL available to receive the next incoming page. page fetch / reclaim (0,0) (1,0) replacement write back page write (0,1) Frames are divided into sets by (P,M) combinations. Paging Manager manages the state transitions. The reclaim action occurs on a page fault when page is reclaimable; it simply sets P=1. reclaim replacement (1,1) 2001, P. J. Denning 27
PFH(d,i): WAIT(pool) WAIT(pool) if if FRAMES[f=PT[d,i].f]=(d,i) then then {PT[d,i].P {PT[d,i].P = 1; 1; unlink unlink f from from SET[0,0]} SET[0,0]} else else { f = unlink unlink first first frame frame of of SET[0,0] SET[0,0] enter enter (self-pid, (self-pid, r, r, f, f, DISKMAP[d,i]) DISKMAP[d,i]) in in Disk Disk Work Work Queue Queue WAIT(self-sem) WAIT(self-sem) PT[d,i] PT[d,i] = (1,0,0,A,f) (1,0,0,A,f) poolsize-- poolsize-- SIGNAL(replace) } link link f to to SET[1,0] SET[1,0] return return Disk Driver protocol: enter request (self-pid, rw, a, b) in Disk Work Queue and wait on self-pid's private semaphore. Parameter a in caller's address space and b is a disk address. When Disk Driver done, it signals the private semaphore. 2001, P. J. Denning 28
Replace Daemon: 1: 1: WAIT(replace) WAIT(replace) while while poolsize poolsize < threshold threshold do do { use use replacement replacement rule rule to to unlink unlink frame frame f from from SET(1,-) SET(1,-) PT[FRAMES[f]].P=0 if if PT[FRAMES[f]].M = 1 then then sendmsg(writeback, f) f) else else {link {link f to to SET[0,0]; SET[0,0]; poolsize++; poolsize++; SIGNAL(pool)} SIGNAL(pool)} } goto goto 1 The system call sendmsg(p,m) places message m in the inbox of process p. The system call m=getmsg() in process p returns the message; or waits if the inbox is empty. 2001, P. J. Denning 29
WriteBack Daemon: 1: 1: f = getmsg() getmsg() enter enter (self-pid, (self-pid, w, w, f, f, DISKMAP[FRAMES[f]]) in in Disk Disk Work Work Queue Queue WAIT(self-sem) WAIT(self-sem) PT[FRAMES[f]].M = 0 link link f to to SET[0,0] SET[0,0] poolsize++ poolsize++ SIGNAL(pool) SIGNAL(pool) goto goto 1 The frame f had been marked not present by ReplaceDaemon. When copy back to disk is complete, f can be linked to the POOL. 2001, P. J. Denning 30
The Paging Manager is free of deadlock because every WAIT is followed by a SIGNAL that replenishes the count of the semaphore. In the PFH, the WAIT(pool) is followed by SIGNAL(replace); the Replace Daemon SIGNAL(pool) for an unmodified replacement page and WriteBack Daemon SIGNAL(pool) for a modified page. This structure gives parallelism between satisfying a fault, replenishing the pool, and writing a page back to disk. This increases the probability the pool is nonempty at the time of a fault. 2001, P. J. Denning 31
OTHER MAPPING METHODS Paging Segmentation Segmentation and paging Capability addressing 2001, P. J. Denning 32
Paging Address space linear -- all addresses integers from 0 to a max. Memory space also linear. Both address and memory space divided into equal size blocks -- pages in address space, frames in memory space. Pages and frames not visible to programmer. Mapping is simple and fast. 2001, P. J. Denning 33
Segmentation Address space is nonlinear (2-D) consisting of segments, each a definite size. Segments are good containers for objects defined in the program. Compiler assigns objects to segments. Addresses are of the form (s,x) -- segment number, offset. Offset cannot exceed segment length. Memory is linear, segments stored as contiguous blocks. External fragmentation a problem. 2001, P. J. Denning 34
CPU RAM dom d ST[d] s (P,M,U,A,B,L) addr (s,x) (s,x) B x L MMU TLB segment fault up down (s,b,l) SFH 2001, P. J. Denning 35
Segmentation Modes Automatic Mode: Programmer unaware of segment boundaries; compiler assigns objects to segments. Burroughs Algol machines (B5000, B6700) Manual Mode: Programmer aware of segments; gives symbolic names (S,X) in program; compiler translates to internal (s,x) codes. Multics (GE645) No major machines today 2001, P. J. Denning 36
Dynamic Linking In manual mode, new possibility arises: program contains symbolic reference (S,X) but no segment number has been assigned to S; S is a file and X a variable within S. Compiler can leave the symbolic reference in the code. When it is encountered for the first time, system generates a linkage fault. Linkage fault handler copies file S into a new segment s, creates ST entry for s, replaces the symbolic reference with its segment number, and returns control to the program. 2001, P. J. Denning 37
Segmentation + Paging Combines the two by dividing segments into pages and giving each segment its own page table. Eliminates the external fragmentation problem caused by variable size segments in RAM. More complex mapping with both ST and PT. 2001, P. J. Denning 38
CPU RAM dom d ST[d] PT[t] s (A,t) i (P,M,U,F) addr (s,x)=(s,(i,y)) TLB MMU (f,y) page fault f y up down (s,t,i,f) PFH 2001, P. J. Denning 39
MMU-TLB MMU-TLB interaction interaction TLB TLB entries entries (s,t,i,f) (s,t,i,f) MMU MMU interrogates interrogates TLB TLB with with key key (s,i) (s,i) Bypasses Bypasses ST ST and and PT PT lookups lookups if if it it finds finds a a full full match; match; then then it it gets gets f f immediately. immediately. Bypasses Bypasses ST ST lookup lookup if if it it finds finds partial partial match match on on segment segment number number s; s; then then it it must must lookup lookup PT PT entry entry i i to to get get f. f. Does Does both both ST ST and and PT PT lookups lookups if if it it finds finds no no match match on on segment segment number number s; s; then then it it must must lookup lookup both both ST ST and and PT PT entries entries to to get get f. f. Then Then MMU MMU generates generates its its output output (f,y). (f,y). 2001, P. J. Denning 40
Problem of Sharing How do two processes share a segment? Desire to share comes up after programs are written. Can we share without recompiling the programs? Difficult with pure segmentation. The parties do not need the same segment number because both their ST entries can point to the same (B,L) region. But if OS needs to move the segment or swap it out, it has to locate all ST and update their entries. Easier with segmentation-paging. The parties have different segment numbers pointing to the same PT. Any change to a page s position is recorded once, in the PT entry. All parties will see the change immediately. 2001, P. J. Denning 41
ST[d1] s1 B ST[d2] L s2 change change the the placement placement or or size size of of the the segment segment ==> ==> back-propagate back-propagate the the new new info info to to all all ST s ST s pointing pointing to to the the segment. segment. 2001, P. J. Denning 42
ST[d1] s1 t PT[t] f ST[d2] f s2 t Change Change the the placement placement of of the the page page ==> ==> Update Update just just one one PT PT entry. entry. No No back-propagation. back-propagation. 2001, P. J. Denning 43
Capability Addressing Generalize from segments to objects. Facilitate object sharing with location transparency. Capabilities = object handles recognized throughout a system or network; protected from alteration. Three-level mapping: symbolic name to capability, capability to descriptor, descriptor to object. 2001, P. J. Denning 44
OT[d1] o1 x DT B L OT[d2] x (B,L) o2 x Change Change the the placement placement or or size size of of the the object object x x ==> ==> o1 o1 and and o2 o2 are are compilerassigned compilerassigned codes codes for for symbolic symbolic names names chosen chosen by by the the user. user. Update Update just just one one descriptor descriptor (x). (x). DT DT a a large large hash hash table. table. Object Object ids ids (x) (x) are are used used just just once. once. 2001, P. J. Denning 45
Modern Capability Examples Object-oriented programming: Run-time system generates handles, which map through descriptors to objects. User chooses symbolic names for objects; run-time system associates those names with their handles. Three-level map: name handle handle descriptor descriptor object. 2001, P. J. Denning 46
E-Mail Systems: Handle: user@host (unique) Three-level mapping: Alias to handle (address book) Handle to IP address (DNS service) IP address to mailbox (IP) 2001, P. J. Denning 47
Handle Protocol (see handle.org): Give objects a location-dependent, unique name in the Internet. Handle: an object ID chosen within a hierarchical system that makes handle-ids unique. Three-level mapping: Document text reference to handle (handle://unique-id) Handle-id to hostname/pathname (handle server) hostname/pathname to object (http) 2001, P. J. Denning 48
2001, P. J. Denning 49