OS Kernels
Lecture 1
-
Will require C!
-
Will have a complicated C assignment
-
Office Hours in 105 Schorr, Tuesday 10:30-12:30
-
more low-level
-
HW (35%)
-
2 programming assignments
-
2 problem solving
-
-
Project (15%)
-
Pair programming
-
Trio programming (requires additional components)
-
-
Midterms 3x (40% total)
-
Quizzes (10%)
-
Tips
-
Get to know classmates
-
participate!!
-
Prior to due dates, class time will be spent answering questions
-
-
Goals
-
knowledge of internals
-
OS roles
-
interaction with hardware
-
systems programming
-
memory management
-
concurrency
-
practical OS experience
-
-
Know what a process is
-
And how multiple processes work
-
thinking about shared resources
-
And things like race conditions
-
Thread-safety is important – all code must be thread safe!
Lecture 2: Review – Cache Memories
-
OS Objectives
-
Provides convenience
-
simplifies I/O
-
simplifies user interfaces
-
simplifies portability
-
-
manages resources
-
memory
-
processes
-
I/O devices
-
communications media
-
-
ease of evolution
-
support new hardware components
-
provides new services
-
supports maintenance
-
-
-
Major portion on resource management
-
review of memory hierarchy and cache
Memory Hierarchy and Cache
-
Memory Hierarchy
-
Registers
-
L1 Cache (SRAM)
-
L2 Cache (SRAM)
-
Main Memory (DRAM)
-
Local storage
-
remote storage
-
-
Levels nearer the top are smaller in capacity and faster, but more expensive per byte
-
rule of locality – bring things closer as you use them
-
The Cache
-
A small, fast storage device – a staging area for data in a larger, slower device
-
for each level $k$, the faster, smaller device at level $k$ serves as a cache for the larger, slower device at level $k+1$
-
Hierarchies work because of locality – programs tend to access data at level $k$ more often than at level $k+1$ – thus storage at level $k+1$ can be slower, and thus larger and cheaper per bit
-
The memory hierarchy creates a large pool of storage that costs as much per byte as the cheap storage near the bottom, while serving data to programs at the rate of the fast storage near the top
-
SRAM – managed automatically in hardware – holds frequently accessed blocks of main memory
-
CPU looks for data in caches first, then in the main memory
-
data is copied in transfer blocks
-
Requires various ways of managing which blocks are in the cache and which aren't
-
-
Cache concepts
-
Cache hits – When the cache has the block
-
Cache misses – when the cache doesn't have the block
-
Get from main memory
-
store in cache – using placement (determines where a block goes) and replacement (determines which block gets evicted) policies
-
Types
-
Cold start / Compulsory – cache is empty
-
conflict miss – the placement policy restricts blocks from level $k+1$ to a small subset of the positions at level $k$
-
occurs when the level $k$ cache is large enough, but multiple objects map to the same level $k$ block
-
-
capacity miss – occurs when the set of active blocks (the working set) is larger than the cache
-
-
-
Faults – when the needed data isn't even in main memory
-
Cache Organization
-
$S$ – number of sets
-
$E$ – lines (ways) per set
-
$B$ – block size in bytes
-
Each line has $v$ (valid bit), a tag, and $B$ bytes of data
-
Cache size $C = S \times E \times B$
-
Address is Tag, Set, Block offset
-
The cache operates on physical addresses
-
the CPU generally issues virtual addresses, not physical addresses
-
To read:
-
$B = 8$
-
$S = 4$
-
$E = 1$
-
Locate Set
-
Locate matching tag, and valid
-
Locate data starting at the offset – see the decomposition sketch below
-
-
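A minimal sketch of this decomposition for the example parameters above ($B = 8$, $S = 4$, so 3 offset bits and 2 index bits); the address value is made up for illustration:

    #include <stdint.h>
    #include <stdio.h>

    #define OFFSET_BITS 3   /* log2(B), B = 8 */
    #define INDEX_BITS  2   /* log2(S), S = 4 */

    int main(void) {
        uint16_t addr = 0x0D4;  /* example 12-bit physical address */
        unsigned offset = addr & ((1u << OFFSET_BITS) - 1);
        unsigned set    = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
        unsigned tag    = addr >> (OFFSET_BITS + INDEX_BITS);
        /* lookup: index into the set, compare tags, check valid bit,
           then read starting at the byte offset */
        printf("tag=%u set=%u offset=%u\n", tag, set, offset);
        return 0;
    }
-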
Direct mapped cache, $E = 1$
-
when no match, evict line and replace
-
$E$-way set associative cache – e.g., two lines per set; more compact storage, but requires comparing the tag against every line in the set – a miss is a bit harder to handle, since a replacement policy must pick which line in the set to evict
Lecture 3
Cache – Continued
-
understand matching and replacement
-
replacement policies – random, least recently used, etc
-
writes – updates must eventually reach the original memory
-
write-through immediately writes to memory
-
write-back defers writing until the line is replaced – needs a dirty bit
-
write-miss policies
-
write-allocate – load into cache, update line
-
no-write-allocate – immediately write to memory
-
-
miss rate, hit time, miss penalty are the performance metrics
-
make sure to write s.t. the common case goes fast
-
minimize misses in the inner loops
-
repeated refs to vars are good
-
stride-1 reference patterns are good – see the loop-ordering sketch below
-
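A sketch (not from the lecture) of why stride-1 matters: both functions sum the same $N \times N$ matrix, but the row-wise version walks memory in layout order and hits in cache far more often:

    #define N 1024
    double a[N][N];

    double sum_rowwise(void) {        /* stride-1: good locality */
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    double sum_colwise(void) {        /* stride-N: poor locality */
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }
-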
Problems
-
physical address only
-
memory is byte addressable
-
access to 1 byte words
-
physical address is 12 bits wide
-
the cache is physically addressed and direct mapped, with a 4-byte line and 16 total sets
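-
Worked decomposition for these parameters: block offset $= \log_2 4 = 2$ bits, set index $= \log_2 16 = 4$ bits, tag $= 12 - 4 - 2 = 6$ bits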
Lecture 4: Virtual Memory
-
lets a computer think it has more memory than it actually does
-
helps to deal with a lot of more basic stuff
-
physical addressing – used in small systems – embedded microcontrollers for things like cars, elevators, etc
-
can be an issue – a set amount of memory is provided, no more, no less
-
512 MB means 512 MB
-
programs are machine dependent
-
running multiple programs becomes very difficult
-
-
virtual memory uses an MMU to translate a virtual address to a physical address
-
means a program doesn't need to care how much physical memory the machine has
-
-
linear address space – ordered set of contiguous, non-negative integer addresses
-
virtual address space – set of $N = 2^n$ virtual addresses
-
physical address space – set of $M = 2^m$ physical addresses
-
difference between data and attributes
-
an object can have multiple addresses
-
every byte in main memory has one PA and one or more VAs
-
virtual memory uses main memory efficiently
-
simplifies memory management
-
isolates address space – one process can't interfere with another's, and a user program cannot access privileged kernel information
Virtual memory as a cache
-
VM is an array of $N$ contiguous bytes stored on disk
-
contents of the array are cached in physical memory
-
blocks are called pages (size $P = 2^p$ bytes)
-
pages are of fixed, but configurable size
-
Given an address space of $2^n$ bytes and a page size of $2^p$ bytes, you have $2^{n-p}$ pages
-
keeping track of the mapping requires software support; it can't all be done in hardware – the bookkeeping structure is the page table
-
write-back, not write-through
-
Page tables
-
an array of page table entries (PTEs) that map virtual pages to physical pages
-
per-process kernel data structure, in DRAM
-
Has valid bit, keeps physical page number or disk address
-
a clear valid bit signals the page is on disk or unallocated
-
works because of locality
-
the set of active pages is called the working set
-
if working set < main memory size: good performance for one process after the initial misses
-
if the sum of working sets > main memory size: thrashing, lots of swapping
-
Virtual memory for memory management
-
each process has its own virtual address space – own page table
-
views memory as a simple linear array
-
mapping function scatters through physical memory
-
multiple processes can map virtual pages to the same physical page (sharing)
-
and can allow a single library to be used for multiple programs
-
can make linking and loading easier
-
kernel vm is shared
-
linking maps programs into the common structure
-
loading allocates and copies things as needed
Lecture 5: Virtual Memory
-
homework is a tool
-
quiz based
-
midterm will be electronic?
-
first midterm and a programming assignment coming in a week to a week and a half
Virtual memory for memory management
-
bring data up from disk in page-sized chunks!
-
everything has its own virtual address space
-
mapping scatters addresses through physical memory
-
well chosen mappings simplify allocation and management
-
includes protecting sections from execution
Memory protection
-
per-page protection bits – supervisor, read, write – plus the block address; a violating access causes a fault
Address Translation
-
$N = 2^n$ number of addresses in virtual address space
-
$M = 2^m$ number of addresses in physical memory
-
$P = 2^p$ page size
-
TLBI – TLB index
-
TLBT – TLB tag
-
VPO – virtual page offset
-
VPN – virtual page number
-
PPO – Physical page offset
-
PPN – Physical page number
-
CO – byte offset
-
CI – cache index
-
CT – cache tag
-
given base address, use page number and offset
-
the MMU translates based on the page table entry for the address – see the translation sketch below
-
VM and cache together make things faster/more able to handle a lot of memory
-
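A toy translation sketch (a minimal model under assumed parameters – 4 KB pages and a 256-entry single-level table – not a real MMU):

    #include <stdint.h>

    #define PAGE_BITS 12           /* 4 KB pages: VPO = PPO = low 12 bits */
    #define NUM_PAGES 256          /* assumed size of this toy address space */

    typedef struct {
        int valid;                 /* is the page resident in physical memory? */
        uint64_t ppn;              /* physical page number */
    } pte_t;

    pte_t page_table[NUM_PAGES];   /* per-process, lives in kernel memory */

    /* Returns the physical address, or -1 to indicate a page fault. */
    int64_t translate(uint64_t va) {
        uint64_t vpn = va >> PAGE_BITS;               /* virtual page number */
        uint64_t vpo = va & ((1u << PAGE_BITS) - 1);  /* virtual page offset */
        if (vpn >= NUM_PAGES || !page_table[vpn].valid)
            return -1;             /* fault: the OS must fetch the page */
        return (int64_t)((page_table[vpn].ppn << PAGE_BITS) | vpo);
    }
-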
TLB – another cache in the MMU; maps VPNs to PPNs; contains complete page table entries for a small number of pages
-
multilevel page tables – a flat table would need 512 GB for 48-bit addresses with 8-byte PTEs at a 4 KB page size ($2^{36}$ pages $\times$ 8 bytes)
-
paging the page table!
-
page tables that list page tables
-
Lecture 6: Virtual Memory
Page faults
-
small page size – less internal fragmentation
-
more pages required
-
larger page tables
-
large programs mean page tables may themselves be in virtual memory
-
rotational characteristics of disks favor a larger page size
-
-
4 KB is the most common size
-
internal fragmentation
-
10 seats, 10 students - no fragmentation
-
20 seats, 10 students - fragmentation of 10
-
40 seats, 10 students - fragmentation of 30
-
-
small resident set with small pages – high page fault rate, but faster to fill
-
large size, large pages – low page fault rate, slower to fill
-
many pages in process, large page size, low page fault
-
contemporary techniques decrease locality of references within a process
-
Fetch policy – when pages should be brought into memory
-
demand paging – bring a page in only when it is needed
-
many page faults when a process is first started
-
-
prepaging – load what is needed, and then a bit more
-
exploits characteristics of secondary memory devices
-
more efficient to bring in a number of pages at one time
-
ineffective if extra pages aren't referenced
-
-
-
placement policy
-
determines where stuff is to be put
-
-
replacement policy
-
deals with selection of a page in main memory to be replaced
-
objective is to remove the page least likely to be referenced in the near future
-
more elaborate policies require greater overhead
-
-
frame locking
-
a frame that must stay in memory at all times
-
the kernel, and key control structures
-
i/o buffers and time-critical areas may be locked
-
locking is achieved by a lock bit associated with the frame
-
-
algorithms
-
optimal (Belady's policy)
-
selects the page whose time to next reference is longest; in the textbook example, it produces three page faults after the frame allocation has been filled
-
-
FIFO
-
treats page frames as a circular buffer
-
first thing in gets pushed out
-
-
least recently used
-
based on locality
-
least likely to be used is the least recently used
-
difficult to implement
-
-
clock policy
-
requires a use bit per frame
-
set when the page is loaded
-
set again on each reference
-
a frame is replaced only if its use bit is 0
-
the sweeping pointer clears use bits as it passes
-
-
-
-
Desirable to maintain as many processes in memory as possible
-
frees programmers from size restrictions
-
virt mem
-
references are logical, translated
-
breaks processes into pieces
-
two approaches are paging and segmentation
-
management scheme requires both hardware and software
-
Lecture 7: Virtual Memory
-
Consider a paged VM system with 32-bit virtual addresses and 2K-byte pages. Each page table entry requires 32 bits. Each page table is exactly one page (2K bytes)
-
how many levels of page tables
-
how many entries does the table use at each level (at one level, the table uses a smaller number of entries)
-
the smaller number of entries can be used at the top or bottom. Which strategy uses the least number of pages? Explain by calculating the number of pages for both
-
total virtual pages: $2^{32}/2^{11} = 2^{21}$
-
PTEs per page table: $2^{11}/2^2 = 2^9 = 512$, so 3 levels are needed, splitting the 21 VPN bits as $9 + 9 + 3$
-
entries per level: 512, 512, 8
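-
One way to answer the last part (a worked calculation, not from the notes): if the 3-bit (8-entry) level sits at the top, fully mapping the space needs $1 + 8 + 8 \times 512 = 4105$ table pages; if it sits at the bottom, it needs $1 + 512 + 2^{21}/8 = 262{,}657$ table pages, since each 8-entry bottom table still occupies a full page. Putting the smaller level at the top uses far fewer pages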
-
Dynamic Memory allocation
-
basic concepts
-
malloc etc acquires vm at run time
-
manage an area of process vm known as the heap
-
allocator maintains heap as collection of variable sized blocks, either allocated or free
-
types
-
explicit – malloc and free in C
-
implicit – application allocates, doesn't free, GC in Java, ML, Lisp
-
-
-
void *malloc(size_t size) – returns a void pointer to a memory block of at least size bytes, aligned to an 8-byte boundary (usually); on failure returns NULL and sets errno
-
void free(void *p) – returns the block pointed at by p to the pool of available memory – p must come from a previous call to malloc
-
sbrk is an internal function used to grow the heap
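-
A minimal sketch of the malloc/free contract in use (checking for NULL since allocation can fail):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        size_t n = 1000;
        int *p = malloc(n * sizeof *p);   /* at least n ints, 8-byte aligned */
        if (p == NULL) {                  /* allocation can fail: check first */
            perror("malloc");
            return 1;
        }
        for (size_t i = 0; i < n; i++)
            p[i] = (int)i;
        free(p);                          /* p must not be used after this */
        p = NULL;                         /* defensive: avoid a dangling pointer */
        return 0;
    }
-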
memory is word addressed
-
applications – can issue an arbitrary sequence of malloc and free
-
free must be called on a malloc'd block
-
allocator can't control the number or size of allocated blocks
-
must respond immediately to malloc
-
must allocate from free memory
-
must align blocks to satisfy all alignment requirements (8 byte for GNU Malloc)
-
can manipulate and modify only free memory
-
can't move the allocated blocks once malloc'd
-
given sequence of malloc and free,
-
maximize throughput and peak memory utilization
-
throughput is number of completed requests per unit time
-
payload, malloc(p) results in a block with payload of p bytes
-
aggregate payload is the sum of the currently allocated payloads
-
current heap size is non-decreasing
-
peak utilization after $k$ requests = max aggregate payload over current heap size
-
internal fragmentation is when the payload is smaller than the block size
-
enough aggregate memory, but not enough contiguous memory, is external fragmentation
-
compacting the heap can be done, but requires a tremendous amount of work – a linear search of the heap and introspection into memory
-
size is embedded in a block
-
keep track of free blocks – some structure in the heap
-
picking blocks for allocation
-
how are freed blocks reinserted?
-
standard block – length of the block in the word preceding the block (the header field, or header); requires an extra word for every allocated block
-
implicit list using length (links all blocks)
-
explicit list (lists free blocks)
-
segregated free list
-
different free lists for different size classes
Lecture 8: Malloc
-
how much memory to free – the word before the block holds the size
-
implicit list - use length/links of all blocks – implemented in K&R
-
segregated free list – free list for different size classes
-
blocks sorted by size – use an RB tree with pointers to each block, length used as key
-
The implicit list
-
size and allocation status – could store in two words, but this is wasteful
-
use one word for block size; the low bit holds allocation status (works because block sizes are multiples of 8, so the low bits are always zero) – see the header-packing sketch below
-
status word, payload, padding fulfill alignment
-
alignment with 8
-
first fit: while not past the end, and the block is allocated or too small, go to the next block
-
next fit – like first fit, but starts where the previous search finished
-
best fit – looks for the smallest free block of appropriate size
-
allocation in a free block – split a block
-
freeing – simply clear allocated flag
-
or coalesce – check whether the next/previous block is free
-
boundary tags – size/status at both ends of a block let you traverse the list backwards; requires extra space; an important and general technique [Knuth 73]
-
extra internal fragmentation with boundary tags
-
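A sketch of the header-packing and first-fit ideas above, in the style of the CS:APP allocator (macro and function names here are illustrative):

    #include <stddef.h>

    /* Sizes are multiples of 8, so the low 3 bits of the header are free
       to hold the allocated flag. */
    #define PACK(size, alloc)  ((size) | (alloc))       /* build a header word */
    #define GET_SIZE(h)        (*(size_t *)(h) & ~0x7)  /* mask off status bits */
    #define GET_ALLOC(h)       (*(size_t *)(h) & 0x1)   /* low bit = allocated? */

    /* First fit over the implicit list; assumes a well-formed heap whose
       blocks tile [heap_start, heap_end). */
    static void *first_fit(char *heap_start, char *heap_end, size_t asize) {
        char *bp = heap_start;
        while (bp < heap_end) {
            size_t size = GET_SIZE(bp);
            if (!GET_ALLOC(bp) && size >= asize)
                return bp;                 /* free and big enough: take it */
            bp += size;                    /* step to the next block's header */
        }
        return NULL;                       /* no fit: must extend the heap */
    }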
-
-
Placement – first fit, next fit, best fit
-
splitting – how much internal fragmentation
-
coalescing – immediate or deferred
-
often used in special purpose apps, easy to implement
-
-
The explicit free list
-
allocated blocks are as before; a free block has size, allocation status, and next/previous pointers stored in the first two words of the free payload
-
allocation requires updating the previous and next pointers of several blocks
-
faster when memory is more full
-
more complex free/allocate because blocks need to be spliced in/out of the list
-
extra space required by links
-
commonly used with segregated free lists – see the node sketch below
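-
A sketch of an explicit free-list node and an O(1) splice, assuming the pointers live inside the free block's payload (names illustrative):

    #include <stddef.h>

    typedef struct free_block {
        size_t header;              /* size | allocated bit, as before        */
        struct free_block *next;    /* next free block (any order, e.g. LIFO) */
        struct free_block *prev;    /* previous free block                    */
        /* ... remaining payload bytes, boundary tag at the end ...           */
    } free_block_t;

    /* Splice a block out of the doubly linked free list in O(1). */
    static void splice_out(free_block_t **root, free_block_t *b) {
        if (b->prev) b->prev->next = b->next;
        else         *root = b->next;       /* b was the head of the list */
        if (b->next) b->next->prev = b->prev;
    }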
-
Lecture 9: Malloc Continued
-
programming assignment coming
-
Explicit List
-
splice out predecessor and successor blocks, insert new block at root of list, coalesce if needed
-
explicit – only include free blocks
-
linked lists are mostly used in segregated free lists
-
segregated free list
-
blocks sorted by size – rb tree, or similar to use length as a key
-
size classes of blocks have own free list
-
large sizes – one class for each power of two
-
Search for appropriate free list for block of size $m > n$
-
if an appropriate block is found
-
split block and place on appropriate list
-
-
If no block found, try next larger class
-
repeat until block found
-
If no block is yet found – request additional heap memory from os (sbrk)
-
allocate block of $n$ bytes from this new memory
-
place remainder as a single free block in largest size class
-
-
-
Memory perils and pitfalls
-
dereferencing bad pointers
-
reading uninitialized memory
-
overwriting memory
-
referencing non existent variables
-
freeing multiple times
-
referencing freed blocks
-
failing to free blocks
-
be careful of various operators and their precedence – classic pitfalls are sketched below
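-
Illustrative (deliberately buggy) examples of pitfalls from this list; the offending lines are left commented out:

    #include <stdlib.h>
    #include <string.h>

    void pitfalls(void) {
        char *s = malloc(16);
        if (s == NULL) return;
        strcpy(s, "hello");
        free(s);
        /* s[0] = 'H';   BUG: referencing a freed block, undefined behavior */
        /* free(s);      BUG: freeing multiple times corrupts the allocator */

        int val = 5, *q = &val;
        /* *q--;         BUG: precedence – decrements the pointer, not the value */
        (*q)--;          /* intended: decrement the value q points to */
    }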
-
Lecture 10: Garbage Collection!
-
Function pointers are used to implement dynamic loading and in some OO languages, v-tables
Garbage Collection
-
OO and Functional Programming languages
-
storage is still exhaustible
-
Let the runtime handle it
-
less debugging
-
no leaks
-
no explicit management
-
-
malloc function, new operator
-
used to allocate needed memory
-
-
mutator – does useful work; changes or mutates the connectivity of the object graph
-
gc – shift responsibility of freeing to systems
-
static allocation – binding at compile time – local vars bound to the same locations at every activation of the procedure
-
heap allocation – allocated and deallocated in any order, outlives method creating it, real-world data structures
-
liveness – an object is live if a reference to it is held in the roots or reachable from them; this can be detected
-
memory is now a directed graph
-
non-reachable is garbage
-
nodes are reachable if there is a path from any root to that node
-
initialization is part of new – which acts like malloc otherwise
-
ref count – the number of objects that refer to the current object
-
harder to determine when memory can be deallocated
-
but less time is spent debugging memory
-
Classical algorithms
-
Reference counting
-
counts number of references to each object
-
incremental
-
Perl, Photoshop, Smalltalk, VB
-
update(right(m), null)
-
recursively decrement all of the things below it, and update anything below again
-
-
management distributed through execution
-
better locality than tracing
-
cells that are short lived can be reused to reduce the amount of heap usage
-
but fragile, and doesn't deal with cyclic structures – see the sketch below
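-
A minimal reference-counting sketch (illustrative, not from the slides); note how a cycle would keep its members' counts above zero forever:

    #include <stdlib.h>

    typedef struct obj {
        int refcount;
        struct obj *left, *right;   /* outgoing references in the object graph */
    } obj_t;

    static void rc_inc(obj_t *o) { if (o) o->refcount++; }

    static void rc_dec(obj_t *o) {
        if (o && --o->refcount == 0) {
            rc_dec(o->left);        /* recursively release children */
            rc_dec(o->right);
            free(o);                /* reclaim immediately: incremental GC */
        }
    }

    /* update(field, new): the write barrier that keeps counts correct */
    static void update(obj_t **field, obj_t *new_val) {
        rc_inc(new_val);            /* increment first, in case *field == new_val */
        rc_dec(*field);
        *field = new_val;
    }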
-
-
Mark sweep
-
dead objects not reclaimed immediately
-
begins when a request is made that results in memory exhaustion
-
useful work is suspended (stop-the-world)
-
sweep unused objects back to list
-
global traversal of all live objects starting from root
-
sweep from the bottom up
-
unmarked cells returned to the free pool
-
marking bits are unset in prep for next gc
-
stop/start – the mutator is paused during collection
-
complexity proportional to size of heap
-
more fragmentation/reduced locality
-
requires head-room
-
-
Copying algorithm
-
split into from and to spaces
-
allocation through pointer bumping in from-space
-
live identification is done through tracing
-
live object copied to other side of heap
-
once copied, the roles of the two spaces are flipped
-
fast allocation, simple check for exhaustion
-
no fragmentation
-
poor space utilization
-
more page faults
-
-
Generational gc
-
the de facto standard
-
uses multiple generations
-
-
Lecture 11: Process Management
-
difference between programs and processes
-
program is a lifeless object, sits on disk, just instructions
-
a process is a running program
-
OS takes the instructions and gets them running
-
transforms the program into something useful
-
-
issues with programs running on different OSes – because of memory layout differences, ABIs, etc
-
creation
-
program loaded from disk
-
stored to memory (code and data), stack and heap
-
i/o – create file descriptors
-
execute, starting at entry point (main)
-
-
System to be initialized
-
user requests a new process
-
initiation of batch jobs
-
be very careful of fork – a minimal sketch follows
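-
A minimal fork sketch: after fork, both processes run the code that follows it, which is why careless forks (e.g., in loops) multiply processes:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid < 0) {                     /* fork failed */
            perror("fork");
            return 1;
        } else if (pid == 0) {             /* child: fork returned 0 */
            printf("child %d\n", (int)getpid());
            return 0;
        }
        waitpid(pid, NULL, 0);             /* parent: reap the child */
        printf("parent %d reaped %d\n", (int)getpid(), (int)pid);
        return 0;
    }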
-
termination conditions
-
normal exit
-
error exit
-
fatal error
-
killed by another process
-
-
Parents create children, children can create children
-
forming a hierarchy, called a process group (except on Windows)
-
processes may be either running or not running (suspension is used to wait for other things)
-
has to use a queue – kinda
Lecture 12: Process Management, continued
-
the two process model
-
running or not running
-
running requires availability of cpu resources
-
many state models
-
two state model
-
running, and not running, looped together
-
not running is entry point
-
exit is exit point from running
-
-
not-running-queue
-
uses a queue for not yet running
-
-
three state model
-
ready, blocked, running
-
running to blocked, or to ready
-
ready to running
-
blocked to ready
-
lowest layer of a process-structured os
-
handles interrupts and scheduling
-
above the scheduler are sequential processes
-
can preempt, switch, etc
-
-
scheduler preemption
-
save program counter
-
register values
-
memory, etc
-
-
five state
-
running, ready, new, blocked, exit
-
new admits to ready
-
ready dispatches to running
-
running times out back to ready
-
an event wait moves running to blocked
-
an event occurrence moves blocked to ready
-
running releases to exit
-
two/multiple queues
-
blocked queue
-
ready queue
-
a queue for each event waiting type
-
-
-
process suspension
-
processor is faster than i/o
-
swap processes to disk to free memory
-
blocked state becomes suspend when swapped to disk
-
blocked/suspend, ready/suspend
-
-
process suspension may happen for a variety of reasons
-
i/o tables
-
file tables
-
status tables
-
process structure points to where various things are
-
registers
-
pc
-
program status
-
stack pointer
-
process state
-
priority
-
sched params
-
pid
-
parent
-
process group
-
a variety of others
-
-
shared memory can be used to communicate/coordinate between processes
-
Threading
-
lighter weight than processes
-
but have a similar purpose
-
share the same process resources and execution environment
-
and run only a specific function
-
can still lead to race conditions
-
can cause issues w/ stack space
-
Lecture 13: Process Examples
-
virtual addresses, physical addresses must be mapped for shared memory
-
know how many processes are in play
-
remember that processes have parent/child relationships
Threading
-
miniature processes – part of one, same resources as the original, but new execution context (new stack space for the thread)
-
may/does require dynamically allocating a new stack
-
multi threading helps to utilize as much of the processor as possible
-
threads can be implemented either in userspace or by the kernel
-
requires scheduling, a thread table, etc
-
-
kernel threads are now the norm – makes things much easier to handle
-
good concurrency support – blocking thread doesn't starve siblings
-
scheduling needs system calls
-
not as flexible
-
not as scalable
-
-
hybrid implementations – part kernel threads, part user threads
-
scheduler activations
-
mimic kernel threads, gaining the performance of user threads
-
avoid unnecessary user/kernel transitions
-
kernel assigns virtual processors to each process
-
relies on the kernel to notify of kernel blocking events
-
more virtual processors are assigned when blocking occurs
-
-
MySQL
-
popup threads – new threads on message arrival
-
some before, but most after
-
-
threads must be reentrant
-
may have private globals
-
and static locals
-
Lecture 14: Threads, Continued
-
pthread_create creates a thread and runs a start routine
-
pthread_exit simply terminates the calling thread
-
pthread_join suspends execution of calling thread until the target thread terminates, unless the target thread has already terminated
-
Be careful with how sync is done.
-
And consider how syncing in threads is done
-
Mutexes, flags, semaphores, etc – a minimal pthread sketch follows
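-
A minimal pthread_create / pthread_join sketch:

    #include <pthread.h>
    #include <stdio.h>

    static void *start_routine(void *arg) {
        int id = *(int *)arg;
        printf("thread %d running\n", id);
        return NULL;                       /* same as pthread_exit(NULL) */
    }

    int main(void) {
        pthread_t tid;
        int id = 1;
        if (pthread_create(&tid, NULL, start_routine, &id) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }
        pthread_join(tid, NULL);           /* suspend until the thread terminates */
        return 0;
    }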
Lecture 16: More Concurrency
-
Print Queue – $N = 100$
-
Processes add print jobs to the queue, but can't add more to a full queue
-
Processes remove jobs from the queue and do the printing, but can't consume from an empty queue
-
Some buffers are circular
-
When buffer is full, producers go to sleep
-
When empty, consumers go to sleep
-
If something is added to an empty buffer, wake a consumer; if something is removed from a full buffer, wake a producer
-
Updates may, and likely do, require a mutex
-
Semaphores – a number, 0 or a positive number
-
down – if val > 0, decrement; if val = 0, suspend without completing the down
-
up – increment, and if there are processes sleeping on the semaphore, wake one (or all) up – see the bounded-buffer sketch below
-
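A bounded-buffer sketch using POSIX unnamed semaphores plus a mutex (a minimal sketch with assumed names; initialization shown in the trailing comment):

    #include <pthread.h>
    #include <semaphore.h>

    #define N 100

    int buffer[N];
    int in = 0, out = 0;
    sem_t empty_slots;                /* initialized to N */
    sem_t full_slots;                 /* initialized to 0 */
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void producer_put(int item) {
        sem_wait(&empty_slots);       /* down: sleep if the buffer is full  */
        pthread_mutex_lock(&lock);    /* mutex protects the buffer update   */
        buffer[in] = item;
        in = (in + 1) % N;
        pthread_mutex_unlock(&lock);
        sem_post(&full_slots);        /* up: wake a sleeping consumer       */
    }

    int consumer_get(void) {
        sem_wait(&full_slots);        /* down: sleep if the buffer is empty */
        pthread_mutex_lock(&lock);
        int item = buffer[out];
        out = (out + 1) % N;
        pthread_mutex_unlock(&lock);
        sem_post(&empty_slots);       /* up: wake a sleeping producer       */
        return item;
    }
    /* in main: sem_init(&empty_slots, 0, N); sem_init(&full_slots, 0, 0); */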
-
Monitors
-
Semaphores are used for mutual exclusion
-
barriers and signalling events
-
low level
-
-
Monitors provide mutual exclusion implicitly
Lecture 17: Monitor
-
Semaphores as barriers
-
Monitor is a language construct
-
Only one process can be active in a monitor at any instant
-
Use condition variables to block processes
-
wait operations
-
signal operations
-
See slide 11 from class
-
producer-consumer problem: only one process can be active at one time
-
buffer has $n$ slots
-
implementation independent
-
only visible effects matter
-
callers are independent
-
monitor writers can change the implementation as long as the same effects are maintained
-
provides mutual exclusion implicitly
-
condition variables
-
delay a process if a monitor state fails to satisfy boolean condition
-
awaken a process or processes if the condition becomes true
-
-
signaling disciplines
-
signal-and-continue, signal-and-wait
-
Helps to solve race conditions – a condition-variable sketch follows
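-
A monitor-style sketch using a pthread mutex plus condition variable (pthreads give signal-and-continue semantics; the shared counter stands in for arbitrary monitor state):

    #include <pthread.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonzero = PTHREAD_COND_INITIALIZER;
    static int count = 0;

    void monitor_increment(void) {
        pthread_mutex_lock(&m);        /* only one thread active "in" the monitor */
        count++;
        pthread_cond_signal(&nonzero); /* awaken a waiter; the signaler continues */
        pthread_mutex_unlock(&m);
    }

    void monitor_decrement(void) {
        pthread_mutex_lock(&m);
        while (count == 0)             /* re-check: condition may no longer hold */
            pthread_cond_wait(&nonzero, &m);  /* releases m while blocked */
        count--;
        pthread_mutex_unlock(&m);
    }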
Lecture 18: Deadlock
-
When there's a shared resource that you only need a part of
-
many different methods of dealing with deadlock
-
happens on mutual exclusion
-
no preemption
-
circular waiting
-
mutual exclusion helps to avoid race conditions
-
hold-and-wait – must request all resources initially
-
no preemption – difficult to prevent in practice
-
circular wait – define linear ordering of resource types
-
Most OSes use the ostrich approach – head in the sand
-
Use the banker's algorithm to help control allocation/release
-
max requirements stated in advance
-
processes must be independent – no synchronization requirements
-
fixed number of resources
-
no process may exit while holding resources
-
detect by
-
marking each process that has a row of all zeroes in the allocation matrix
-
initialize a temp vector $W$ to equal the available vector
-
find an $i$ such that process $i$ is unmarked and the $i$th row of $Q \leq W$
-
if no such $i$ exists, terminate
-
if one exists, mark process $i$, add the $i$th row of $A$ to $W$, and repeat – a sketch follows below
-
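A sketch of this detection loop (Stallings-style: $A$ = allocation matrix, $Q$ = request matrix; sizes are assumed for illustration, and the caller pre-marks processes whose allocation row is all zeros):

    #include <stdbool.h>

    #define NPROC 4
    #define NRES  3

    bool detect_deadlock(int A[NPROC][NRES], int Q[NPROC][NRES],
                         int avail[NRES], bool marked[NPROC]) {
        int W[NRES];
        for (int r = 0; r < NRES; r++)
            W[r] = avail[r];                      /* temp vector = available */

        bool progress = true;
        while (progress) {
            progress = false;
            for (int i = 0; i < NPROC; i++) {
                if (marked[i]) continue;
                bool satisfiable = true;          /* is row Q_i <= W ? */
                for (int r = 0; r < NRES; r++)
                    if (Q[i][r] > W[r]) { satisfiable = false; break; }
                if (satisfiable) {
                    marked[i] = true;             /* process i could finish */
                    for (int r = 0; r < NRES; r++)
                        W[r] += A[i][r];          /* reclaim its allocation */
                    progress = true;
                }
            }
        }
        for (int i = 0; i < NPROC; i++)
            if (!marked[i]) return true;          /* unmarked => deadlocked */
        return false;
    }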
-
Lecture 20: Dealing with Interrupts
-
be careful of how synchronous/asynchronous function calls happen
-
interrupts are handled by function pointers
-
execution context
-
alt_exception_entry.s – handles changing execution context, i.e., interrupts
stores return address, exception handling exit
-
stores current stack pointer, all registers, etc
-
Loads and stores into the stack
-
-
Stack pointer points to the last used slot
-
FP points to the last saved frame
-
In multi-threading, you can simply switch to the next thread on return from an interrupt handler
-
Next up – scheduling algorithms
-
Signal-and-continue – the signaler continues, and the signaled process executes at some later time
Lecture 21: Uniprocessor Scheduling
-
Assign processes to be executed
-
minimize response time
-
maximize throughput and processor efficiency
-
Long-term scheduling – determines which processes are admitted; determines the degree of multiprogramming
-
Medium-term – swapping function, manage degree of multiprogramming and available memory
-
short-term scheduling – dispatcher, executes most frequently, when events occur (clock, i/o interrupts, OS calls, signals)
-
user-oriented by response time, which should be minimized
-
throughput for the system
-
turnaround time – interval between submission and completion – batch processing
-
response time – submission and initial response
-
deadline – time for a process to complete; maximize the percentage of deadlines met
-
predictability – job should take about the same time to run and incur the same cost on a system regardless of load
-
-
Multiple queues based on what is being waited on
-
Schedulers often use priorities – low priority may suffer starvation
-
First Come First served – first process
-
Consider arrival time and time required
-
Visualize it like a Gantt chart
-
Pick oldest in queue to go first
-
A short process may have to wait a long time before it can execute
-
-
Round Robin – Preemption based on a clock
-
a time quantum is determined; each process may use the processor for at most that amount of time
-
Time slicing
-
-
Shortest process next
-
shortest remaining time
-
highest response ratio next – choose the process maximizing $(w + s)/s$, where $w$ is time spent waiting and $s$ is service time
-
Multilevel feedback queue – penalizes jobs that have been running longer; useful when the remaining time a process needs to execute is unknown
Lecture 22: Scheduling
-
Use the context library to implement a thread scheduler – a ucontext sketch follows below
-
Context should be pretty helpful
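-
A minimal sketch assuming the "context library" is POSIX <ucontext.h>: one worker context plus main, switched with swapcontext; a scheduler would keep an array of these and pick the next one to resume:

    #include <stdio.h>
    #include <stdlib.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, worker_ctx;

    static void worker(void) {
        printf("worker: running, yielding back\n");
        swapcontext(&worker_ctx, &main_ctx);   /* save self, resume main */
        printf("worker: resumed, finishing\n");
    }                                          /* uc_link resumes main at return */

    int main(void) {
        char *stack = malloc(64 * 1024);       /* each context needs its own stack */
        getcontext(&worker_ctx);
        worker_ctx.uc_stack.ss_sp = stack;
        worker_ctx.uc_stack.ss_size = 64 * 1024;
        worker_ctx.uc_link = &main_ctx;        /* where to go when worker returns */
        makecontext(&worker_ctx, worker, 0);

        swapcontext(&main_ctx, &worker_ctx);   /* run worker until it yields */
        printf("main: worker yielded, resuming it\n");
        swapcontext(&main_ctx, &worker_ctx);   /* resume worker to completion */
        free(stack);
        return 0;
    }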