Notes

Notes, etc. of Samuel Flint

OS Kernels

Lecture 1

  • Will require C!

  • Will have a complicated C assignment

  • Office Hours in 105 Schorr, Tuesday 10:30-12:30

  • more low-level

  • HW (35%)

    • 2 programming assignments

    • 2 problem solving

  • Project (15%)

    • Pair programming

    • Trio programming (requires additional components)

  • 3 midterms (40% total)

  • Quizzes (10%)

  • Tips

    • Get to know classmates

    • participate!!

    • Prior to due dates, class time will be spent answering questions

  • Goals

    • knowledge of internals

    • OS roles

    • interaction with hardware

    • systems programming

    • memory management

    • concurrency

    • practical OS experience

  • Know what a process is

  • And how multiple processes work

  • thinking about shared resources

  • And things like race conditions

  • Thread-safety is important – all code must be thread safe!

Lecture 2: Review – Cache Memories

  • OS Objectives

    • Provides convenience

      • simplifies I/O

      • simplifies user interfaces

      • simplifies portability

    • manages resources

      • memory

      • processes

      • I/O devices

      • communications media

    • ease of evolution

      • support new hardware components

      • provides new services

      • supports maintenance

  • Major portion on resource management

  • review of memory hierarchy and cache

Memory Hierarchy and Cache

  • Memory Hierarchy

    1. Registers

    2. L1 Cache (SRAM)

    3. L2 Cache (SRAM)

    4. Main Memory (DRAM)

    5. Local storage

    6. remote storage

  • Lower levels (smaller numbers) are smaller in capacity and faster, but more expensive per byte

  • rule of locality – bring things closer as you use them

  • The Cache

    • A small, fast storage device – a staging area for data in a larger, slower device

    • for each level $k$, the faster, smaller device at level $k$ serves as a cache for the larger, slower device at level $k+1$

    • Hierarchies work because of locality – programs tend to access data at level $k$ more often than at level $k+1$ – thus storage at level $k+1$ can be slower, and therefore larger and cheaper per bit

    • The memory hierarchy creates a large pool of storage that costs as little as the cheap storage near the bottom, while serving data to programs at the rate of the fast storage near the top

    • SRAM – managed automatically in hardware – holds frequently accessed blocks of main memory

    • CPU looks for data in caches first, then in the main memory

    • data is copied in transfer blocks

    • Requires various ways of managing which blocks are in the cache and which aren't

  • Cache concepts

    • Cache hits – When the cache has the block

    • Cache misses – when the cache doesn't have the block

      • Get from main memory

      • store in cache – using placement (determines where a block goes) and replacement (determines which block gets evicted) policies

      • Types

        • Cold start / Compulsory – cache is empty

        • conflict miss – placement policies limit blocks from level $k+1$ to a small subset of the positions at level $k$

          • occurs when the level $k$ cache is large enough, but multiple objects map to the same level $k$ block

        • capacity miss – occurs when the set of active blocks is larger than the cache

    • Faults – when the needed data isn't even in main memory

Cache Organization

  • $S$ – number of sets

  • $E$ – ways/lines per set

  • $B$ – block size in bytes

  • Block has $v$ (valid bit), tag, $B$ bytes

  • $C = S \times E \times B$ is the cache size

  • An address divides into tag, set index, and block offset (see the sketch at the end of this section)

  • The cache operates on physical addresses

  • The CPU generally issues virtual addresses, not physical addresses

  • To read:

    • $B = 8$

    • $S = 4$

    • $E = 1$

    • Locate Set

    • Check for a matching tag with the valid bit set

    • Locate data starting at offset

  • Direct mapped cache, $E = 1$

  • when no match, evict line and replace

  • $E$-way set associative cache – $E$ lines per set (e.g., two) – gives more flexible placement, but requires comparing the tag in every line of the set – handling a miss is a bit harder, and it requires a slightly different replacement protocol
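
  • A minimal C sketch of splitting an address for the $B = 8$, $S = 4$, $E = 1$ read example above (the constants and helper name are illustrative, not from the slides):

    #include <stdint.h>
    #include <stdio.h>

    #define B_BITS 3   /* log2(8 bytes/block) -> block offset bits */
    #define S_BITS 2   /* log2(4 sets)        -> set index bits    */

    static void decompose(uint32_t addr)
    {
        uint32_t offset = addr & ((1u << B_BITS) - 1);
        uint32_t set    = (addr >> B_BITS) & ((1u << S_BITS) - 1);
        uint32_t tag    = addr >> (B_BITS + S_BITS);
        printf("addr=0x%x tag=0x%x set=%u offset=%u\n",
               addr, tag, set, offset);
    }

    int main(void)
    {
        decompose(0x2A);   /* 0b101010 -> tag 1, set 1, offset 2 */
        return 0;
    }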

Lecture 3

Cache – Continued

  • understand matching and replacement

  • replacement policies – random, least recently used, etc.

  • writes – must eventually reach the original memory

  • write-through writes to memory immediately

  • write-back defers writing until the line is replaced – needs a dirty bit

  • write misses

    • write-allocate – load into cache, update line

    • no-write-allocate – immediately write to memory

  • miss rate, hit time, miss penalty are the performance metrics

  • write code so that the common case goes fast

  • minimize misses in the inner loops

    • repeated references to variables are good

    • stride-1 reference patterns are good

Problems

  • physical addresses only

  • memory is byte-addressable

  • accesses are to 1-byte words

  • physical addresses are 12 bits wide

  • L1 is physically addressed and direct-mapped, with a 4-byte line and 16 total sets
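
  • Working out the field widths from these parameters: block offset $= \log_2 4 = 2$ bits, set index $= \log_2 16 = 4$ bits, so the tag is the remaining $12 - 4 - 2 = 6$ bits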

Lecture 4: Virtual Memory

  • lets a computer think it has more memory than it actually does

  • also underpins more basic mechanisms like protection and sharing

  • physical addressing – used in small systems – embedded microcontrollers for things like cars, elevators, etc.

    • can be an issue – a set amount of memory is provided, no more, no less

    • 512 MB means 512 MB

    • programs are machine dependent

    • running multiple programs becomes very difficult

  • virtual memory uses an MMU to translate virtual addresses to physical addresses

    • means programs don't depend on how much physical memory the machine has

  • linear address space – ordered set of contiguous, non-negative integer addresses

  • virtual address space – set of $N = 2^n$ virtual addresses

  • physical address space – set of $M = 2^m$ physical addresses

  • difference between data and attributes

  • an object can have multiple addresses

  • every byte in main memory has one PA and one or more VAs

  • virtual memory uses main memory efficiently

  • simplifies memory management

  • isolates address space – one process can't interfere with another's, and a user program cannot access privileged kernel information

Virtual memory as a cache

  • VM is an array of $N$ contiguous bytes stored on disk

  • contents of the array are cached in physical memory

  • blocks are called pages (size $P = 2^p$ bytes)

  • pages are of fixed, but configurable size

  • Given an address space of $2^n$ bytes and a page size of $2^p$ bytes, you have $2^{n-p}$ pages

  • keeping track of the mapping requires a data structure maintained by software, not pure hardware: the page table

  • write-back, not write-through

  • Page tables

    • an array of page table entries (PTEs) that map virtual pages to physical pages

    • per-process kernel data structure, in DRAM

    • Has valid bit, keeps physical page number or disk address

    • a cleared valid bit signals that the page is on disk or unallocated

    • works because of locality

    • the set of active pages is called the working set

    • if the working set fits in main memory, one process gets good performance after the initial misses

    • if the sum of the working sets exceeds main memory, the system thrashes – lots of swapping

Virtual memory for memory management

  • each process has its own virtual address space – own page table

  • each process views memory as a simple linear array

  • the mapping function scatters its pages through physical memory

  • sharing maps different processes' virtual pages to the same physical page

  • and can allow a single library to be used for multiple programs

  • can make linking and loading easier

  • kernel vm is shared

  • linking maps programs into the common structure

  • loading allocates and copies things as needed

Lecture 5: Virtual Memory

  • homework is a tool

  • quiz based

  • midterm will be electronic?

  • first midterm and programming assignment coming in a week to a week and a half

Virtual memory for memory management

  • bring things up from disk in chunks – pages!

  • everything has its own virtual address space

  • mapping scatters addresses through physical memory

  • well chosen mappings simplify allocation and management

    • includes protecting sections from execution

Memory protection

  • page table entries carry protection bits – supervisor, read, write – plus the block address; violations cause a page fault

Address Translation

  • $N = 2^n$ number of addresses in virtual address space

  • $M = 2^m$ number of addresses in physical memory

  • $P = 2^p$ page size

  • TLBI – tlb index

  • TLBT – tlb tag

  • VPO – virtual page offset

  • VPN – virtual page number

  • PPO – Physical page offset

  • PPN – Physical page number

  • CO – byte offset

  • CI – cache index

  • CT – cache tag

  • given the page table base address, index by the page number, then apply the offset

  • the MMU translates using the page table entry for that page (see the sketch at the end of this lecture's notes)

  • VM and cache together make things faster/more able to handle a lot of memory

  • TLB – another cache, in the MMU; maps VPNs to PPNs; contains complete page table entries for a small number of pages

  • multilevel page tables – a flat page table for 48-bit addresses with 8-byte PTEs and 4 KB pages would need 512 GB

    • paging the page table!

    • page tables that list page tables
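
  • A minimal C sketch of the single-level translation described above (the flat table, types, and names are illustrative assumptions; a real MMU does this in hardware):

    #include <stdint.h>

    #define PAGE_BITS 12                 /* p: 4 KB pages, so P = 2^p */

    /* Hypothetical PTE: a valid bit plus the physical page number. */
    typedef struct { int valid; uint64_t ppn; } pte_t;

    /* Translate va using a flat page table.  Returns 0 and fills *pa
     * on success; returns -1 to signal a page fault. */
    static int translate(const pte_t *table, uint64_t va, uint64_t *pa)
    {
        uint64_t vpn = va >> PAGE_BITS;                /* VPN */
        uint64_t vpo = va & ((1ull << PAGE_BITS) - 1); /* VPO */
        if (!table[vpn].valid)
            return -1;                                 /* page fault */
        *pa = (table[vpn].ppn << PAGE_BITS) | vpo;     /* PPO == VPO */
        return 0;
    }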

Lecture 6: Virtual Memory

Page faults

  • small page size – less internal fragmentation

    • more pages required

    • larger page tables

    • for large programs, the page tables themselves may be in virtual memory

    • rotational characteristics of disks favor a larger page size

  • 4k is the most common size

  • internal fragmentation

    • 10 seats, 10 students - no fragmentation

    • 20 seats, 10 students - fragmentation of 10

    • 40 seats, 10 students - fragmentation of 30

  • small allocation with small pages – high page-fault rate, but fast to fill

  • large allocation with large pages – low page-fault rate, but slower to fill

  • many resident pages and a large page size give a low page-fault rate

  • contemporary techniques decrease locality of references within a process

  • Fetch policy – when pages should be brought into memory

    • demand paging – bring pages in only when they are needed

      • many page faults when a process is first started

    • prepaging – load what is needed, and then a bit more

      • exploits characteristics of secondary memory devices

      • more efficient to bring in a number of pages at one time

      • ineffective if extra pages aren't referenced

  • placement policy

    • determines where process pieces are placed in memory

  • replacement policy

    • deals with selection of a page in main memory to be replaced

    • objective is to remove the page least likely to be referenced in the near future

    • more elaborate policies require greater overhead

  • frame locking

    • a frame that must stay in memory at all times

    • the kernel, and key control structures

    • i/o buffers and time-critical areas may be locked

    • locking is achieved with a lock bit associated with the frame

  • algorithms

    • optimal (Belady's policy)

      • selects the page whose next reference is farthest in the future; in the example, produces three page faults after the frame allocation has been filled

    • FIFO

      • treats the page frames as a circular buffer

      • first thing in gets pushed out

    • least recently used

      • based on locality

      • least likely to be used is the least recently used

      • difficult to implement

    • clock policy (see the sketch at the end of this lecture's notes)

      • requires a use bit

        • set when the page is loaded

        • set again on each reference

        • a page is replaced only if its use bit is 0

        • the sweeping hand clears use bits as it passes

  • Desirable to maintain as many processes in memory as possible

  • frees programmers from size restrictions

  • virtual memory

    • references are logical, translated

    • breaks processes into pieces

    • two approaches are paging and segmentation

    • management scheme requires both hardware and software
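
  • A minimal C sketch of the clock policy above (the frame table and names are illustrative assumptions):

    #include <stddef.h>

    #define NFRAMES 64

    struct frame { int page; int use; };   /* use = the use/reference bit */
    static struct frame frames[NFRAMES];
    static size_t hand;                    /* the clock hand */

    /* Sweep: clear use bits that are 1 and advance; evict the first
     * frame whose use bit is already 0. */
    static size_t clock_victim(void)
    {
        for (;;) {
            if (frames[hand].use == 0) {
                size_t victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            frames[hand].use = 0;          /* give a second chance */
            hand = (hand + 1) % NFRAMES;
        }
    }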

Lecture 7: Virtual Memory

  • Consider a paged VM system with 32-bit virtual addresses and 2K-byte pages. Each page table entry requires 32 bits. Each page table is exactly one page (2K bytes) in size.

    • how many levels of page tables

    • how many entries does the table use at each level (at one level, the table uses a smaller number of entries)

    • the smaller number of entries can be used at the top or bottom. Which strategy uses the least number of pages? Explain by calculating the number of pages for both (a completion is sketched after this list)

    • $2^{32}/2^{11} = 2^{21}$ virtual pages, so the VPN is 21 bits

    • $2^{11}/2^{2} = 2^9 = 512$ entries per table, so 3 levels ($9 + 9 + 3 = 21$)

    • entries used per level: 512, 512, 8
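
    • One way to complete the comparison (counting page-table pages to map the full address space; each table occupies a full page either way): with the 3-bit level on top, $1 + 8 + 8 \times 512 = 4105$ pages; with it at the bottom, $1 + 512 + 512 \times 512 = 262657$ pages – so putting the smaller table at the top uses far fewer pages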

Dynamic Memory allocation

  • basic concepts

    • malloc etc. acquire VM at run time

    • they manage an area of process VM known as the heap

    • the allocator maintains the heap as a collection of variable-sized blocks, either allocated or free

    • types

      • explicit – malloc and free in C

      • implicit – application allocates, doesn't free, GC in Java, ML, Lisp

  • void *malloc(size_t size) – returns a void pointer to a block of at least size bytes, usually aligned to an 8-byte boundary; on failure returns NULL and sets errno (usage sketched at the end of this lecture's notes)

  • void free(void *p) – returns the block pointed at by p to the pool of available memory – p must come from a previous call to malloc

  • sbrk is the internal function used to grow the heap

  • memory is word addressed

  • applications – can issue arbitrary sequences of malloc and free

  • free must be called on a malloc'd block

  • the allocator can't control the number or size of allocated blocks

  • must respond immediately to malloc

  • must allocate from free memory

  • must align blocks to satisfy all alignment requirements (8 byte for GNU Malloc)

  • can manipulate and modify only free memory

  • can't move the allocated blocks once malloc'd

  • given a sequence of malloc and free requests,

  • maximize throughput and peak memory utilization

  • throughput is number of completed requests per unit time

  • payload – malloc(p) results in a block with a payload of p bytes

  • aggregate payload is the sum of the currently allocated payloads

  • current heap size is non-decreasing

  • peak utilization after $k$ requests is the maximum aggregate payload so far divided by the current heap size

  • internal fragmentation is when the payload is smaller than the block size

  • enough aggregate memory, but not enough contiguous memory, is external fragmentation

  • compacting the heap can be done, but requires a tremendous amount of work – a linear search of the heap and pointer introspection

  • size is embedded in a block

  • keep track of free blocks – some structure in the heap

  • picking blocks for allocation

  • reinserted freed blocks?

  • standard method – keep the length of the block in the word preceding the block (the header field, or header); requires an extra word for every allocated block

  • implicit list using length (links all blocks)

  • explicit list (lists free blocks)

  • segregated free list

  • different free lists for different size classes
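
  • A minimal usage sketch of the explicit interface described above:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int *a = malloc(100 * sizeof *a);  /* block of at least 400 bytes */
        if (a == NULL) {                   /* on failure: NULL, errno set */
            perror("malloc");
            return 1;
        }
        for (int i = 0; i < 100; i++)
            a[i] = i;
        free(a);                           /* must come from malloc */
        a = NULL;                          /* avoid a dangling pointer */
        return 0;
    }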

Lecture 8: Malloc

  • how much memory to free – the word before the block gives the size

  • implicit list - use length/links of all blocks – implemented in K&R

  • segregated free list – free list for different size classes

  • blocks sorted by size – use an RB tree with pointers to each block, length used as key

  • The implicit list

    • size and allocation status – could be stored in two words, but this is wasteful

    • use one word for the block size; the low bit is the status, free because block sizes are multiples of 8 (see the sketch at the end of this lecture's notes)

    • status word, payload, padding fulfill alignment

    • alignment with 8

    • while not past end, and already allocated, and too small, go to the next block

    • next fit – like first fit, but starts where the previous search finished

    • best fit – looks for the smallest block of adequate size

    • allocation in a free block – split a block

    • freeing – simply clear allocated flag

      • or coalesce – check whether the next/previous block is free

        • boundary tags – size tag/status flag at both ends let the list be traversed backwards; requires extra space, but is an important, general technique (Knuth, 1973)

        • boundary tags add extra internal fragmentation

    • Placement – first fit, next fit, best fit

    • splitting – how much internal fragmentation to tolerate

    • coalescing – immediate or deferred

    • often used in special purpose apps, easy to implement

  • The explicit free list

    • an allocated block is as before; a free block has size and allocation status, with next and previous pointers stored in the first two words of the free payload

    • allocation requires updating the previous and next pointers of several blocks

    • allocation is faster when memory is nearly full, since only free blocks are searched

    • more complex free/allocate because blocks need to be spliced in/out of the list

    • extra space required by links

    • commonly used with segregated free lists
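
  • A minimal C sketch of the implicit-list header trick above (macro names follow the CS:APP conventions but are assumptions here):

    #include <stddef.h>

    /* Block sizes are multiples of 8, so the low 3 bits of a size are
     * zero and the lowest bit can carry the allocated flag. */
    #define PACK(size, alloc)  ((size) | (alloc))
    #define GET_SIZE(hdr)      ((hdr) & ~(unsigned long)0x7)
    #define GET_ALLOC(hdr)     ((hdr) & 0x1ul)

    /* First fit: walk header to header, stepping by each block's size,
     * until a free block of at least asize bytes appears.  Assumes a
     * well-formed list of nonzero-size blocks ending before `end`. */
    static unsigned long *find_fit(unsigned long *heap, unsigned long *end,
                                   size_t asize)
    {
        for (unsigned long *p = heap; p < end;
             p += GET_SIZE(*p) / sizeof *p)
            if (!GET_ALLOC(*p) && GET_SIZE(*p) >= asize)
                return p;
        return NULL;  /* no fit found */
    }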

Lecture 9: Malloc Continued

  • programming assignment coming

  • Explicit List

  • splice out predecessor and successor blocks, insert new block at root of list, coalesce if needed

  • explicit – only include free blocks

  • linked lists are mostly used in segregated free lists

  • segregated free list

    • blocks sorted by size – an RB tree or similar, with length used as a key

    • size classes of blocks have own free list

    • for large sizes – one class for each power-of-two size

    • To allocate $n$ bytes, search the appropriate free list for a block of size $m \geq n$ (size classes are sketched at the end of this lecture's notes)

      • if an appropriate block is found

        • split block and place on appropriate list

      • If no block found, try next larger class

      • repeat until block found

      • If no block is yet found – request additional heap memory from the OS (sbrk)

      • allocate block of $n$ bytes from this new memory

      • place remainder as a single free block in largest size class

  • Memory perils and pitfalls

    • dereferencing bad pointers

    • reading uninitialized memory

    • overwriting memory

    • referencing non existent variables

    • freeing multiple times

    • referencing freed blocks

    • failing to free blocks

    • be careful of various operators and their precedence
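
  • A minimal sketch of mapping a request size to a power-of-two size class for the segregated-list search above (the class layout is an assumption):

    #include <stddef.h>

    #define NCLASSES 16

    /* Class 0 holds blocks up to 16 bytes; each later class doubles the
     * limit; the final class catches everything larger. */
    static int size_class(size_t n)
    {
        int i = 0;
        size_t limit = 16;
        while (i < NCLASSES - 1 && n > limit) {
            limit <<= 1;
            i++;
        }
        return i;
    }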

Lecture 10: Garbage Collection!

  • Function pointers are used to implement dynamic loading and, in some OO languages, v-tables (sketched below)
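
  • A minimal C sketch of the v-table idea (the struct layout and names are illustrative):

    #include <stdio.h>

    /* An "object" whose first field points to a table of function
     * pointers; dynamic dispatch is just an indirect call. */
    struct shape_vtbl { double (*area)(const void *self); };
    struct square { const struct shape_vtbl *vtbl; double side; };

    static double square_area(const void *self)
    {
        const struct square *s = self;
        return s->side * s->side;
    }

    static const struct shape_vtbl square_vtbl = { square_area };

    int main(void)
    {
        struct square sq = { &square_vtbl, 3.0 };
        printf("%f\n", sq.vtbl->area(&sq));  /* dispatch via the table */
        return 0;
    }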

Garbage Collection

  • OO and Functional Programming languages

  • storage is still exhaustible

  • Let the runtime handle it

    • less debugging

    • no leaks

    • no explicit management

  • malloc function, new operator

    • used to allocate needed memory

  • mutator – does useful work; changes or mutates the connectivity of the object graph

  • GC – shifts the responsibility for freeing to the system

  • static allocation – binding at compile time – local vars bound to the same locations at every activation of the procedure

  • heap allocation – allocated and deallocated in any order, outlives method creating it, real-world data structures

  • liveness – objects held in (reachable from) the roots are live; this can be detected

  • memory is now a directed graph

  • non-reachable is garbage

  • nodes are reachable if there is a path from any root to that node

  • initialization is part of new – which acts like malloc otherwise

  • ref count – the number of objects that refer to the current object

  • harder to determine deallocatability

  • but less time is spent debugging memory

  • Classical algorithms

    • Reference counting

      • counts number of references to each object

      • incremental

      • Perl, Photoshop, Smalltalk, VB

      • update(right(m), null) (see the sketch at the end of this lecture's notes)

        • recursively decrement the counts of everything below it, and update anything below that again

      • management distributed through execution

      • better locality than tracing

      • cells that are short lived can be reused to reduce the amount of heap usage

      • but fragile, and doesn't deal with cyclic structures

    • Mark sweep

      • dead objects not reclaimed immediately

      • begins when a request is made that results in memory exhaustion

      • useful work is suspended during collection

      • sweep unused objects back to list

      • global traversal of all live objects starting from root

      • sweep from the bottom up

      • unmarked cells returned to the free pool

      • marking bits are unset in prep for next gc

      • Start/stop

      • complexity proportional to size of heap

      • more fragmentation/reduced locality

      • requires head-room

    • Copying algorithm

      • split into from and to spaces

      • allocation through pointer bumping in from-space

      • live identification is done through tracing

      • live object copied to other side of heap

      • once copying completes, the tags for the spaces are flipped

      • fast allocation, simple check for exhaustion

      • no fragmentation

      • poor space utilization

      • more page faults

    • Generational gc

      • the de facto standard

      • combines multiple kinds of collectors across generations
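
  • A minimal C sketch of the reference-counting update step mentioned above (object layout and names are illustrative; note that cycles still leak, as the notes say):

    #include <stdlib.h>

    struct obj {
        int rc;                    /* references to this object */
        struct obj *left, *right;  /* outgoing references */
    };

    static void obj_release(struct obj *o)
    {
        if (o == NULL || --o->rc > 0)
            return;
        /* Count hit zero: recursively drop the references it held. */
        obj_release(o->left);
        obj_release(o->right);
        free(o);
    }

    /* Overwrite a reference, fixing both counts;
     * update(&m->right, NULL) matches the slide's update(right(m), null). */
    static void update(struct obj **slot, struct obj *newref)
    {
        if (newref != NULL)
            newref->rc++;
        obj_release(*slot);
        *slot = newref;
    }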

Lecture 11: Process Management

  • difference between programs and processes

  • program is a lifeless object, sits on disk, just instructions

  • a process is a running program

    • OS takes the instructions and gets them running

    • transforms the program into something useful

  • issues with programs running on different OSes – because of memory differences, ABIs, etc.

  • creation

    • program loaded from disk

    • stored to memory (code and data), stack and heap

    • i/o – create file descriptors

    • execute, starting at entry point (main)

  • system initialization

  • user requests a new process

  • initiation of batch jobs

  • be very careful of fork (see the sketch at the end of this lecture's notes)

  • termination conditions

    1. normal exit

    2. error exit

    3. fatal error

    4. killed by another process

  • Parents create children, children can create children

  • forming a hierarchy, called a process group (except on Windows)

  • processes may be either running or not running (suspension is used to wait for other things)

  • has to use a queue – kinda
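
  • Since fork deserves care (noted above), a minimal sketch of the usual pattern:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();        /* parent and child both return here */
        if (pid < 0) {
            perror("fork");
            return 1;              /* error exit */
        }
        if (pid == 0) {
            printf("child: pid=%d\n", (int)getpid());
            return 0;              /* normal exit */
        }
        waitpid(pid, NULL, 0);     /* parent reaps its child */
        printf("parent: child %d done\n", (int)pid);
        return 0;
    }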

Lecture 12: Process Management, continued

  • the two process model

  • running or not running

  • running requires availability of cpu resources

  • many state models

  • two state model

    • running, and not running, looped together

    • not running is entry point

    • exit is exit point from running

  • not-running-queue

    • uses a queue for not yet running

  • three state model

    • ready, blocked, running

    • running to blocked, or to ready

    • ready to running

    • blocked to ready

    • lowest layer of a process-structured os

    • handles interrupts and scheduling

    • above the scheduler are sequential processes

    • can preempt, switch, etc

  • scheduler preemption

    • save program counter

    • register values

    • memory, etc

  • five state

    • running, ready, new, blocked, exit

    • new admits to ready,

    • ready dispatches to running

    • running times out to ready

    • event waits move a process from running to blocked

    • event occurrences move it from blocked to ready

    • running releases to exit

    • two/multiple queues

      • blocked queue

      • ready queue

      • a queue for each event waiting type

  • process suspension

    • processor is faster than i/o

    • swap processes to disk to free memory

    • blocked state becomes suspend when swapped to disk

    • blocked/suspend, ready/suspend

  • process suspension may happen for a variety of reasons

  • i/o tables

  • file tables

  • status tables

  • process structure points to where various things are

    • registers

    • pc

    • program status

    • stack pointer

    • process state

    • priority

    • sched params

    • pid

    • parent

    • process group

    • a variety of others

  • shared memory can be used to communicate/coordinate between processes

  • Threading

    • lighter weight than processes

    • but have a similar purpose

    • keep the same process resources and execution context

    • and run only a specific function

    • can still lead to race conditions

    • can cause issues w/ stack space

Lecture 13: Process Examples

  • virtual addresses, physical addresses must be mapped for shared memory

  • know how many processes are in play

  • remember that processes have parent/child relationships

Threading

  • miniature processes – part of one, same resources as the original, but new execution context (new stack space for the thread)

  • may/does require dynamic allocation for the stack

  • multi threading helps to utilize as much of the processor as possible

  • threads can be implemented either in userspace or by the kernel

    • requires scheduling, a thread table, etc

  • kernel threads are now the norm – makes things much easier to handle

    • good concurrency support – blocking thread doesn't starve siblings

    • scheduling needs system calls

    • not as flexible

    • not as scalable

  • hybrid implementations – part kernel threads, part user threads

  • scheduler activations

    • mimic kernel threads, gaining the performance of user threads

    • avoid unnecessary user/kernel transitions

    • kernel assigns virtual processors to each process

    • relies on the kernel to notify the process of kernel blocking events

    • more virtual processors are assigned when blocking occurs

  • MySQL

    • popup threads – new threads on message arrival

    • some before, but most after

  • threads must be reentrant

  • may have private globals

    • and static locals

Lecture 14: Threads, Continued

  • create a thread and start routine

  • exit simply terminates a thread

  • pthread_join suspends execution of the calling thread until the target thread terminates, unless the target thread has already terminated (see the sketch after this list)

  • Be careful with how synchronization is done, and consider how syncing between threads is done

  • Mutexes, flags, semaphores, etc.
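
  • A minimal pthreads sketch of the create/join pattern above:

    #include <pthread.h>
    #include <stdio.h>

    static void *start(void *arg)   /* the start routine */
    {
        printf("worker got %d\n", *(int *)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        int value = 42;
        if (pthread_create(&tid, NULL, start, &value) != 0)
            return 1;
        pthread_join(tid, NULL);    /* wait for the thread to terminate */
        return 0;
    }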

Lecture 16: More Concurrency

  • Print Queue – $N = 100$

  • Processes add print jobs to the queue, but can't add more to a full queue

  • Processes remove jobs from the queue and do the printing, but can't consume from an empty queue

  • Some buffers are circular

  • When buffer is full, producers go to sleep

  • When empty, consumers go to sleep

  • If an empty buffer gets added to, wake the consumers; if a full buffer gets removed from, wake the producers

  • Updates may, and likely do, require a mutex

  • Semaphores – a number: 0 or a positive value (see the sketch after this list)

    • down – if val > 0, decrement; if val = 0, suspend without completing the down

    • up – increment and if there are processes sleeping on the semaphore, wake one or all up

  • Monitors – motivation

    • Semaphores are used for mutual exclusion

    • and for barriers and signalling events

    • but they are low level

  • Monitors provide mutual exclusion implicitly
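
  • A minimal sketch of the bounded-buffer pattern above with POSIX semaphores and a mutex (buffer manipulation elided; call sem_init(&empty_slots, 0, N) and sem_init(&full_slots, 0, 0) first):

    #include <pthread.h>
    #include <semaphore.h>

    #define N 100                  /* queue capacity, as above */

    static sem_t empty_slots;      /* counts free slots, starts at N  */
    static sem_t full_slots;       /* counts queued jobs, starts at 0 */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void produce(int job)
    {
        sem_wait(&empty_slots);    /* down: sleep while the buffer is full */
        pthread_mutex_lock(&lock);
        (void)job;                 /* ... insert job into circular buffer ... */
        pthread_mutex_unlock(&lock);
        sem_post(&full_slots);     /* up: wake a sleeping consumer */
    }

    static void consume(void)
    {
        sem_wait(&full_slots);     /* down: sleep while the buffer is empty */
        pthread_mutex_lock(&lock);
        /* ... remove a job from the circular buffer ... */
        pthread_mutex_unlock(&lock);
        sem_post(&empty_slots);    /* up: wake a sleeping producer */
    }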

Lecture 17: Monitor

  • Semaphores as barriers

  • Monitor is a language construct

  • Only one process can be active in a monitor at any instant

  • Use condition variables to block processes

  • wait operations

  • signal operations

  • See slide 11 from class

  • producer-consumer problem – only one process can be active at a time

  • buffer has $n$ slots

  • implementation independent

  • only visible effects matter

  • callers are independent

  • monitor writers can change the implementation as long as the same effects are maintained

  • provides mutual exclusion implicitly

  • condition variables (see the sketch after this list)

    • delay a process if the monitor state fails to satisfy a Boolean condition

    • awaken a process or processes when the condition becomes true

  • signaling disciplines

  • signal-and-continue, signal-and-wait

  • Helps to solve race conditions
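
  • Monitors aren't native to C, but a minimal sketch of the wait/signal discipline using a pthread mutex and condition variable (names assumed):

    #include <pthread.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; /* monitor lock */
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
    static int count;              /* monitor state: items available */

    static void deposit(void)
    {
        pthread_mutex_lock(&m);    /* one thread active in the monitor */
        count++;
        pthread_cond_signal(&nonempty);  /* condition now true: wake one */
        pthread_mutex_unlock(&m);
    }

    static void withdraw(void)
    {
        pthread_mutex_lock(&m);
        while (count == 0)         /* re-check: with signal-and-continue,
                                      the state may change before we run */
            pthread_cond_wait(&nonempty, &m);  /* releases m while blocked */
        count--;
        pthread_mutex_unlock(&m);
    }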

Lecture 18: Deadlock

  • When there's a shared resource that you only need a part of

  • many different methods of dealing with deadlock

  • happens on mutual exclusion

  • no preemption

  • circular waiting

  • mutual exclusion helps to avoid race conditions

  • hold-and-wait – must request all resources initially

  • no preemption – unable to do well

  • circular wait – define linear ordering of resource types

  • Most OSes use the ostrich approach – head in the sand

  • Use the banker's algorithm to help control allocation/release

  • max requirements stated in advance

  • processes must be independent – no synchronization requirements

  • fixed number of resources

  • no process may exit while holding resources

  • detect by

    • mark each process that has a row of all zeroes in the allocation matrix $A$

    • initialize a temporary vector $W$ to equal the available vector

    • find an $i$ such that process $i$ is unmarked and the $i$th row of $Q \leq W$

      • if no such $i$ exists, terminate

      • if one exists, mark process $i$ and add the $i$th row of $A$ to $W$
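
  • A minimal C sketch of that detection loop (matrix sizes and names are assumptions):

    #include <stdbool.h>

    #define NPROC 4
    #define NRES  3

    /* A = allocation matrix, Q = request matrix.  Returns true if every
     * process can be marked, i.e., no deadlock. */
    static bool no_deadlock(int A[NPROC][NRES], int Q[NPROC][NRES],
                            int available[NRES])
    {
        int W[NRES];
        bool marked[NPROC];

        for (int i = 0; i < NPROC; i++) {   /* mark all-zero rows of A */
            marked[i] = true;
            for (int r = 0; r < NRES; r++)
                if (A[i][r] != 0) { marked[i] = false; break; }
        }
        for (int r = 0; r < NRES; r++)      /* W := available */
            W[r] = available[r];

        for (bool progress = true; progress; ) {
            progress = false;
            for (int i = 0; i < NPROC; i++) {
                if (marked[i]) continue;
                bool fits = true;           /* is row i of Q <= W? */
                for (int r = 0; r < NRES; r++)
                    if (Q[i][r] > W[r]) { fits = false; break; }
                if (fits) {                 /* mark i; add row i of A to W */
                    marked[i] = true;
                    for (int r = 0; r < NRES; r++)
                        W[r] += A[i][r];
                    progress = true;
                }
            }
        }
        for (int i = 0; i < NPROC; i++)
            if (!marked[i]) return false;   /* unmarked => deadlocked */
        return true;
    }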

Lecture 20: Dealing with Interrupts

  • be careful of how synchronous/asynchronous function calls happen

  • interrupts are handled by function pointers

  • execution context

    • alt_exception_entry.s – handles changing execution context, i.e., interrupts

    • stores return address, exception handling exit

    • stores current stack pointer, all registers, etc

    • Loads and stores into the stack

  • Stack pointer points to the last used slot

  • FP points to the last saved frame

  • In multi-threading, you can just go to the next thread on return from an interrupt handler

  • Next up – scheduling algorithms

  • Signal and continue – the signaler continues, and the signaled process executes at some later time

Lecture 21: Uniprocessor Scheduling

  • Assign processes to be executed

  • minimize response time

  • maximize throughput and processor efficiency

  • Long term scheduling – determines which processes are admitted; determines the degree of multiprogramming

  • Medium-term – swapping function, manage degree of multiprogramming and available memory

  • short-term scheduling – dispatcher, executes most frequently, when events occur (clock, i/o interrupts, OS calls, signals)

    • user-oriented by response time, which should be minimized

    • throughput for the system

    • turnaround time – interval between submission and completion – batch processing

    • response time – submission and initial response

    • deadline – time for a process to complete; maximize the percentage of deadlines met

    • predictability – job should take about the same time to run and incur the same cost on a system regardless of load

  • Multiple queues based on what is being waited on

  • Schedulers often use priorities – low priority may suffer starvation

  • First Come First Served – run the oldest process first

    • Arrival time and time required

    • Look at it like a Gantt chart (see the sketch after this list)

    • Pick oldest in queue to go first

    • A short process may have to wait a long time before it can execute

  • Round Robin – Preemption based on a clock

    • An amount of time is determined, and each process may use the processor for at most that time

    • Time slicing

  • Shortest process next

  • shortest remaining time

  • highest response ratio next – choose based on time waiting plus service time over service time

  • Multilevel feedback queue – penalize jobs that have been running longer; useful when the remaining time a process needs to execute is unknown
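
  • A minimal sketch of the Gantt-chart walk for FCFS (the job data is made up for illustration):

    #include <stdio.h>

    #define N 4

    int main(void)
    {
        /* Hypothetical jobs, sorted by arrival time. */
        int arrival[N] = { 0, 2, 4, 6 };
        int service[N] = { 5, 3, 8, 2 };
        int t = 0;                    /* current time on the Gantt chart */

        for (int i = 0; i < N; i++) {
            if (t < arrival[i])
                t = arrival[i];       /* CPU idles until the job arrives */
            t += service[i];          /* oldest queued job runs to done  */
            printf("job %d: finish=%d turnaround=%d\n",
                   i, t, t - arrival[i]);
        }
        return 0;
    }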

Lecture 22: Scheduling

  • Use context library to implement thread scheduler

  • Context should be pretty helpful