Syllabus

Introduction. Class information and policies. What's an OS? Virtual machine, etc. What this class is really about: large-scale software systems (think Google / YouTube). Goals: high-performance, reliability, scale. Overview of class, based around these goals (architectural details that matter, C and C++, memory management, concurrency, disks, networks, file systems, server architectures, distributed file systems and computing). Brief historical overview of evolution of systems.

Architecture. Hardware support for applications and operating systems: the memory hierarchy (registers, caches, TLB). Associativity. Classification of misses (compulsory, capacity, conflict). Locality (temporal and spatial). Quantifying locality with LRU hit curves (drawn from histograms). Superscalar architectures: pipelines, speculation, branch prediction.
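
The locality and hit-curve ideas above can be made concrete with a toy fully associative LRU cache simulator (our illustration, not course-provided code): replaying an access trace against different capacities yields the points of an LRU hit curve.

```cpp
#include <list>
#include <unordered_map>
#include <cstddef>

// Toy fully associative LRU cache simulator. Replaying a block-address
// trace at several capacities gives the hit counts behind an LRU hit curve.
struct LRUSim {
    size_t capacity;
    std::list<long> order;   // most recently used block at the front
    std::unordered_map<long, std::list<long>::iterator> where;
    size_t hits = 0, accesses = 0;

    explicit LRUSim(size_t cap) : capacity(cap) {}

    void access(long block) {
        ++accesses;
        auto it = where.find(block);
        if (it != where.end()) {               // hit: move block to the front
            ++hits;
            order.erase(it->second);
        } else if (order.size() == capacity) { // miss in a full cache: evict LRU
            where.erase(order.back());
            order.pop_back();
        }
        order.push_front(block);
        where[block] = order.begin();
    }
};
```

Sweeping over blocks 0..7 twice shows temporal locality paying off only once the cache holds the whole working set: with 8 blocks of capacity the second pass hits every time; with 4 it never hits (the classic LRU worst case for a cyclic scan).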

Intro to C++. Why C/C++? Low-level, no runtime system, efficient. Basics. Pointers and reference variables. C arrays. The difference between the stack and the heap. Parameter passing, C and C++ style. Structs, classes. Overloading and inheritance. I/O. Templates and the C++ Standard Template Library (STL). Pitfalls of C and C++.
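
A small sketch of two of the points above, with function names of our own choosing: the three parameter-passing styles side by side, and the stack/heap lifetime distinction.

```cpp
#include <cstddef>

// The same increment, three ways.
void inc_by_value(int x)      { x += 1; }   // copy: caller's variable unchanged
void inc_by_pointer(int* x)   { *x += 1; }  // C style: explicit address + deref
void inc_by_reference(int& x) { x += 1; }   // C++ style: alias, no deref syntax

// Stack vs. heap: `local` dies when the function returns; the heap object
// lives until someone calls delete (forgetting is a leak, twice is a
// double free, returning &local would be a dangling pointer).
int* make_counter() {
    int local = 0;          // stack: automatic lifetime
    (void)local;
    return new int(0);      // heap: manual lifetime, caller must delete
}
```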

Dynamic Memory Management. How heap management functions are implemented (malloc and free). Goals of an ideal memory manager: speed, memory efficiency (low fragmentation), scalability, plus security and reliability. The rest of the malloc API (realloc, calloc). External vs. internal fragmentation. Classical algorithms: first-fit, best-fit, worst-fit. Splitting and coalescing. Implementation techniques: freelists, size classes, segregated size classes, BiBOP. Custom allocators: per-class, stack-like, regions. Benefits and drawbacks. Memory management errors: double frees, invalid frees, dangling pointer errors, buffer overflows. DieHard overview.
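
The freelist, first-fit, and splitting ideas can be sketched as a toy allocator over a fixed arena. This is our simplification, not a real malloc: no coalescing on free, header-granularity alignment only, and no thread safety.

```cpp
#include <cstddef>
#include <cstdint>

struct Block { size_t size; bool free; Block* next; };

alignas(16) static uint8_t arena[8192];
static Block* head = nullptr;

void heap_init() {
    head = reinterpret_cast<Block*>(arena);
    head->size = sizeof(arena) - sizeof(Block);
    head->free = true;
    head->next = nullptr;
}

void* toy_malloc(size_t n) {
    n = (n + 7) & ~size_t(7);                    // round up to keep headers aligned
    for (Block* b = head; b; b = b->next) {
        if (!b->free || b->size < n) continue;   // first fit: take the first match
        if (b->size >= n + sizeof(Block) + 8) {  // split off the remainder
            Block* rest = reinterpret_cast<Block*>(
                reinterpret_cast<uint8_t*>(b + 1) + n);
            rest->size = b->size - n - sizeof(Block);
            rest->free = true;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->free = false;
        return b + 1;                            // payload sits after the header
    }
    return nullptr;                              // arena exhausted
}

void toy_free(void* p) {                         // no coalescing: fragmentation grows
    if (p) reinterpret_cast<Block*>(p)[-1].free = true;
}
```

Because freed blocks are only marked free, adjacent free blocks never merge; adding coalescing (and contrasting with segregated size classes or BiBOP) is the natural next step.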

Garbage Collection. Garbage collection versus explicit memory allocation (malloc, free). Why GC is superior for software engineering purposes. Explanation of terms: roots, reachability, liveness, tracing, copying, conservative collection, parallel versus concurrent collection. Relative performance and space impact in theory. Classical GC algorithms: mark-sweep, reference counting, semispace. Pointer bumping. Generational garbage collection. Discussion of empirical evaluation of space-time tradeoffs.
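
The roots/reachability/tracing vocabulary can be illustrated with a minimal mark-sweep collector over a toy object graph (our sketch, not a production collector). It also shows why tracing handles the cycles that naive reference counting leaks.

```cpp
#include <vector>
#include <cstddef>

struct Obj {
    bool marked = false;
    std::vector<Obj*> refs;           // outgoing pointers to other objects
};

struct Heap {
    std::vector<Obj*> objects;        // every allocated object
    std::vector<Obj*> roots;          // the root set (globals, stacks, ...)

    Obj* alloc() { Obj* o = new Obj; objects.push_back(o); return o; }

    void mark(Obj* o) {               // tracing: depth-first from the roots
        if (!o || o->marked) return;
        o->marked = true;
        for (Obj* r : o->refs) mark(r);
    }

    size_t collect() {                // returns the number of objects swept
        for (Obj* r : roots) mark(r);
        size_t freed = 0;
        std::vector<Obj*> live;
        for (Obj* o : objects) {
            if (o->marked) { o->marked = false; live.push_back(o); }
            else { delete o; ++freed; }   // unreachable == dead: sweep it
        }
        objects.swap(live);
        return freed;
    }

    ~Heap() { for (Obj* o : objects) delete o; }
};
```

Two objects referring only to each other have nonzero reference counts forever, yet mark-sweep reclaims them the moment no root reaches the cycle.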

Virtual Memory. The difference between virtual and physical memory: using RAM as a cache for the disk. Working set and thrashing. Pages, page tables, and the TLB. How virtual memory simplifies programming, provides isolation and protection, and permits space optimization. Eviction and pre-paging. Swap space.
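
The page/page-table mechanics reduce to simple arithmetic, sketched here with a hash map standing in for the page table (real x86-64 tables are multi-level trees walked by hardware; the function name is ours).

```cpp
#include <cstdint>
#include <unordered_map>

constexpr uint64_t PAGE_SIZE = 4096;        // 2^12, so the low 12 bits are the offset

// Map a virtual address to a physical one, or report a page fault.
uint64_t translate(const std::unordered_map<uint64_t, uint64_t>& page_table,
                   uint64_t vaddr, bool* fault) {
    uint64_t vpn    = vaddr / PAGE_SIZE;    // virtual page number: the table's key
    uint64_t offset = vaddr % PAGE_SIZE;    // offset passes through unchanged
    auto it = page_table.find(vpn);
    if (it == page_table.end()) {           // no mapping: the OS must map or swap in
        *fault = true;
        return 0;
    }
    *fault = false;
    return it->second * PAGE_SIZE + offset; // physical frame base + same offset
}
```

The TLB is exactly a small hardware cache of these VPN-to-frame entries, which is why translation is usually free and occasionally a table walk.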

Virtual Memory and Paging. A day in the life of a page (malloc, use, eviction, reuse). Eviction policies: OPT, LRU, FIFO, MRU. Locality, stack algorithms, Belady's anomaly. Why and how we approximate LRU: the clock algorithm. Segmented queues. Global LRU versus working set. Overview of CRAMM system; concrete definition of working set size; on-line generation of hit curves.
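
The clock approximation of LRU is compact enough to sketch directly (our toy version, tracking page faults): each frame carries a reference bit set on access, and on a miss the hand sweeps, clearing bits and evicting the first frame whose bit is already clear.

```cpp
#include <vector>
#include <cstddef>

struct Clock {
    struct Frame { long page; bool ref; };
    std::vector<Frame> frames;
    size_t hand = 0, faults = 0;

    explicit Clock(size_t n) : frames(n, Frame{-1, false}) {}

    void access(long page) {
        for (auto& f : frames)
            if (f.page == page) { f.ref = true; return; }  // hit: earn a second chance
        ++faults;                                          // miss: run the hand
        while (frames[hand].ref) {                         // spend second chances
            frames[hand].ref = false;
            hand = (hand + 1) % frames.size();
        }
        frames[hand] = Frame{page, true};                  // evict and replace
        hand = (hand + 1) % frames.size();
    }
};
```

One bit per frame and no per-access reordering is the whole point: true LRU would need a timestamp or list update on every reference.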

Processes and Threads. Definition of processes. fork() and exec(). Effect on the page table: copy-on-write. IPC: signals, pipes, sockets, mmap. Threads and comparison to processes. pthread_create() and pthread_join(). Thread stacks. Thread-specific data. Bakeoff of processes versus threads (flexibility, robustness). Overview of scheduling: context-switch overhead, TLB shootdown, cache warmup.
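
A minimal pthread sketch (our example) of the create/join pair named above: spawn a thread, pass it an argument, and collect its result through pthread_join. Each thread gets its own stack, while the heap and globals are shared; contrast fork(), which lazily copies the whole address space via copy-on-write.

```cpp
#include <pthread.h>

// Thread body: receives its argument and returns its result as void*.
void* square(void* arg) {
    long n = reinterpret_cast<long>(arg);
    return reinterpret_cast<void*>(n * n);
}

long run_square(long n) {
    pthread_t tid;
    pthread_create(&tid, nullptr, square, reinterpret_cast<void*>(n));
    void* result = nullptr;
    pthread_join(tid, &result);        // block until the thread finishes
    return reinterpret_cast<long>(result);
}
```

Smuggling a long through void* works here because the value fits in a pointer; passing a pointer to a heap- or caller-owned struct is the general-purpose pattern.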

Synchronization. Why threads need synchronization: consistency, race conditions, synchronization operations. "Too much milk." Safety and progress. Atomic reads and writes. Busy waiting. Mutexes, critical sections, locks (+ POSIX API). Implementing synchronization operations: disabling interrupts, test and set.
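
The canonical race is the lost update: ++counter is a read-modify-write, not an atomic step, so two unsynchronized threads can each read the same old value. A sketch of the mutex fix with the POSIX API (structure and names are ours):

```cpp
#include <pthread.h>

struct Shared {
    long counter = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
};

void* worker(void* arg) {
    Shared* s = static_cast<Shared*>(arg);
    for (int i = 0; i < 100000; ++i) {
        pthread_mutex_lock(&s->lock);     // enter the critical section
        ++s->counter;                     // read-modify-write, now atomic w.r.t. peers
        pthread_mutex_unlock(&s->lock);   // leave the critical section
    }
    return nullptr;
}

long locked_count() {
    Shared s;
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, worker, &s);
    pthread_create(&t2, nullptr, worker, &s);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    return s.counter;                     // deterministic only because of the lock
}
```

Delete the lock/unlock pair and the result becomes nondeterministic and usually short of 200000, which is the whole argument for critical sections. Under the hood, the lock itself is built from an atomic primitive such as test-and-set.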

Advanced Synchronization. Using synchronization both for safety and coordination (parallel computation and event notification). Serialization. Exercise using threads and locks. Implementing locks. Blocking, spinning, hybrids with exponential backoff. Common locking errors and how to prevent them. Scoped locks, recursive locks. Reader-writer locks. Semaphores and condition variables. Deadlock. Using canonical ordering of locks to prevent deadlock.
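
Condition variables are the coordination (not just mutual-exclusion) tool above. A sketch of the standard pairing, a tiny blocking queue (our example): consumers sleep in pthread_cond_wait, which atomically releases the mutex, until a producer signals that data arrived.

```cpp
#include <pthread.h>
#include <deque>

struct BlockingQueue {
    std::deque<int> items;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    void put(int v) {
        pthread_mutex_lock(&lock);
        items.push_back(v);
        pthread_cond_signal(&nonempty);   // wake one waiting consumer
        pthread_mutex_unlock(&lock);
    }

    int take() {
        pthread_mutex_lock(&lock);
        while (items.empty())             // loop, never `if`: guards against
            pthread_cond_wait(&nonempty, &lock);  // spurious wakeups and races
        int v = items.front();
        items.pop_front();
        pthread_mutex_unlock(&lock);
        return v;
    }
};
```

The while-loop re-check is the classic correctness point: between the signal and the waiter reacquiring the lock, another thread may have emptied the queue.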

Concurrency architectures. Writing a concurrent server. Overview of concurrency patterns with benefits and drawbacks: threads, thread pools, producer-consumer (pipeline parallelism), bag of tasks (contention), work queues, worker threads, work stealing (load balancing).
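
The bag-of-tasks pattern above can be sketched in a few lines (our toy version): a fixed pool of workers pulls task indices from one shared, lock-protected bag until it is empty. The single lock is exactly the contention point the outline flags, and what work queues per worker or work stealing relieve.

```cpp
#include <pthread.h>
#include <vector>

struct Bag {
    long next = 0, limit = 0, sum = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
};

void* drain(void* arg) {
    Bag* bag = static_cast<Bag*>(arg);
    for (;;) {
        pthread_mutex_lock(&bag->lock);
        if (bag->next >= bag->limit) {        // bag empty: worker retires
            pthread_mutex_unlock(&bag->lock);
            return nullptr;
        }
        long task = bag->next++;              // grab one task
        pthread_mutex_unlock(&bag->lock);

        long result = task * task;            // "work" happens outside the lock

        pthread_mutex_lock(&bag->lock);
        bag->sum += result;                   // publish the result
        pthread_mutex_unlock(&bag->lock);
    }
}

long sum_of_squares(long n, int workers) {
    Bag bag;
    bag.limit = n;
    std::vector<pthread_t> pool(workers);
    for (auto& t : pool) pthread_create(&t, nullptr, drain, &bag);
    for (auto& t : pool) pthread_join(t, nullptr);
    return bag.sum;
}
```

The result is deterministic regardless of worker count or interleaving, a useful sanity property when comparing these architectures.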

Servers and Queuing Systems. Modeling servers as queuing networks. Little's Law, latency, throughput, identifying performance bottlenecks. Building servers using the Flux language.
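
Little's Law is just L = lambda x W: the average number of requests in the system equals the arrival rate times the average time each request spends in it. A sketch with made-up numbers (both directions, since rearranging it recovers latency from measured concurrency and throughput):

```cpp
// L = lambda * W: requests in system = arrival rate (req/s) * time in system (s).
double little_L(double lambda, double W) { return lambda * W; }

// W = L / lambda: infer mean latency from concurrency and throughput.
double little_W(double L, double lambda) { return L / lambda; }
```

For example, a server absorbing 100 req/s at 50 ms per request holds 5 requests on average; observing 5 in-flight requests at 100 req/s implies 50 ms latency. Comparing measured against predicted values is a quick way to spot the bottleneck stage in a queuing network.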

Distributed Parallel Programming. Programming large-scale, parallel systems. SPMD model. Basic use of MPI (point-to-point and broadcast).

Networks. Socket programming. TCP/IP versus UDP.
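
A minimal datagram example over the loopback interface (our sketch; error handling omitted). UDP keeps it short because it is connectionless: no listen/accept handshake as TCP requires, just bind on one side and sendto from the other.

```cpp
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Send one UDP datagram to a loopback socket and read it back.
std::string udp_roundtrip(const char* msg) {
    int rx = socket(AF_INET, SOCK_DGRAM, 0);          // receiving socket
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);    // 127.0.0.1
    addr.sin_port = 0;                                // let the OS pick a port
    bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
    socklen_t len = sizeof addr;
    getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len);  // learn the port

    int tx = socket(AF_INET, SOCK_DGRAM, 0);          // sending socket
    sendto(tx, msg, strlen(msg), 0,
           reinterpret_cast<sockaddr*>(&addr), sizeof addr);

    char buf[128];
    ssize_t n = recvfrom(rx, buf, sizeof buf, 0, nullptr, nullptr);
    close(tx);
    close(rx);
    return std::string(buf, n > 0 ? static_cast<size_t>(n) : 0);
}
```

The TCP version replaces the single sendto with socket/connect on the client and socket/bind/listen/accept on the server, and trades UDP's message boundaries and possible loss for a reliable byte stream.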

File Systems. File system implementation, inodes, file metadata. Hard versus soft (symbolic) links. Permissions. Handling temporary files (unlink). Other uses of the file system abstraction, including named pipes (FIFOs) and the /proc file system. Exploiting common FS locality optimizations. Implementation and user impact of distributed file systems. Naming, caching, consistency management. NFS versus Windows (SMB) versus URLs.
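
Inode link counts and the unlink idiom can be demonstrated with a few POSIX calls (paths below are our example names under /tmp): a hard link is just a second directory entry for the same inode, and the data survives until the last link (and last open descriptor) is gone.

```cpp
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

// Read an inode's link count via stat; -1 if the path does not resolve.
long link_count(const char* path) {
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    return static_cast<long>(st.st_nlink);
}

// Create a file, hard-link it, then remove the original name.
void link_demo(const char* a, const char* b) {
    unlink(a);                                   // clean slate; errors ignored
    unlink(b);
    int fd = open(a, O_CREAT | O_WRONLY, 0644);  // create: link count is 1
    close(fd);
    link(a, b);                                  // hard link: count 2, same inode
    unlink(a);                                   // back to 1; data reachable via b
}
```

The temporary-file trick is the same mechanism: open a file, unlink it immediately, and keep writing through the descriptor; the inode is reclaimed automatically when the process closes it or exits.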

Fault Tolerance. Redundancy and coding. RAID levels, checksums, error-correcting codes.

Security. Malware of all varieties. Exploits: heap attacks and stack overflows, API misuse, TOCTTOU (time-of-check-to-time-of-use) races, SQL injection, cross-site scripting. Mitigation and detection via taint analysis (information flow), bounds checks, DieHard, and static analysis tools.
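
The stack-overflow pitfall and its bounds-check mitigation in miniature (function names are ours): strcpy trusts the attacker-controlled input length, so an oversized input writes past the buffer into whatever lies beyond it on the stack (saved registers, the return address), while the checked version rejects it.

```cpp
#include <cstring>
#include <cstddef>

// The vulnerable pattern: no relationship between dst's size and src's length.
void copy_unchecked(char* dst, const char* src) {
    strcpy(dst, src);              // overflows dst if src is too long
}

// The mitigation: validate the length before copying (or truncate).
bool copy_checked(char* dst, size_t dst_size, const char* src) {
    if (strlen(src) >= dst_size)   // leave room for the terminating NUL
        return false;              // reject oversized input
    strcpy(dst, src);
    return true;
}
```

SQL injection and cross-site scripting are the same failure one level up: data crossing a trust boundary interpreted as code. Taint analysis tracks exactly such flows from untrusted inputs to sensitive sinks.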