Emery Berger - publications

OSDI 2006: CRAMM: Virtual Memory Support for Garbage-Collected Applications (to appear)
with Ting Yang, Scott F. Kaplan, and J. Eliot B. Moss. (draft version)
A virtual memory manager that, combined with a collector-neutral heap sizing algorithm, ensures that garbage-collected applications run as fast as possible while avoiding paging.
USENIX 2006: Flux: A Language for Programming High-Performance Servers
with Brendan Burns, Kevin Grimaldi, Alex Kostadinov, and Mark Corner.
Flux is a concise programming language for writing servers that scale and are deadlock-free. A Flux programmer takes off-the-shelf, sequential C and C++ code and describes how to compose it; the Flux compiler then generates a deadlock-free, high-concurrency server. Flux also makes it easy to analyze and predict server performance, because the compiler can also generate discrete-event simulators that reflect actual server performance.
You can download Flux and see the talk: PowerPoint, PDF.
USENIX 2006: Transparent Contribution of Memory
with James Cipar and Mark Corner.
Introduces transparent memory management (TMM), which lets you run background jobs that use your disk and virtual memory without impacting your own use of the machine, even after it has been left unattended for an extended period of time. You can download TMM.
PLDI 2006: DieHard: Probabilistic Memory Safety for Unsafe Languages
with Ben Zorn.
DieHard uses randomization and replication to transparently make C and C++ programs tolerate a wide range of errors, including buffer overflows and dangling pointers. Instead of crashing or running amok, DieHard lets programs continue to run correctly in the face of memory errors with high probability. Using DieHard also makes programs highly resistant to heap-based hacker attacks.
You can download DieHard, and see the talk: PowerPoint, PDF.

OOPSLA 2005: Quantifying the Performance of Garbage Collection vs. Explicit Memory Management
with Matthew Hertz.
This paper attempts to answer an age-old question: is garbage collection faster than, slower than, or the same speed as explicit memory management with malloc/free? We introduce oracular memory management, an approach that lets us measure unaltered Java programs as if they used malloc and free. The result: a good GC can match the performance of a good allocator, but it requires 5X more space. If physical memory is tight, however, conventional garbage collectors suffer an order-of-magnitude performance penalty.
PowerPoint, PDF.

MSP 2005: A Locality-Improving Dynamic Memory Allocator
with Yi Feng.
Presents Vam, a memory allocator that improves cache-level and virtual memory locality. Vam is distributed with Heap Layers. Talk (PowerPoint).

PLDI 2005: Garbage Collection without Paging
with Matthew Hertz and Yi Feng.
Introduces bookmarking collection (BC), a GC algorithm that cooperates with the virtual memory manager to eliminate paging. Just before memory is paged out, the collector "bookmarks" the targets of pointers from those pages. Using these bookmarks, BC can perform full garbage collections without loading the evicted pages back from disk. By performing garbage collections entirely in memory, BC can speed up Java programs dramatically (by up to 41X). Download the bookmarking collector and associated Linux patches. Talk (PowerPoint).

OOPSLA 2004: MC2: High-Performance Garbage Collection for Memory-Constrained Environments
with Naren Sachindran and Eliot Moss.
MC2 is an incremental, space-efficient garbage collector that has high throughput and low pause times.

ISMM 2004: Automatic Heap Sizing: Taking Real Memory into Account
with Ting Yang, Matthew Hertz, Scott Kaplan, and Eliot Moss.
A GC-independent approach that cooperates with an enhanced virtual memory manager to dynamically pick the best heap size while a program is running. Talk (PowerPoint). This work is subsumed by our OSDI 2006 paper, above.

OOPSLA 2002: Reconsidering Custom Memory Allocation
with Ben Zorn & Kathryn McKinley
Finds that a good general-purpose allocator outperforms all custom allocators except regions, which in turn have serious problems of their own. Introduces reaps (regions + heaps), which combine the flexibility and space efficiency of heaps with the performance of regions. Talk (PowerPoint).

PLDI 2001: Composing High-Performance Memory Allocators
with Ben Zorn & Kathryn McKinley
Introduces Heap Layers, a flexible infrastructure for building memory allocators that leverages C++ template mixins to achieve high performance. Talk (PowerPoint), Heap Layers source.

ASPLOS-IX: Hoard: A Scalable Memory Allocator for Multithreaded Applications
with Kathryn McKinley, Robert Blumofe, & Paul Wilson
Identifies problems of heap contention, space blowup, and allocator-induced false sharing in previous allocators; introduces Hoard, a fast memory allocator that solves these problems.
Talk (PowerPoint), Hoard home page.

Memory Management for High-Performance Applications
Department of Computer Sciences, The University of Texas at Austin (2002, TR02-52).
Nominated for the ACM best dissertation award.

10th SIAM Conference on Parallel Processing for Scientific Computing:
Customizing Software Libraries for Performance Portability
with Sam Guyer & Calvin Lin

WCBC 99 (The 1999 Workshop on Cluster-Based Computing):
Scalable Load Distribution and Load Balancing for Dynamic Parallel Programs
with James C. Browne

UTCS Technical Report TR-02-04:
Detecting Errors with Whole-Program Configurable Dataflow Analysis
with Sam Guyer & Calvin Lin

IJHPCA 2000 (Int'l Journal of High-Performance Computing Applications, Winter 2000):
Compositional Development of Performance Models in POEMS
with J.C. Browne & A. Dube

International Journal for Numerical Methods in Engineering, Volume 42, 1998:
A Fast Solution Method for Three-Dimensional Many-Particle Problems of Linear Elasticity
with Yuhong Fu, Kenneth Klimkowski, Greg Rodin, J.C. Browne, Jurgen Singer, Robert van de Geijn, and Kumar Vemaganti