CMPSCI Theory Seminar

A Stochastic Process on the Hypercube with Applications to Peer-to-Peer Networks.

Micah Adler, UMass Amherst

25 February 2003

4:00 p.m., Room 140 Computer Science Building

A fundamental building block of Peer-to-Peer networks is a distributed hash table (DHT): a hash table where the keys are partitioned across a set of processors. In this talk, we analyze a DHT based on the Content Addressable Network DHT. We demonstrate that in our DHT, the number of queries required to find a key in the table is $\log_2 n+O(1)$, the number of pointers each processor maintains is $\log_2 n+O(1)$, and during a sequence of $n$ processor arrivals, the ratio between the maximum load of a processor and the minimum load of a processor is always $O(1)$ (with high probability). To the best of our knowledge, this is the first analysis of a DHT that demonstrates asymptotically optimal load balance, while still only requiring $O(\log n)$ pointers per processor, and $O(\log n)$ queries per location.

Our analysis reduces to the following stochastic process executed on a hypercube with $n$ vertices. Initially, the nodes of the hypercube are uncovered. In each step, pick a node at random and if it is uncovered, cover it. Otherwise, if it has any uncovered neighbors, cover a random uncovered neighbor. This can be viewed as a structured coupon collector process. We demonstrate that $O(n)$ steps suffice to cover all nodes of the hypercube with high probability. Furthermore, our proof extends to a large family of graphs, including $d$-regular graphs where $d =\Omega( \log n \log \log n)$, and random $d$ regular graphs with $d = \Omega(\log n)$.

Joint work with Eran Halperin, Richard Karp and Vijay Vazirani

Last modified 18 February 2003