A fundamental building block of Peer-to-Peer networks is a distributed hash table (DHT): a hash table where the keys are partitioned across a set of processors. In this talk, we analyze a DHT based on the Content Addressable Network DHT. We demonstrate that in our DHT, the number of queries required to find a key in the table is $\log_2 n+O(1)$, the number of pointers each processor maintains is $\log_2 n+O(1)$, and during a sequence of $n$ processor arrivals, the ratio between the maximum load of a processor and the minimum load of a processor is always $O(1)$ (with high probability). To the best of our knowledge, this is the first analysis of a DHT that demonstrates asymptotically optimal load balance, while still only requiring $O(\log n)$ pointers per processor, and $O(\log n)$ queries per location.
Our analysis reduces to the following stochastic process executed on a hypercube with $n$ vertices. Initially, the nodes of the hypercube are uncovered. In each step, pick a node at random and if it is uncovered, cover it. Otherwise, if it has any uncovered neighbors, cover a random uncovered neighbor. This can be viewed as a structured coupon collector process. We demonstrate that $O(n)$ steps suffice to cover all nodes of the hypercube with high probability. Furthermore, our proof extends to a large family of graphs, including $d$-regular graphs where $d =\Omega( \log n \log \log n)$, and random $d$ regular graphs with $d = \Omega(\log n)$.
Joint work with Eran Halperin, Richard Karp and Vijay Vazirani
Last modified 18 February 2003