MLFL Wiki
GraphLab: A Framework For Asynchronous Parallel Machine Learning

Abstract: High-level data-parallel frameworks like MapReduce (Hadoop) dramatically simplify the design of large-scale data processing systems. Unfortunately, many popular machine learning algorithms, such as belief propagation, Gibbs sampling, CoEM, and the lasso (shooting algorithm), require asynchronous iterative computation and impose sparse parameter dependencies. Recent frameworks like Pregel and Giraph have begun to simplify the design and implementation of large-scale graph-structured learning algorithms by implementing the classic Bulk Synchronous computational model. However, none of these frameworks target the more expressive asynchronous computational model needed for efficient learning algorithms. To fill this critical void, we developed the GraphLab framework, which naturally expresses asynchronous graph computation while ensuring data consistency and achieving a high degree of parallel performance. In this talk I will demonstrate how the MapReduce and Bulk Synchronous models of computation can lead to highly inefficient parallel learning systems by exploring the Bulk Synchronous model in the context of loopy belief propagation. I will then introduce the GraphLab framework and explain how it addresses these critical limitations while retaining the advantages of a high-level abstraction. I will show how the GraphLab abstraction can be used to build efficient, provably correct versions of several popular sequential machine learning algorithms. Finally, I will present results in both the multi-core and cloud settings.

This is joint work with: Yucheng Low, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Guy Blelloch, David O'Hallaron, and Joseph M. Hellerstein.

Bio: Joseph Gonzalez is a PhD student working with Carlos Guestrin in the Machine Learning Department at Carnegie Mellon University. His thesis work is on parallel algorithms and abstractions for graph-structured machine learning on multicore and cluster architectures. Joseph is a recipient of the AT&T Labs Graduate Fellowship and the NSF Graduate Research Fellowship.