GraphLab: A Framework for Asynchronous Parallel Machine Learning

Abstract: While high-level data-parallel frameworks like MapReduce (Hadoop) dramatically simplify the design of large-scale data processing systems, they are not well suited to graph-structured problems. Unfortunately, many popular machine learning algorithms, including belief propagation, Gibbs sampling, CoEM, and the lasso (shooting algorithm), iteratively transform graph-structured data and model parameters. Recent frameworks like Pregel and Giraph have begun to simplify the design and implementation of machine learning algorithms for large-scale graph-structured problems by implementing the classic Bulk Synchronous Parallel computational model. However, none of these frameworks target the more expressive asynchronous computational model needed for efficient algorithms. To fill this critical void, we developed the GraphLab framework, which naturally expresses asynchronous graph computation while ensuring data consistency and achieving a high degree of parallel performance for machine learning algorithms.

In this talk I will demonstrate how the MapReduce and Bulk Synchronous models of computation can lead to highly inefficient parallel learning systems by exploring these models in the context of loopy belief propagation. I will then introduce the GraphLab framework and explain how it addresses these critical limitations while retaining the advantages of a high-level abstraction. I will show how the GraphLab abstraction can be used to build efficient and provably correct versions of several popular sequential machine learning algorithms. Finally, I will present results in both the multi-core and cloud settings.

This is joint work with Yucheng Low, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Guy Blelloch, David O'Hallaron, and Joseph M. Hellerstein.

Bio: Joseph Gonzalez is a PhD student working with Carlos Guestrin in the Machine Learning Department at Carnegie Mellon University. His thesis work is on parallel algorithms and abstractions for graph-structured machine learning on multicore and cluster architectures. Joseph is a recipient of the AT&T Labs Graduate Fellowship and the NSF Graduate Research Fellowship.
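
To make the synchronous-versus-asynchronous contrast from the abstract concrete, below is a minimal, self-contained C++ sketch. It is not the GraphLab API; all names are invented for illustration. It shows why Bulk Synchronous iteration can be wasteful for propagation-style algorithms like loopy belief propagation: under a BSP barrier every vertex reads values from the previous round, so information crosses only one edge per superstep, while an asynchronous schedule that updates values in place can carry information across an entire chain in a single sweep.

// Illustrative only: contrasts Bulk Synchronous (Jacobi-style) iteration
// with asynchronous in-place (Gauss-Seidel-style) iteration on a chain
// graph. This is NOT the GraphLab API; names are made up for the demo.
#include <cstdio>
#include <vector>
#include <cmath>

int main() {
    const int N = 1000;      // chain of N vertices
    const double tol = 1e-9;

    // Each vertex i (i > 0) repeatedly sets x[i] = x[i-1] + 1, i.e.
    // propagates a "distance" from vertex 0 down the chain.

    // Synchronous (BSP): every vertex reads the previous round's values,
    // and new values are published only at the round barrier.
    std::vector<double> cur(N, 0.0), nxt(N, 0.0);
    long sync_updates = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 1; i < N; ++i) {
            nxt[i] = cur[i - 1] + 1.0;  // reads the stale round-t value
            if (std::fabs(nxt[i] - cur[i]) > tol) changed = true;
            ++sync_updates;
        }
        cur = nxt;                      // barrier: publish round t+1
    }

    // Asynchronous: updates are applied in place, so each vertex sees the
    // freshest neighbor value and one sweep traverses the whole chain.
    std::vector<double> x(N, 0.0);
    long async_updates = 0;
    changed = true;
    while (changed) {
        changed = false;
        for (int i = 1; i < N; ++i) {
            double v = x[i - 1] + 1.0;  // reads the freshest value
            if (std::fabs(v - x[i]) > tol) { x[i] = v; changed = true; }
            ++async_updates;
        }
    }

    // Synchronous needs about N rounds (O(N^2) vertex updates);
    // asynchronous converges in one sweep plus a verification pass (O(N)).
    std::printf("synchronous updates:  %ld\n", sync_updates);
    std::printf("asynchronous updates: %ld\n", async_updates);
    return 0;
}

On a chain of 1000 vertices the synchronous schedule performs roughly a million vertex updates while the asynchronous sweep needs about two thousand. This toy gap is the kind of inefficiency the talk examines with loopy belief propagation; GraphLab's contribution, per the abstract, is expressing such asynchronous schedules on arbitrary graphs while still guaranteeing data consistency under parallel execution.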