Scalable Structured Learning With Inexact Inference And Parallelization
Abstract: Most existing structured learning algorithms require exact inference on each training example, which is rarely tractable in NLP tasks such as parsing and machine translation. In practice, we often resort to approximate search, but how would this approximation affect learning? And how should we modify the learning algorithm in order to accommodate (often severe) search errors?
This work develops a general theory of structured perceptron learning under inexact inference. We aim to train a search-specific, search-error-robust model that can "live with" search errors and reach the true output regardless of how inaccurate the search is. Specifically, we propose variants of the structured perceptron algorithm under a general "violation-fixing" framework that guarantees convergence (under new definitions of separability). This framework subsumes previous remedies, including the "early update" of Collins and Roark (2004), as special cases (thus providing the first theoretical justification for early update), and also explains why the standard perceptron often fails with inexact search. We further propose new update methods within this framework that learn better models with dramatically reduced training times on state-of-the-art part-of-speech tagging and incremental parsing systems.
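To make the "violation-fixing" idea concrete, here is a minimal sketch of a structured perceptron with beam search and early update on a toy tagging task. All names, the feature set, and the toy data are illustrative assumptions, not the speaker's actual system: the key point is that when the gold prefix falls off the beam, we update immediately on that prefix (a guaranteed violation) instead of decoding to the end.

```python
from collections import defaultdict

def features(words, tags):
    """Feature counts of a (partial) tag sequence: emissions + transitions.
    (A deliberately tiny feature set, for illustration only.)"""
    f = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tags):
        f[("emit", word, tag)] += 1
        f[("trans", prev, tag)] += 1
        prev = tag
    return f

def score(w, f):
    return sum(w[k] * v for k, v in f.items())

def perceptron_early_update(data, tagset, beam_size=2, epochs=5):
    """Structured perceptron with beam search and early update:
    stop decoding and update as soon as the gold prefix falls off the beam."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            beam = [()]  # partial tag sequences, best-first
            for i in range(len(words)):
                cands = [p + (t,) for p in beam for t in tagset]
                cands.sort(key=lambda p: score(w, features(words[:i+1], p)),
                           reverse=True)
                beam = cands[:beam_size]
                gold_prefix = tuple(gold[:i+1])
                if gold_prefix not in beam:        # search error: early update
                    for k, v in features(words[:i+1], gold_prefix).items():
                        w[k] += v
                    for k, v in features(words[:i+1], beam[0]).items():
                        w[k] -= v
                    break
            else:                                  # gold survived the whole way
                if beam[0] != tuple(gold):         # standard (full) update
                    for k, v in features(words, gold).items():
                        w[k] += v
                    for k, v in features(words, beam[0]).items():
                        w[k] -= v
    return w

# toy usage (hypothetical data): two short "sentences" over tags X and Y
data = [(("a", "b", "a"), ("X", "Y", "X")), (("b", "a"), ("Y", "X"))]
w = perceptron_early_update(data, tagset=("X", "Y"))
```

The early-update branch is what makes the update a valid "violation": since the gold prefix scores below everything kept in the beam, the perceptron mistake bound still applies even though the search is inexact.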
While the above work allows the use of very fast approximate search, the sheer amount of training data still makes online learning too slow in practice. I will present a new parallelization algorithm for online learning (perceptron and MIRA) based on a very simple minibatch idea, which leads to much higher speedups than previous parallelization techniques such as that of McDonald et al. (2010).
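The minibatch idea can be sketched as follows for a plain multiclass perceptron (a stand-in for the structured case, where "decoding" is the expensive step). This is my own illustrative sketch, not the speaker's implementation: within a batch, every example is decoded against the same frozen weights, so the decodes can run on parallel workers, and the summed update is applied once per batch. `ThreadPoolExecutor` here merely stands in for real parallel decoding; in practice one would use multiprocessing or distributed workers, since CPython threads do not speed up CPU-bound work.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor  # stand-in for parallel decoding

def minibatch_perceptron(X, y, n_classes, batch_size=4, epochs=10, workers=2):
    """Minibatch-parallel perceptron: all examples in a batch are decoded
    against the same frozen weights; one synchronized update per batch."""
    w = np.zeros((n_classes, X.shape[1]))

    def delta(i):
        # Update contribution of example i under the current (frozen) weights.
        pred = int(np.argmax(w @ X[i]))
        d = np.zeros_like(w)
        if pred != y[i]:
            d[y[i]] += X[i]
            d[pred] -= X[i]
        return d

    for _ in range(epochs):
        for start in range(0, len(X), batch_size):
            idx = range(start, min(start + batch_size, len(X)))
            with ThreadPoolExecutor(max_workers=workers) as pool:
                updates = list(pool.map(delta, idx))  # parallel "decoding"
            w += sum(updates)  # single synchronized update per minibatch
    return w

# toy usage (hypothetical data): two linearly separable classes
X = np.array([[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]])
y = np.array([0, 1, 0, 1])
w = minibatch_perceptron(X, y, n_classes=2)
```

The design point is that synchronization happens only once per minibatch rather than once per example, which is what makes the approach scale better than iterative parameter mixing.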
Bio: Liang Huang is an Assistant Professor at The City University of New York (CUNY). Previously he was a Research Assistant Professor at the University of Southern California (USC) and a Research Scientist at Google. He received his PhD from the University of Pennsylvania in 2008. His research focuses on efficient search algorithms for natural language processing, especially in parsing and machine translation, as well as related structured learning problems and their applications to non-language domains such as biological sequences. His work received a Best Paper Award at ACL 2008 and three Best Paper Nominations, at ACL 2007, EMNLP 2008, and ACL 2010.