UMass Machine Learning and Friends Lunch: A Latent Gaussian Model For Text

Abstract

NLP researchers often struggle with the limited amount of annotated data available for training their models. On the other hand, massive quantities of raw text are available from archives of news outlets, digitized books, the internet, etc. Consequently, a trend in NLP is to perform the following three-step semi-supervised learning procedure: (1) learn an unsupervised model on a very large corpus that passes inputs through a low-dimensional bottleneck, (2) map the annotated training examples to a low-dimensional representation using the first model, (3) train a supervised predictive model in the reduced-dimensional space. Models trained in this way perform well because, in general, it is easier to train high-quality models on lower-dimensional data.
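The three steps above can be sketched on synthetic data. This is only an illustration under loud assumptions: PCA (via SVD) stands in for the unsupervised bottleneck model and a least-squares linear classifier stands in for the supervised model; neither is the model discussed in this talk, and the data generator, dimensions, and function names are all hypothetical.

```python
import numpy as np

def fit_bottleneck(X_unlabeled, k):
    """Step 1: learn a k-dimensional linear bottleneck (PCA via SVD) from unlabeled data."""
    mean = X_unlabeled.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_unlabeled - mean, full_matrices=False)
    return mean, Vt[:k].T  # projection matrix of shape (d, k)

def embed(X, mean, W):
    """Step 2: map examples into the low-dimensional space."""
    return (X - mean) @ W

def fit_classifier(Z, y):
    """Step 3: fit a least-squares linear classifier on the embeddings (labels in {0, 1})."""
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(Zb, 2.0 * y - 1.0, rcond=None)
    return w

def predict(Z, w):
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return (Zb @ w > 0).astype(int)

# Hypothetical synthetic data: 50-dimensional observations driven by 2 latent factors.
rng = np.random.default_rng(0)
d, k = 50, 2
B = rng.normal(size=(k, d))

def sample(n):
    z = rng.normal(size=(n, k))
    X = z @ B + 0.1 * rng.normal(size=(n, d))
    y = (z[:, 0] > 0).astype(int)  # label depends only on the first latent factor
    return X, y

X_unlabeled, _ = sample(2000)    # large "raw" corpus, labels ignored
X_train, y_train = sample(40)    # small annotated set (fewer examples than dimensions)
X_test, y_test = sample(500)

mean, W = fit_bottleneck(X_unlabeled, k)   # step (1)
Z_train = embed(X_train, mean, W)          # step (2)
w = fit_classifier(Z_train, y_train)       # step (3)

accuracy = float((predict(embed(X_test, mean, W), w) == y_test).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

The annotated set here has fewer examples than input dimensions, so the supervised model is fit on 2 features plus a bias term rather than on 50.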

I will discuss my current work on a version of step (1) above, based on linear dynamical systems. First, I will motivate such a latent Gaussian model for text and explain how to apply it efficiently in step (2). The learning algorithm I will present, a two-stage estimation procedure in which the method of moments is followed by EM, is extremely scalable. Namely, after collecting some simple co-occurrence statistics from a large corpus, a step that can be parallelized, the algorithm's cost is independent of the amount of training data. The talk will conclude with a discussion of the benefits of extending the model using a Gaussian copula process rather than a linear Gaussian process.
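To make the scalability claim concrete, here is a minimal sketch of collecting co-occurrence statistics shard by shard and merging them by summation. The abstract does not say which statistics are used; counting pairs of adjacent words (lag 1) is an assumption made for illustration, and the shards and function name are hypothetical.

```python
from collections import Counter

def cooccurrence_counts(tokens, lag=1):
    """Count (word, word `lag` positions later) pairs in one pass over a token list."""
    return Counter(zip(tokens, tokens[lag:]))

# Hypothetical toy shards; in practice each shard could be counted on a separate worker.
shards = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]

# Merging is plain addition of counts, so the order in which shards arrive does not matter.
total = Counter()
for shard in shards:
    total.update(cooccurrence_counts(shard))

print(total[("sat", "on")])  # pairs that appear in both shards are summed
```

The merged table has at most one entry per pair of vocabulary words, so its size is bounded by the vocabulary rather than by the corpus length. Any estimation that reads only this table therefore has a cost that does not grow with the amount of raw text. Pairs that straddle a shard boundary are dropped in this sketch.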

Bio

I am a fourth-year graduate student advised by Professor Andrew McCallum. Before that, I was an Associate Scientist in the Speech, Language, and Multimedia Department at Raytheon BBN Technologies, where I worked on multilingual optical handwriting recognition. I received a B.A. in mathematics from Harvard University, where I worked with Eric Dunham and Jim Rice on methods for numerically simulating earthquake ruptures along rough fault surfaces. Currently, my research focus is on machine learning and natural language processing. This summer, I am interning with Sham Kakade at Microsoft Research New England.