University of Massachusetts


Latent Variable Models Of Distributional Lexical Semantics

Abstract: Responding to the increasing demand for natural language interfaces, and providing meaningful insight into user query intent, requires fast, scalable lexical semantic models with flexible representations. Human concept organization is a rich phenomenon that has yet to be accounted for by a single coherent psychological framework: concept generalization is captured by a mixture of prototype and exemplar models, and local taxonomic information is available through multiple overlapping organizational systems. Previous work in computational linguistics on extracting lexical semantic information from the Web does not provide adequate representational flexibility and hence fails to capture the full extent of human conceptual knowledge. In this talk I will outline two probabilistic models that can account for some of the rich organizational structure found in human language: (1) a background clustering model of polysemy and (2) a hierarchical LDA-based approach to modeling concept organization. These models can be used to predict contextual variation, selectional preference, and feature-saliency norms to a much higher degree of accuracy than previous approaches, and have the potential to improve question answering, text classification, machine translation, and information retrieval.

Bio: Joe Reisinger is a PhD candidate in Computer Science at the University of Texas at Austin. His research interests include large-scale latent variable modeling, structured information extraction, lexical semantics, and econometric modeling. Joe received the 2010 Google Research Fellowship in NLP and previously held an NSF Graduate Research Fellowship. Prior to joining UT, he worked at IBM T.J. Watson Research Center and IBM Yamato, and more recently he has completed several internships at Google Research in Mountain View.
