Structure Representations With Feature-rich Compositional Embedding Models
Word embeddings, distributed word representations learned by neural language models, have become a popular way to handle the data sparsity problem in Natural Language Processing (NLP). However, word embeddings alone are not sufficient for capturing structural information in language, which is critical for many NLP tasks. This talk focuses on composing structure representations from word embeddings and a minimal set of non-lexical features. Specifically, we propose the Feature-rich Compositional Embedding Model (FCM), which learns to build a specific transformation for each word according to its contextual information, and uses these transformations to derive structure embeddings from the component word embeddings.
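The per-word composition described above can be illustrated with a minimal NumPy sketch. This is not the talk's actual implementation: the dimensions, feature vectors, and function name are toy assumptions. The idea shown is that each word's binary feature vector selects how its embedding contributes, and the per-word contributions are summed into one structure representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n_feat binary non-lexical features, d-dim word embeddings.
n_feat, d = 5, 4

def fcm_representation(feature_vectors, embeddings):
    """Compose a structure embedding: the outer product of a word's feature
    vector and its embedding acts as a feature-conditioned transformation of
    that word; the per-word matrices are summed over the structure."""
    rep = np.zeros((n_feat, d))
    for f_w, e_w in zip(feature_vectors, embeddings):
        rep += np.outer(f_w, e_w)  # active features gate which rows receive e_w
    return rep

# Toy "structure" of three words with random features and embeddings.
feats = rng.integers(0, 2, size=(3, n_feat)).astype(float)
embs = rng.standard_normal((3, d))
rep = fcm_representation(feats, embs)
print(rep.shape)  # (n_feat, d)
```

A downstream classifier then scores this `(n_feat, d)` representation against per-label parameters, which is where the tensor view below comes in.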
Furthermore, we show that the above model can be viewed as representing features as tensors, and can be further improved with low-rank tensor approximations and with joint training on labeled and unlabeled data. Experiments on (1) relation extraction on ACE2005, (2) PP-attachment on WSJ, and (3) phrase similarity on PPDB show that the proposed methods can outperform both feature-learning models based on deep networks or word embeddings and traditional log-linear models based on feature engineering.
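The low-rank idea can be sketched as follows (again a toy NumPy illustration with assumed sizes, not the talk's code): the full per-label parameter matrix over feature-embedding pairs is replaced by a rank-r factorization, which shrinks the parameter count and lets the score be computed without ever forming the full matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n_feat, d, rank = 5, 4, 2

# A full parameter matrix for one label would have n_feat * d entries;
# a rank-`rank` approximation T ≈ U @ V.T stores only rank * (n_feat + d).
U = rng.standard_normal((n_feat, rank))
V = rng.standard_normal((d, rank))

def low_rank_score(rep):
    # Inner product <U V^T, rep>, computed without materializing U @ V.T:
    # sum over rank of (U^T rep) * V^T, elementwise.
    return np.sum((U.T @ rep) * V.T)

rep = rng.standard_normal((n_feat, d))
full_score = np.sum((U @ V.T) * rep)  # reference: explicit full matrix
print(np.isclose(low_rank_score(rep), full_score))  # True
```

The factored form is what makes the tensor view practical: parameters grow linearly rather than multiplicatively in the feature and embedding dimensions.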
Mo Yu is a visiting student at Johns Hopkins University and a fourth-year Ph.D. student in the Department of Computer Science at Harbin Institute of Technology. He is currently working with Professor Mark Dredze and Professor Raman Arora on representation learning methods for NLP tasks such as relation extraction, syntactic parsing, and semantic composition.
Prior to JHU, he visited Microsoft Research Asia in 2008 and 2010, working on Content-based Image Retrieval (CBIR) and target-dependent sentiment analysis, respectively, and visited the Baidu NLP group from 2012 to 2013, working on deep learning for natural language processing and dependency parsing.