Spectral Learning of Lexical Representations in Natural Language Processing
There has recently been much success in deriving rich, distributional representations of words from large quantities of unlabeled text. These include discrete representations such as agglomerative clusters (e.g., Brown clusters) and real-valued vectors such as word embeddings (e.g., word2vec). Such lexical representations can be deployed off-the-shelf in a wide range of language processing tasks to help models generalize at the word level.
In this talk, I will present a simple and efficient algorithm for learning such representations. The algorithm is spectral: it is built on singular value decomposition (SVD), and it comes with a theoretical guarantee of recovering the underlying model given enough data generated by that model. In addition, we find that our algorithm can be much more scalable than other methods in practice. For example, it can be up to 10x faster than the Brown clustering algorithm in wall-clock time while delivering competitive lexical representations.
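To give a rough sense of what "spectral" means here, the following is a minimal, generic sketch of deriving word vectors from an SVD of a co-occurrence matrix. It is not the algorithm presented in the talk: the function name `spectral_embeddings`, the toy corpus, the one-word context window, and the square-root scaling of counts are all assumptions made for illustration.

```python
import numpy as np

def spectral_embeddings(corpus, dim=2, window=1):
    """Toy word vectors from an SVD of a scaled word-context count matrix.

    Illustrative only: the windowing and scaling choices are assumptions,
    not the method described in the talk.
    """
    vocab = sorted({w for sent in corpus for w in sent})
    index = {w: i for i, w in enumerate(vocab)}

    # Count how often each word appears within `window` positions of another.
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[sent[j]]] += 1.0

    # Dampen large counts (an assumed scaling), then factorize.
    scaled = np.sqrt(counts)
    U, S, _ = np.linalg.svd(scaled, full_matrices=False)

    # Keep the top `dim` singular directions as the word vectors.
    k = min(dim, len(vocab))
    return vocab, U[:, :k] * S[:k]

# Hypothetical toy corpus of tokenized sentences.
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "meows"],
    ["a", "dog", "runs"],
    ["a", "cat", "sleeps"],
]
vocab, emb = spectral_embeddings(corpus, dim=2)
```

Each row of `emb` is the vector for the word at the same position in `vocab`. On a real corpus the count matrix would be sparse and far larger, so a truncated or randomized SVD would replace the dense call used here.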
Karl Stratos is a PhD student at Columbia University, advised by Michael Collins. He is broadly interested in machine learning techniques and their applications in natural language processing. His recent focus has been on spectral algorithms, representation learning, and structured prediction. One of his research aims is to develop practical approaches to leveraging enormous amounts of unlabeled data.