I'm a Ph.D. student at UMass Amherst working in the Information Extraction and Synthesis Laboratory with Professor Andrew McCallum. Previously, I earned a B.S. in Computer Science from the University of Maine with a minor in math, where I applied models from mathematical biology to the spread of internet worms with Professor David Hiebeler in his Spatial Population Ecological and Epidemiological Dynamics Lab.
I am interested in developing new machine learning techniques to facilitate fast (and accurate) natural language processing of text.
NLP tasks are commonly modeled as structured prediction problems, in which we learn a mapping from the space of possible inputs to an exponentially large space of labels. Since classifying into the exponentially large space of all possible structures is intractable, we need to decompose the structure in a way that allows us to perform more efficient inference.
One way to do this is by defining the variables that describe the structure and a sparse set of dependencies between them, which results in familiar graphical models such as HMMs or CRFs. An alternative approach is to organize the space of possible classes into a structure that allows for more efficient classification. The resulting classification decisions do not correspond directly to variable boundaries, but to some other latent structure in the space of classes, such as a low-rank neural embedding.
I am interested in the latter approach, which I believe will allow for fast inference in conjunction with the accuracy gains from using large, joint output spaces.
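To make the tractability argument above concrete, here is a minimal sketch of the first approach (a chain-structured model with sparse dependencies), not of the latent-structure approach and not code from any of the listed papers. It compares exhaustive search over all K^T label sequences with Viterbi decoding, which exploits the chain's first-order dependencies to find the same best sequence in O(T K^2) time. The names `score`, `brute_force`, `viterbi`, `emit`, and `trans` are hypothetical and chosen only for illustration.

```python
import itertools


def score(seq, emit, trans):
    """Score a label sequence under a first-order chain model.

    emit[t][k] is the local score of label k at position t;
    trans[j][k] is the score of moving from label j to label k.
    (Both tables are illustrative assumptions, not a real library API.)
    """
    total = emit[0][seq[0]]
    for t in range(1, len(seq)):
        total += trans[seq[t - 1]][seq[t]] + emit[t][seq[t]]
    return total


def brute_force(emit, trans):
    """Enumerate all K**T label sequences: exponential in the length T."""
    T, K = len(emit), len(trans)
    return max(itertools.product(range(K), repeat=T),
               key=lambda seq: score(seq, emit, trans))


def viterbi(emit, trans):
    """Exact argmax in O(T * K**2) using only adjacent-label dependencies."""
    T, K = len(emit), len(trans)
    best = list(emit[0])  # best[k]: best score of any prefix ending in label k
    back = []             # back[t-1][k]: best previous label for label k at t
    for t in range(1, T):
        prev = [max(range(K), key=lambda j: best[j] + trans[j][k])
                for k in range(K)]
        best = [best[prev[k]] + trans[prev[k]][k] + emit[t][k]
                for k in range(K)]
        back.append(prev)
    # Follow back-pointers from the best final label to recover the path.
    path = [max(range(K), key=lambda k: best[k])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return tuple(reversed(path))
```

Calling `viterbi(emit, trans)` and `brute_force(emit, trans)` on the same small score tables should return sequences with the same score, but only the former remains feasible as the sequence length grows.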
- Fast and Accurate Sequence Labeling with Iterated Dilated Convolutions. Emma Strubell, Patrick Verga, David Belanger, and Andrew McCallum. arXiv preprint. 2017.
- An epidemiological model of internet worms with hierarchical dispersal and spatial clustering of hosts. David E. Hiebeler, Andrew Audibert, Emma Strubell and Isaac J. Michaud. Journal of Theoretical Biology. 418: 8-15. 2017. [bibtex]
- Multilingual Relation Extraction using Compositional Universal Schema. Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth and Andrew McCallum. Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT). San Diego, California. June 2016. [bibtex] [code]
- Learning Dynamic Feature Selection for Fast Sequential Prediction. Emma Strubell, Luke Vilnis, Kate Silverstein and Andrew McCallum. Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China. July 2015. Outstanding paper award. [video] [slides] [poster] [bibtex]
- Training for Fast Sequential Prediction Using Dynamic Feature Selection. Emma Strubell, Luke Vilnis, and Andrew McCallum. NIPS Workshop on Modern Machine Learning and NLP (NIPS WS). Montreal, Quebec, Canada. December 2014. [bibtex]
- Minimally Supervised Event Argument Extraction using Universal Schema. Benjamin Roth, Emma Strubell, Katherine Silverstein and Andrew McCallum. 4th Workshop on Automated Knowledge Base Construction (AKBC). At NIPS '14, Montreal, Quebec, Canada. December 2014. [bibtex]
- Universal Schema for Slot-Filling, Cold-Start KBP and Event Argument Extraction: UMassIESL at TAC KBP 2014. Benjamin Roth, Emma Strubell, John Sullivan, Lakshmi Vikraman, Katherine Silverstein, and Andrew McCallum. Text Analysis Conference (Knowledge Base Population Track) '14 Workshop (TAC KBP). Gaithersburg, Maryland, USA. November 2014. [bibtex]
- Modeling the Spread of Biologically-Inspired Internet Worms. Emma Strubell. Undergraduate honors thesis. University of Maine Honors College, Orono, Maine, USA. May 2012. [bibtex]
In my spare time, I enjoy cooking (with a focus on making vegetables delicious), fermenting (kombucha, sauerkraut, kimchi), growing plants (especially succulents), and enjoying the outdoors (hiking and camping).
In search of a fast Scala lexer, I forked JFlex and added the ability to emit Scala code. JFlex-scala and its corresponding Maven and sbt plugins are available on Maven Central. For an example of its use, check out the tokenizer in FACTORIE.
I have been a proud and happy Gentoo Linux user since 2005.
Amherst, Massachusetts, USA
strubell [at] cs [dot] umass [dot] edu