Selected Publications


Bi-directional Entity to Text Attention for Knowledge Informed Text Representations

Trapit Bansal, Dung Thai, Raghuveer Thirukovalluru, Andrew McCallum

In submission to EACL [Code][PDF]


Using BibTeX to Automatically Generate Labeled Data for Citation Field Extraction

Dung Thai, Zhiyang Xu, Nicholas Monath, Boris Veytsman, Andrew McCallum

AKBC 2020 [Code][PDF]

We propose a data generation framework that automatically renders BibTeX records into labeled reference strings, and show that training on this data improves performance on citation field extraction (CFE) benchmarks.
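
As a rough illustration of the idea (not the released pipeline linked under [Code]), the sketch below renders a BibTeX-like record into a reference string with token-level field labels; the field set, template order, and punctuation are illustrative assumptions.

```python
# A minimal sketch, assuming a simple dict-based record and a fixed field
# template; the paper's actual generation pipeline may differ.

def render_labeled_reference(entry, template=("author", "title", "venue", "year")):
    """Return (tokens, labels) for one synthetic labeled reference string."""
    tokens, labels = [], []
    for field in template:
        value = entry.get(field)
        if not value:
            continue
        for token in str(value).split():
            tokens.append(token)
            labels.append(field)
        tokens.append(".")      # field-separating punctuation (illustrative)
        labels.append("other")
    return tokens, labels


if __name__ == "__main__":
    entry = {
        "author": "D. Thai and Z. Xu and N. Monath",
        "title": "Using BibTeX to Automatically Generate Labeled Data",
        "venue": "AKBC",
        "year": 2020,
    }
    for tok, lab in zip(*render_labeled_reference(entry)):
        print(f"{tok}\t{lab}")
```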


Embedded-State Latent Conditional Random Fields for Sequence Labeling

Dung Thai, Sree Harsha Ramesh, Shikhar Murty, Luke Vilnis, Andrew McCallum

CoNLL 2018 [Code][PDF]

Our model goes beyond the linear chain CRF to incorporate multiple hidden states per output label, but parametrizes their transitions parsimoniously with low-rank log-potential scoring matrices, effectively learning an embedding space for hidden states. This augmented latent space of inference variables complements the rich feature representation of the RNN, and allows exact global inference obeying complex, learned non-local output constraints.
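
A minimal numerical sketch of the low-rank transition parametrization described above, assuming hypothetical sizes (L labels, K latent states per label, embedding rank r); it is not the released implementation linked under [Code].

```python
import numpy as np

# Hypothetical sizes (not taken from the paper): L output labels,
# K latent states per label, embedding rank r.
L, K, r = 5, 8, 3
S = L * K                      # total number of latent states

rng = np.random.default_rng(0)

# Instead of a dense S x S transition table (S**2 parameters), each latent
# state gets a low-dimensional embedding, and the transition log-potential
# for (i -> j) is the bilinear score of the two embeddings.
U = rng.normal(size=(S, r))    # embeddings for the outgoing state
V = rng.normal(size=(S, r))    # embeddings for the incoming state

transition_log_potentials = U @ V.T    # S x S scores from only 2 * S * r parameters

# Every output label owns a contiguous block of K latent states.
def label_of(state):
    return state // K

print(transition_log_potentials.shape, "built from", 2 * S * r, "parameters")
```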


Low-rank hidden state embeddings for Viterbi sequence labeling

Dung Thai, Shikhar Murty, Trapit Bansal, Luke Vilnis, David Belanger, Andrew McCallum

DeepStruct Workshop, ICML 2017 [PDF]

This paper presents a method that learns embedded representations of latent output structure in sequence data. Our model takes the form of a finite-state machine with a large number of latent states per label (a latent-variable CRF), where the state-transition matrix is factorized, effectively forming an embedded representation of state transitions that can enforce long-term label dependencies while supporting exact Viterbi inference over output labels.
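
The sketch below illustrates exact Viterbi decoding over the expanded latent state space, then collapses each decoded latent state back to its output label; the scores, sizes, and block state-to-label mapping are illustrative assumptions rather than the paper's code.

```python
import numpy as np

# Illustrative sizes: L labels, K latent states per label, sequence length T.
L, K = 4, 6
S = L * K
T = 10

rng = np.random.default_rng(1)
emissions = rng.normal(size=(T, S))      # per-position latent-state scores (assumed)
transitions = rng.normal(size=(S, S))    # e.g. a low-rank U @ V.T as sketched above

def viterbi(emissions, transitions):
    """Exact max-scoring latent-state path under first-order transitions."""
    T, S = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions      # cand[i, j]: prev state i -> state j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emissions[t]
    states = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        states.append(int(back[t][states[-1]]))
    return states[::-1]

latent_path = viterbi(emissions, transitions)
labels = [s // K for s in latent_path]           # collapse latent states to labels
print(labels)
```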


Projects

Factorized Latent Conditional Random Fields

In this work we present a method for sequence labeling in which representation learning is applied not only to the inputs but also to the output space, in the form of a lightly parameterized transition function over a large number of latent states. We introduce a hidden state variable and learn the model dynamics in the hidden state space rather than the label space. This relaxes the Markov assumption between output labels and allows the model to learn global output constraints.
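
One common way to train such a latent-variable CRF is to maximize the likelihood of the gold label sequence while marginalizing over latent paths; the sketch below computes that quantity with a masked forward pass, under illustrative sizes and random scores (an assumption for exposition, not necessarily the exact objective used in this project).

```python
import numpy as np

# Illustrative sizes: L labels, K latent states per label, sequence length T.
L, K = 3, 4
S = L * K
T = 6

rng = np.random.default_rng(2)
emissions = rng.normal(size=(T, S))      # per-position latent-state scores (assumed)
transitions = rng.normal(size=(S, S))    # latent-state transition log-potentials
gold = rng.integers(L, size=T)           # a label sequence to score

def logsumexp(a, axis):
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def forward_logZ(emissions, transitions, mask=None):
    """Log partition over latent paths; mask[t] restricts the allowed states."""
    T, S = emissions.shape
    neg_inf = -1e30
    alpha = emissions[0] + (0.0 if mask is None else np.where(mask[0], 0.0, neg_inf))
    for t in range(1, T):
        alpha = logsumexp(alpha[:, None] + transitions, axis=0) + emissions[t]
        if mask is not None:
            alpha = np.where(mask[t], alpha, neg_inf)
    return logsumexp(alpha, axis=0)

# Latent states allowed at step t: only the block owned by the gold label.
gold_mask = np.zeros((T, S), dtype=bool)
for t, y in enumerate(gold):
    gold_mask[t, y * K:(y + 1) * K] = True

log_likelihood = (forward_logZ(emissions, transitions, gold_mask)
                  - forward_logZ(emissions, transitions))
print(log_likelihood)    # <= 0 by construction
```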

Curriculum Vitae