Our model goes beyond the linear-chain CRF by incorporating multiple hidden states per output label, but parameterizes their transitions parsimoniously with low-rank log-potential scoring matrices, effectively learning an embedding space for hidden states. This augmented latent space of inference variables complements the rich feature representation of the RNN and allows exact global inference that obeys complex, learned non-local output constraints.
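As a rough illustration of the low-rank parameterization described above, the transition log-potentials can be written as a product of two thin matrices. The symbols $n$ (number of hidden states), $k$ (embedding dimension), $U$, $V$, $A$, $\psi$, and the state-to-label map $y(\cdot)$ are notation assumed here for the sketch, not taken from the text.

```latex
% Assumed notation: n hidden states, embedding dimension k << n.
\[
  A = U V^{\top}, \qquad U, V \in \mathbb{R}^{n \times k}, \qquad
  A_{ij} = \mathbf{u}_i^{\top} \mathbf{v}_j ,
\]
% Score of a hidden-state path z_1, ..., z_T for input x, where y(z_t) is the
% label that hidden state z_t belongs to and \psi is the local (e.g. RNN) score.
\[
  s(z, x) = \sum_{t=1}^{T} \psi\bigl(x, t, y(z_t)\bigr)
          + \sum_{t=2}^{T} A_{z_{t-1} z_t} .
\]
```

Under these assumptions the transition function has $2nk$ free parameters instead of $n^2$, which is what keeps a large hidden-state space affordable.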
This paper presents a method that learns embedded representations of latent output structure in sequence data. Our model takes the form of a finite-state machine with a large number of latent states per label (a latent-variable CRF), in which the state-transition matrix is factorized, effectively forming an embedded representation of state transitions capable of enforcing long-term label dependencies while supporting exact Viterbi inference over output labels.
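The combination of a factorized transition matrix and exact Viterbi decoding can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the even split of hidden states across labels (`states_per_label`), and the choice to let every hidden state of a label share that label's local score are all hypothetical. The decoder returns the highest-scoring hidden-state path and reads the labels off it.

```python
import numpy as np


def low_rank_transitions(U, V):
    """Transition log-potentials A[i, j] = U[i] . V[j] between hidden states."""
    return U @ V.T


def viterbi_labels(local_scores, U, V, states_per_label):
    """Exact Viterbi over hidden states, then collapse states to labels.

    local_scores: (T, L) per-token label scores (e.g. produced by an RNN).
    U, V: (S, k) hidden-state embeddings, with S = L * states_per_label.
    Assumption: hidden state s belongs to label s // states_per_label and
    shares that label's local score.
    """
    T, L = local_scores.shape
    S = L * states_per_label
    assert U.shape == V.shape and U.shape[0] == S

    A = low_rank_transitions(U, V)                  # (S, S), rank <= k
    state_to_label = np.arange(S) // states_per_label
    emit = local_scores[:, state_to_label]          # (T, S)

    delta = emit[0].copy()                          # best score ending in each state
    back = np.zeros((T, S), dtype=int)              # backpointers
    for t in range(1, T):
        cand = delta[:, None] + A                   # cand[i, j]: arrive in j from i
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + emit[t]

    # Follow backpointers from the best final state.
    states = np.empty(T, dtype=int)
    states[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):
        states[t - 1] = back[t, states[t]]

    return state_to_label[states], states, float(delta.max())
```

Decoding costs O(T S^2) time as in a standard chain model over S states; only the number of transition parameters is reduced by the factorization, from S^2 to 2Sk.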
In this work we present a method for sequence labeling in which representation learning is applied not only to the inputs but also to the output space, in the form of a lightly parameterized transition function between a large number of latent states. We introduce a hidden state variable and learn the model dynamics in the hidden-state space rather than the label space. This relaxes the Markov assumption between output labels and allows the model to learn global constraints.