Machine Learning and Friends Lunch

Models for Labeling and Segmenting Sequence Data


Charles Sutton
UMass

Abstract


Conditional random fields (CRFs) for sequence modeling have several advantages over joint models such as HMMs, including the ability to relax the strong independence assumptions made in those models and the ability to incorporate arbitrary, overlapping features. Previous work has focused on linear-chain CRFs, which correspond to finite-state machines and admit efficient exact inference algorithms. Often, however, we wish to represent more complicated interactions between labels, for example when performing multiple labeling tasks on the same sequence, or when longer-range dependencies exist between labels. We present dynamic conditional random fields (DCRFs), which are CRFs in which each time slice has a set of state variables and edges (a distributed state representation, as in dynamic Bayesian networks) and parameters are tied across slices. (They could also be called conditionally trained dynamic Markov networks.) Since exact inference can be intractable in these models, we perform approximate inference using the tree-based reparameterization (TRP) framework. We present empirical results on natural-language chunking and information extraction. If there is demand, I can also spend time on a CRF mini-tutorial.
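As background for the abstract (a sketch of the standard textbook forms, not material from the talk itself; the feature functions f_k and weights lambda_k are generic placeholders): a linear-chain CRF places one pairwise factor at each time step, and a DCRF generalizes this to a set C of clique templates repeated at every slice with tied parameters.

    % Linear-chain CRF: one pairwise factor per time step t.
    p(y \mid x) = \frac{1}{Z(x)} \prod_{t=1}^{T}
        \exp\!\Big( \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big)

    % DCRF: a set C of clique templates repeated at each slice,
    % with the parameters \lambda_k tied across slices.
    p(y \mid x) = \frac{1}{Z(x)} \prod_{t=1}^{T} \prod_{c \in C}
        \exp\!\Big( \sum_{k} \lambda_k \, f_k(y_{c,t}, x, t) \Big)

    % Z(x) normalizes over all joint label assignments y.

In the linear-chain case the factor graph is a chain, so exact inference (forward-backward, Viterbi) is efficient; the clique templates of a DCRF can create loops across state variables and slices, which is why the abstract turns to approximate inference such as TRP.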
