Machine Learning and Friends Lunch

Conditional models for natural language dependency parsing


Ryan McDonald
UPenn

Abstract


In the natural language processing community, almost all state-of-the-art parsing models are generative. The primary reason for this is most likely the O(n^3) parsing complexity required to run the Viterbi or forward-backward algorithm on training instances. The problem is exacerbated for lexicalized grammars, such as lexicalized CFGs and dependency grammars, which have a worst-case runtime of O(n^5). Since all current state-of-the-art parsing models are lexicalized, and since global conditional models like conditional random fields and the voted perceptron require either Viterbi or forward-backward, this poses a substantial problem.
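
To make the complexity claim concrete, below is a minimal sketch of Viterbi (CKY) chart parsing for a binarized, unlexicalized grammar: the nested loops over span width, start position, and split point are the source of the O(n^3) term, and lexicalizing the grammar adds head-word indices that push the worst case to O(n^5). The grammar encoding, toy lexicon, and rule scores here are illustrative assumptions, not anything from the talk.

    from collections import defaultdict

    def viterbi_cky(words, lexicon, rules):
        # chart[i][j] maps a nonterminal label to the best log-probability of
        # any parse of words[i:j] rooted at that label.
        n = len(words)
        chart = [[defaultdict(lambda: float("-inf")) for _ in range(n + 1)]
                 for _ in range(n + 1)]
        # Width-1 spans: score each word's candidate tags.
        for i, w in enumerate(words):
            for tag, logp in lexicon.get(w, []):
                chart[i][i + 1][tag] = max(chart[i][i + 1][tag], logp)
        # The three nested loops (width, start, split) give the O(n^3) cost.
        for width in range(2, n + 1):
            for i in range(n - width + 1):
                j = i + width
                for k in range(i + 1, j):
                    for b, sb in chart[i][k].items():
                        for c, sc in chart[k][j].items():
                            for a, logp in rules.get((b, c), []):
                                score = sb + sc + logp
                                if score > chart[i][j][a]:
                                    chart[i][j][a] = score
        return chart

    # Toy usage (hypothetical grammar): one rule S -> N V over two words.
    lexicon = {"dogs": [("N", -0.1)], "bark": [("V", -0.2)]}
    rules = {("N", "V"): [("S", -0.3)]}
    chart = viterbi_cky(["dogs", "bark"], lexicon, rules)
    print(chart[0][2]["S"])  # best log-probability of an S spanning the sentence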

Some work has been done on conditional parsing, such as the maximum entropy parsing model of Ratnaparkhi and the SVM dependency parser of Yamada and Matsumoto. However, both of these models are locally trained and may suffer from the label bias problem. More recently, Taskar et al. presented a kernel-based, globally trained phrase-structure parsing model; however, that model is currently improperly lexicalized, can only be trained on sentences of fewer than 15 words, and is unable to significantly outperform the Collins parsing model.
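
For contrast with those locally trained models, the sketch below shows the kind of global, whole-structure update used by perceptron-style training: each update requires decoding the full sentence, which is exactly where the expensive Viterbi-style search enters. The decode and features arguments are hypothetical placeholders for a real parser and feature map, not the talk's actual model.

    def perceptron_update(weights, sentence, gold_tree, decode, features):
        # weights: dict mapping feature name -> float.
        # decode(sentence, weights): placeholder for a search returning the
        #     highest-scoring tree under the current weights (Viterbi-style).
        # features(sentence, tree): placeholder returning feature counts as a
        #     dict mapping feature name -> count.
        predicted = decode(sentence, weights)
        if predicted != gold_tree:
            for f, v in features(sentence, gold_tree).items():
                weights[f] = weights.get(f, 0.0) + v  # reward gold features
            for f, v in features(sentence, predicted).items():
                weights[f] = weights.get(f, 0.0) - v  # penalize predicted features
        return weights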

In this talk I will present some preliminary investigations into solving these problems for parsing dependency structures. I will also present promising results for both English and Czech that are competitive with current state-of-the-art generative models. Finally, I will wrap up by discussing the benefits of improved dependency parsing for information extraction and machine translation as directions for future work.

References:

Yusuke Miyao and Jun'ichi Tsujii. 2002. Maximum Entropy Estimation for Feature Forests. In Proceedings of the Human Language Technology Conference (HLT 2002).
Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. PhD Dissertation, University of Pennsylvania.
Michael Collins, Jan Hajic, Lance Ramshaw, and Christoph Tillmann. 1999. A Statistical Parser for Czech. In Proceedings of ACL 1999.
Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical Dependency Analysis with Support Vector Machines. In Proceedings of IWPT 2003.
Adwait Ratnaparkhi. 1999. Learning to Parse Natural Language with Maximum Entropy Models. Machine Learning, 34:151-175.
Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Christopher Manning. 2004. Max-Margin Parsing. In Proceedings of EMNLP 2004.
Aron Culotta and Jeffrey Sorensen. 2004. Dependency Tree Kernels for Relation Extraction. In Proceedings of ACL 2004, Barcelona, Spain.
