Publications
2020
2019
- Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks. Pedram Rooshenas, Dongxu Zhang, Gopal Sharma, Andrew McCallum. Proceedings of Neural Information Processing Systems (NeurIPS) 2019.
- Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering. Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum. EMNLP-IJCNLP Workshop on Machine Reading for Question Answering (MRQA, Best Paper Award), 2019.
- Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering. Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum. ArXiv preprint arXiv:1909.07598, 2019.
- Scalable Hierarchical Clustering with Tree Grafting. Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael R Glass, Andrew McCallum. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019.
- Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, Daniel Silva, Andrew McCallum, Amr Ahmed. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019.
- Paper Matching with Local Fairness Constraints. Ari Kobren, Barna Saha, Andrew McCallum. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2019.
- Smoothing the Geometry of Probabilistic Box Embeddings. Xiang Li*, Luke Vilnis*, Dongxu Zhang, Michael Boratko and Andrew McCallum. International Conference on Learning Representations (ICLR) Oral presentation, 2019.
- Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering. Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum. International Conference on Learning Representations (ICLR) 2019.
- Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension. Rajarshi Das, Tsendsuren Munkhdalai, Eric Xingdi Yuan, Adam Trischler, Andrew McCallum. International Conference on Learning Representations (ICLR) 2019.
- Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders. Andrew Drozdov*, Patrick Verga*, Mohit Yadav*, Mohit Iyyer, and Andrew McCallum. Association for Computational Linguistics (ACL), 2019.
- Optimal Transport-based Alignment of Learned Character Representations for String Similarity. Derek Tam, Nicholas Monath, Ari Kobren, Aaron Traylor, Rajarshi Das, Andrew McCallum. Association for Computational Linguistics (ACL), 2019.
- A2N: Attending to Neighbors for Knowledge Graph Inference. Trapit Bansal, Da-Cheng Juan, Sujith Ravi, Andrew McCallum. Association for Computational Linguistics (ACL), 2019.
- Energy and Policy Considerations for Deep Learning in NLP. Emma Strubell, Ananya Ganesh and Andrew McCallum. Association for Computational Linguistics (ACL), 2019.
- Supervised Hierarchical Clustering with Exponential Linkage. Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew McCallum. International Conference on Machine Learning (ICML), 2019.
- Integrating User Feedback under Identity Uncertainty in Knowledge Base Construction. Ari Kobren, Nicholas Monath, Andrew McCallum. Automated Knowledge Base Construction (AKBC), 2019.
- The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures. Sheshera Mysore, Zach Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, Elsa Olivetti. LAW XIII 2019: The 13th Linguistic Annotation Workshop (ACL WS), 2019.
- Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks. Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, and Elsa Olivetti. arXiv pre-print 1901.00032, in submission, 2019.
2018
- Compact Representation of Uncertainty in Clustering. Craig Greenberg, Nicholas Monath, Ari Kobren, Patrick Flaherty, Andrew McGregor, Andrew McCallum. Neural Information Processing Systems (NIPS), 2018.
- Embedded-State Latent Conditional Random Fields for Sequence Labeling. Dung Thai, Sree Harsha Ramesh, Shikhar Murty, Luke Vilnis and Andrew McCallum. Conference on Computational Natural Language Learning (CoNLL), 2018.
- Linguistically-Informed Self-Attention for Semantic Role Labeling. Emma Strubell, Patrick Verga, Daniel Andor, David Weiss and Andrew McCallum. Conference on Empirical Methods in Natural Language Processing (EMNLP, Best long paper award). Brussels, Belgium. October 2018.
- Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets. Nathan Greenberg, Trapit Bansal, Patrick Verga, and Andrew McCallum. Conference on Empirical Methods in Natural Language Processing (EMNLP short). Brussels, Belgium. October 2018.
- Efficient Graph-based Word Sense Induction by Distributional Inclusion Vector Embeddings. Haw-Shiuan Chang, Amol Agrawal, Ananya Ganesh, Anirudha Desai, Vinayak Mathur, Alfred Hough, and Andrew McCallum. TextGraphs-12: the Workshop on Graph-based Methods for Natural Language Processing (NAACL HLT WS), 2018.
- A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset. Michael Boratko, Harshit Padigela, Divyendra Mikkilineni, Pritish Yuvraj, Rajarshi Das, Andrew McCallum, Maria Chang, Achille Fokoue-Nkoutche, Pavan Kapanipathi, Nicholas Mattei, Ryan Musa, Kartik Talamadupula, Michael Witbrock. Association for Computational Linguistics Workshop on Machine Reading for Question Answering (ACL WS, Best paper award) 2018.
- Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL? Emma Strubell and Andrew McCallum. Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP (ACL WS). Melbourne, Australia. July 2018.
- Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures. Luke Vilnis*, Xiang Lorraine Li*, Shikhar Murty, Andrew McCallum. Annual Meeting of the Association for Computational Linguistics (ACL) 2018.
- Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking. Shikhar Murty*, Patrick Verga*, Luke Vilnis, Irena Radovanovic and Andrew McCallum. The 56th Annual Meeting of the Association for Computational Linguistics oral presentation (ACL) 2018.
- Learning Conditionally Calibrated Equations of State for Direct Fired sCO2 Cycles with Deep Neural Networks. Luke Vilnis, David Freed, Navid Rafati, Joe Camilo, Andrew McCallum. The 6th International Supercritical CO2 Power Cycles Symposium (sCO2), 2018.
- Training Structured Prediction Energy Networks with Indirect Supervision. Amirmohammad Rooshenas, Aishwarya Kamath, Andrew McCallum. In Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT NAACL) 2018.
- Go for a Walk and Arrive at the Answer: Reasoning Over Knowledge Bases with Reinforcement Learning. Rajarshi Das*, Shehzaad Dhuliawala*, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola and Andrew McCallum. International Conference on Learning Representations (ICLR) 2018.
- Simultaneously Self-attending to All Mentions for Full-Abstract Biological Relation Extraction. Patrick Verga, Emma Strubell and Andrew McCallum. Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT) 2018.
- Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection. Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum. Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT) 2018.
2017
- Go for a Walk and Arrive at the Answer: Reasoning Over Knowledge Bases with Reinforcement Learning. (Workshop Version, see also ICLR 2018 conference paper.) Rajarshi Das*, Shehzaad Dhuliawala*, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola and Andrew McCallum. Neural Information Processing Systems Workshop on Automated Knowledge Base Construction (AKBC NIPS WS, Best paper award) 2017.
- Finer Grained Entity Typing with TypeNet. Shikhar Murty, Patrick Verga, Luke Vilnis, and Andrew McCallum. 6th Workshop on Automated Knowledge Base Construction (AKBC NIPS WS) 2017.
- Automatically Extracting Action Graphs From Materials Science Synthesis Procedures. Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum and Elsa Olivetti. NIPS Workshop on Machine Learning for Molecules and Materials. Spotlight talk. (NIPS WS) 2017.
- Attending to All Mention Pairs for Full Abstract Biological Relation Extraction. Patrick Verga, Emma Strubell, Ofer Shai, and Andrew McCallum. 6th Workshop on Automated Knowledge Base Construction (AKBC NIPS WS) 2017.
- Materials synthesis insights from scientific literature via text extraction and machine learning. Edward Kim, Kevin Huang, Adam Saunders, Andrew McCallum, Gerbrand Ceder, Elsa Olivetti. Chemistry of Materials 29 (21), 9436-9444. 2017.
- Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples. Haw-Shiuan Chang, Erik Learned-Miller, Andrew McCallum. Neural Information Processing Systems (NIPS) 2017.
- Improved Representation Learning for Predicting Commonsense Ontologies. Xiang Lorraine Li, Luke Vilnis, Andrew McCallum. International Conference on Machine Learning Workshop on Deep Structured Prediction (ICML WS) 2017.
- Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling. Dung Thai, Shikhar Murty, Trapit Bansal, Luke Vilnis, David Belanger, Andrew McCallum. International Conference on Machine Learning Workshop on Deep Structured Prediction (ICML WS) 2017.
- Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding. Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum. ArXiv preprint (ArXiv) 2017.
- RelNet: End-to-end Modeling of Entities & Relations. Trapit Bansal, Arvind Neelakantan, Andrew McCallum. NIPS Workshop on Automated Knowledge Base Construction (NIPS AKBC WS) 2017.
- Dependency Parsing with Dilated Iterated Graph CNNs. Emma Strubell, Andrew McCallum. 2nd Workshop on Structured Prediction for Natural Language Processing (EMNLP WS) 2017.
- An Online Hierarchical Algorithm for Extreme Clustering. Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum. Proceedings of Knowledge Discovery and Data Mining, oral presentation (KDD oral) 2017.
- Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks. Rajarshi Das, Manzil Zaheer, Siva Reddy, Andrew McCallum. Association for Computational Linguistics, short paper (ACL short) 2017.
- Fast and Accurate Sequence Labeling with Iterated Dilated Convolutions. Emma Strubell, Patrick Verga, David Belanger, Andrew McCallum. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2017.
- SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications. Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum. (SemEval) 2017.
- End-to-End Learning for Structured Prediction Energy Networks. David Belanger, Bishan Yang, Andrew McCallum. International Conference on Machine Learning (ICML) 2017.
- Learning a Natural Language Interface with Neural Programmer. Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, Dario Amodei. Submitted to the International Conference on Learning Representations (ICLR), 2017.
- Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks. Rajarshi Das, Arvind Neelakantan, David Belanger, Andrew McCallum. European Chapter of the Association for Computational Linguistics (EACL), 2017.
- Generalizing to Unseen Entities and Entity Pairs with Row-less Universal Schema. Patrick Verga, Arvind Neelakantan, Andrew McCallum. European Chapter of the Association for Computational Linguistics (EACL), 2017.
2016
- Structured Prediction Energy Networks. David Belanger and Andrew McCallum. International Conference on Machine Learning (ICML), 2016.
- Multilingual Relation Extraction using Compositional Universal Schema. Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum. North American Chapter of the Association for Computational Linguistics (NAACL), 2016.
- Ask the GRU: Multi-task Learning for Deep Text Recommendations. Trapit Bansal, David Belanger, Andrew McCallum. Recommender Systems (RecSys), 2016.
- Call for Discussion: Building a New Standard Dataset for Relation Extraction Tasks. Teresa Martin and Fiete Botschen and Ajay Nagesh and Andrew McCallum. NAACL 2016 Workshop on Automated Knowledge Base Construction (AKBC), 2016.
- Incorporating Selectional Preferences in Multi-hop Relation Extraction. Rajarshi Das, Arvind Neelakantan, David Belanger, Andrew McCallum. NAACL 2016 Workshop on Automated Knowledge Base Construction (AKBC), 2016.
- Row-less Universal Schema. Patrick Verga and Andrew McCallum. NAACL Workshop on Automated Knowledge Base Construction (AKBC), 2016.
- Extracting Multilingual Relations under Limited Resources: TAC 2016 Cold-Start KB construction and Slot-Filling using Compositional Universal Schema. Haw-Shiuan Chang, Abdurrahman Munir, Ao Liu, Johnny Tian-Zheng Wei, Aaron Traylor, Ajay Nagesh, Nicholas Monath, Patrick Verga, Emma Strubell and Andrew McCallum. Text Analysis Conference, Knowledge Base Population (TAC/KBP), 2016.
2015
- Structured Prediction Energy Networks. David Belanger, Andrew McCallum. ArXiv pre-print, submitted to ICLR and rejected, 2015.
- Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches. Evgeniy Gabrilovich, Ramanathan Guha, Andrew McCallum, Kevin Murphy. AAAI Spring Symposium Series Technical Report, 2015.
- Multilingual Relation Extraction using Compositional Universal Schema. Pat Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum. ArXiv pre-print, submitted to ICLR, 2016.
- Word Representations via Gaussian Embedding. Luke Vilnis, Andrew McCallum. International Conference on Learning Representations (ICLR) oral presentation, 2015.
- Compositional Vector Space Models for Knowledge Base Inference. Arvind Neelakantan, Benjamin Roth, Andrew McCallum. AAAI Spring Symposium Series (AAAI-SS), 2015.
- Bethe Projections for Non-Local Inference. Luke Vilnis, David Belanger, Dan Sheldon, Andrew McCallum. Conference on Uncertainty in Artificial Intelligence (UAI) 2015.
- Learning Dynamic Feature Selection for Fast Sequential Prediction. Emma Strubell, Luke Vilnis, Kate Silverstein and Andrew McCallum. Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China. July 2015. Outstanding paper award.
- Compositional Vector Space Models for Knowledge Base Completion. Arvind Neelakantan, Benjamin Roth and Andrew McCallum. Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China. July 2015.
2014
- Training for Fast Sequential Prediction Using Dynamic Feature Selection. Emma Strubell, Luke Vilnis, and Andrew McCallum. NIPS Workshop on Modern Machine Learning and NLP (NIPS WS). Montreal, Quebec, Canada. December 2014.
- Knowledge Base Completion using Compositional Vector Space Models. Arvind Neelakantan, Benjamin Roth and Andrew McCallum. In 4th Workshop on Automated Knowledge Base Construction (AKBC) 2014 at NIPS. Outstanding Paper Award.
- Minimally Supervised Event Argument Extraction using Universal Schema. Benjamin Roth, Emma Strubell, Katherine Silverstein and Andrew McCallum. In 4th Workshop on Automated Knowledge Base Construction (AKBC) at NIPS, Montreal, Quebec, Canada. December 2014.
- Universal Schema for Slot-Filling, Cold-Start KBP and Event Argument Extraction: UMass IESL at TAC KBP 2014. Benjamin Roth, Emma Strubell, John Sullivan, Lakshmi Vikraman, Katherine Silverstein, and Andrew McCallum. Text Analysis Conference (Knowledge Base Population Track) '14 Workshop (TAC KBP). Gaithersburg, Maryland, USA. November 2014.
- Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space. Arvind Neelakantan, Jeevan Shankar, Alexandre Passos and Andrew McCallum. Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP), 2014.
- A Hierarchical Model for Universal Schema Relation Extraction. Arvind Neelakantan, Alexandre Passos, Andrew McCallum. Workshop on Automatic Creation and Curation of Knowledge Bases (WACCK) at SIGMOD, 2014.
- Message Passing for Soft Constraint Dual Decomposition. David Belanger, Alexandre Passos, Sebastian Riedel, Andrew McCallum. Uncertainty in Artificial Intelligence (UAI), 2014.
- Lexicon Infused Phrase Embeddings for Named Entity Resolution. Alexandre Passos, Vineet Kumar, Andrew McCallum. Conference on Computational Natural Language Learning (CoNLL), 2014.
- Learning Soft Linear Constraints with Application to Citation Field Extraction. Sam Anzaroot, Alexandre Passos, David Belanger, Andrew McCallum. Proceedings of the Association for Computational Linguistics (ACL), 2014.
2013
- Optimization and Learning in FACTORIE. Alexandre Passos, Luke Vilnis, Andrew McCallum. Neural Information Processing Systems Workshop on Optimization for Machine Learning (NIPS WS), 2013.
- Marginal Inference in MRFs using Frank-Wolfe. David Belanger, Dan Sheldon, Andrew McCallum. Neural Information Processing Systems Workshop on Greedy Optimization, Frank-Wolfe and Friends (NIPS WS), 2013.
- Anytime Belief Propagation Using Sparse Domains. Sameer Singh, Sebastian Riedel, Andrew McCallum. Neural Information Processing Systems Workshop on Resource-Efficient Machine Learning (NIPS WS), 2013.
- Universal Schema for Slot Filling and Cold Start: UMass IESL at TACKBP. Sameer Singh, David Belanger, Ari Kobren, Michael Wick, Alexandre Passos, Harshal Pandya, Jinho Choi, Brian Martin, Andrew McCallum. Text Analysis Conference (TAC), 2013.
- Universal Schema for Entity Type Prediction. Limin Yao, Sebastian Riedel, Andrew McCallum. Third International Workshop on Automated Knowledge Base Construction (AKBC), 2013.
- A Joint Model for Discovering and Linking Entities. Michael Wick, Sameer Singh, Harshal Pandya, Andrew McCallum. Third International Workshop on Automated Knowledge Base Construction (AKBC), 2013.
- Assessing Confidence of Knowledge Base Content with an Experimental Study in Entity Resolution. Michael Wick, Sameer Singh, Ari Kobren, Andrew McCallum. Third International Workshop on Automated Knowledge Base Construction (AKBC), 2013.
- Joint Inference of Entities, Relations, and Coreference. Sameer Singh, Sebastian Riedel, Brian Martin, Jiaping Zheng, Andrew McCallum. Third International Workshop on Automated Knowledge Base Construction (AKBC), 2013.
- Dynamic Knowledge Base Alignment for Coreference Resolution. Jiaping Zheng, Luke Vilnis, Sameer Singh, Jinho Choi, Andrew McCallum. Seventeenth Conference on Computational Natural Language Learning (CoNLL), 2013.
- Transition-based Dependency Parsing with Selectional Branching. Jinho D. Choi, Andrew McCallum, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), 2013.
- Open Scholarship and Peer Review: a Time for Experimentation. David Soergel, Adam Saunders, Andrew McCallum. ICML Workshop on Peer Reviewing and Publishing Models (PEER), 2013.
- A New Dataset for Fine-Grained Citation Field Extraction. Sam Anzaroot, Andrew McCallum. ICML Workshop on Peer Reviewing and Publishing Models (PEER), 2013.
- Large-scale Author Coreference via Hierarchical Entity Representations. Michael L Wick, Ari Kobren, Andrew McCallum. ICML Workshop on Peer Reviewing and Publishing Models (PEER), 2013.
- Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links to Wikipedia. Sameer Singh, Amar Subramanya, Fernando Pereira, Andrew McCallum. Technical Report (TR) UMASS-CS-2012-015, October, 2012.
- Relation Extraction with Matrix Factorization and Universal Schemas. Sebastian Riedel, Limin Yao, Benjamin M. Marlin and Andrew McCallum. Joint Human Language Technology Conference/Annual Meeting of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 2013.
- Latent Relation Representations for Universal Schemas. Sebastian Riedel, Limin Yao, Andrew McCallum. International Conference on Learning Representations (ICLR), 2013.
2012
- MAP Inference in Chains using Column Generation. David Belanger, Alexandre Passos, Sebastian Riedel, Andrew McCallum. Proceedings of Neural Information Processing Systems (NIPS), 2012.
- Probabilistic Databases of Universal Schema. Limin Yao, Sebastian Riedel and Andrew McCallum, NAACL Workshop on Automatic Knowledge Base Construction (AKBC), 2012.
- Human Machine Cooperation with Epistemological DBs: Supporting User Corrections to Automatically Constructed KBs. Michael Wick, Karl Schultz, and Andrew McCallum. NAACL Workshop on Automatic Knowledge Base Construction (AKBC) 2012. (Best paper runner-up)
- Monte Carlo MCMC: Efficient Inference by Sampling Factors. Sameer Singh, Michael Wick, and Andrew McCallum. NAACL Workshop on Automatic Knowledge Base Construction (AKBC) 2012.
- Monte Carlo MCMC: Efficient Inference by Approximate Sampling. Sameer Singh, Michael Wick, Andrew McCallum. Conference on Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP), 2012.
- Combining joint models for biomedical event extraction. David McClosky, Sebastian Riedel, Mihai Surdeanu, Andrew McCallum, Christopher Manning. BMC Bioinformatics, 2012.
- Speeding up MAP with Column Generation and Block Regularization. David Belanger, Alexandre Passos, Sebastian Riedel and Andrew McCallum. ICML Workshop on Inferning: Interactions between Inference and Learning (ICML WS), 2012.
- Parse, Price and Cut - Delayed Column and Row Generation for Graph Based Parsers. Sebastian Riedel, David A. Smith and Andrew McCallum. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2012.
- A Discriminative Hierarchical Model for Fast Coreference at Large Scale. Michael Wick, Sameer Singh, Andrew McCallum. Association for Computational Linguistics (ACL), 2012.
- Unsupervised Relation Discovery with Sense Disambiguation. Limin Yao, Sebastian Riedel and Andrew McCallum. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL), 2012.
- Topic Models for Taxonomies. Anton Bakalov, Andrew McCallum, Hanna Wallach and David Mimno. Proceedings of the Joint Conference on Digital Libraries (JCDL), 2012.
- Selecting Actions for Resource-bounded Information Extraction using Reinforcement Learning. Pallika Kanani, Andrew McCallum. Web Search and Data Mining (WSDM), 2012.
2011
- Correlations and anticorrelations in LDA inference. Alexandre Passos, Hanna Wallach, Andrew McCallum. Neural Information Processing Systems Workshop on Challenges in Learning Hierarchical Models: Transfer Learning and Optimization (NIPS WS), 2011.
- Inducing Value Sparsity for Parallel Inference in Tree-shaped Models. Sameer Singh, Brian Martin, Andrew McCallum. Neural Information Processing Systems Workshop on Computational Trade-offs in Statistical Learning (NIPS WS), 2011.
- Towards Asynchronous Distributed MCMC Inference for Large Graphical Models. Sameer Singh, Andrew McCallum. Neural Information Processing Systems Workshop on Algorithms, Systems, and Tools for Learning at Scale (NIPS WS), 2011.
- Query Aware McMC. Michael Wick and Andrew McCallum. Proceedings of Neural Information Processing Systems (NIPS), 2011.
- Toward Interactive Training and Evaluation. Greg Druck and Andrew McCallum. Conference on Information and Knowledge Management (CIKM), 2011.
- Model Combination for Event Extraction in BioNLP. Sebastian Riedel, David McClosky, Mihai Surdeanu, Christopher D. Manning and Andrew McCallum. Proceedings of the Natural Language Processing in Biomedicine NAACL 2011 Workshop (BioNLP), 2011.
- Robust Biomedical Event Extraction with Dual Decomposition and Minimal Domain Adaptation. Sebastian Riedel and Andrew McCallum. Proceedings of the Natural Language Processing in Biomedicine NAACL 2011 Workshop (BioNLP), 2011.
- Inter-Event Dependencies support Event Extraction from Biomedical Literature. Roman Klinger, Sebastian Riedel and Andrew McCallum. Mining Complex Entities from Network and Biomedical Data (MIND), Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2011.
- Structured Relation Discovery using Generative Models. Limin Yao, Aria Haghighi, Sebastian Riedel, Andrew McCallum. Empirical Methods in Natural Language Processing (EMNLP), 2011.
- Fast and Robust Joint Models for Biomedical Event Extraction. Sebastian Riedel, Andrew McCallum. Empirical Methods in Natural Language Processing (EMNLP), 2011.
- Optimizing Semantic Coherence in Topic Models. David Mimno, Hanna Wallach, Edmund Talley, Miriam Leenders, Andrew McCallum. Empirical Methods in Natural Language Processing (EMNLP), 2011.
- SampleRank: Training Factor Graphs with Atomic Gradients. Michael Wick, Khashayar Rohanimanesh, Kedar Bellare, Aron Culotta, Andrew McCallum. Proceedings of the International Conference on Machine Learning (ICML), 2011.
- Database of NIH grants using machine-learned categories and graphical clustering. Edmund M Talley, David Newman, David Mimno, Bruce W Herr II, Hanna M Wallach, Gully Burns, Miriam Leenders, Andrew McCallum. Nature Methods, 8, 443–444, 27 May 2011.
- Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models. Sameer Singh, Amarnag Subramanya, Fernando Pereira, Andrew McCallum. Association for Computational Linguistics: Human Language Technologies (ACL HLT), 2011.
2010
- An Introduction to Conditional Random Fields. Charles Sutton, Andrew McCallum. Foundations and Trends in Machine Learning (FnT ML), to appear.
- Distantly labeling data for large scale cross-document coreference. Sameer Singh, Michael Wick, Andrew McCallum. Technical report on arXiv (TR), 2010.
- Distributed MAP Inference for Undirected Graphical Models. Sameer Singh, Amarnag Subramanya, Fernando Pereira, Andrew McCallum. Neural Information Processing Systems Workshop on Learning on Cores, Clusters, and Clouds (NIPS WS), 2010.
- Machine Translation Using Overlapping Alignments and SampleRank. Benjamin Roth, Andrew McCallum, Marc Dymetman and Nicola Cancedda. Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA), 2010.
- High-Performance Semi-Supervised Learning using Discriminatively Constrained Generative Models. Gregory Druck, Andrew McCallum. International Conference on Machine Learning (ICML), 2010.
- Constraint-Driven Rank-Based Learning for Information Extraction. Sameer Singh, Limin Yao, Sebastian Riedel, Andrew McCallum. Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT), 2010.
- Collective Cross-Document Relation Extraction Without Labelled Data. Limin Yao, Sebastian Riedel, Andrew McCallum. Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2010.
- Modeling Relations and Their Mentions without Labeled Text. Sebastian Riedel, Limin Yao, Andrew McCallum. Proceedings of the European Conference on Machine Learning (ECML/PKDD), 2010.
- Resource-bounded Information Extraction: Acquiring Missing Feature Values On Demand. Pallika H. Kanani, Andrew McCallum, Shaohan Hu. Proceedings of the 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2010. (Best paper runner-up.)
- Scalable Probabilistic Databases with Factor Graphs and MCMC. Michael Wick, Andrew McCallum, Gerome Miklau. Proceedings of the International Conference on Very Large Databases (VLDB), 2010.
2009
- FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs. Andrew McCallum, Karl Schultz, Sameer Singh. Neural Information Processing Systems (NIPS), 2009.
- Rethinking LDA: Why Priors Matter. Hanna Wallach, David Mimno, Andrew McCallum. Neural Information Processing Systems (NIPS), 2009.
- Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference. Michael Wick, Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum. Neural Information Processing Systems (NIPS), 2009.
- SampleRank: Learning Preferences from Atomic Gradients. Michael Wick, Khashayar Rohanimanesh, Aron Culotta, Andrew McCallum. Neural Information Processing Systems Workshop on Advances in Ranking (NIPS WS), 2009.
- Bi-directional Joint Inference for Entity Resolution and Segmentation using Imperatively-Defined Factor Graphs. Sameer Singh, Karl Schultz, Andrew McCallum. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2009.
- Efficient Methods for Topic Model Inference on Streaming Document Collections. Limin Yao, David Mimno and Andrew McCallum. Conference on Knowledge Discovery and Data Mining (KDD), 2009, Paris, France.
- Generalized Expectation Criteria for Bootstrapping Extractors using Record-Text Alignment. Kedar Bellare and Andrew McCallum. Proceedings of Empirical Methods in Natural Language Processing (EMNLP), Singapore, 2009.
- Polylingual Topic Models. David Mimno, Hanna Wallach, Jason Naradowsky, David Smith and Andrew McCallum. Proceedings of the Empirical Methods in Natural Language Processing (EMNLP), Singapore, 2009.
- Active Learning by Labeling Features. Gregory Druck, Burr Settles, Andrew McCallum. Proceedings of the Empirical Methods in Natural Language Processing (EMNLP).
- Inference and Learning in Large Factor Graphs with Adaptive Proposal Distributions. Khashayar Rohanimanesh, Michael Wick, Andrew McCallum. University of Massachusetts Technical Report #UM-CS-2009-008 (TR), 2009.
- Advances in Learning and Inference for Partition-wise Models of Coreference Resolution. Michael Wick and Andrew McCallum. University of Massachusetts Technical Report #UM-CS-2009-028 (TR), 2009.
- Representing Uncertainty in Databases with Scalable Factor Graphs. Michael Wick, Masters Thesis/Synthesis. Readers: Andrew McCallum and Gerome Miklau. April 2009.
- An Entity Based Model for Coreference Resolution. Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum. Proceedings of the SIAM International Conference on Data Mining (SDM), Reno, Nevada, 2009.
- Alternating Projections for Learning with Expectation Constraints. Kedar Bellare, Gregory Druck and Andrew McCallum. Uncertainty in Artificial Intelligence (UAI), 2009.
- Semi-supervised Learning of Dependency Parsers using Generalized Expectation Criteria. Gregory Druck, Gideon Mann, Andrew McCallum. Proceedings of the Association for Computational Linguistics (ACL).
- Towards Theoretical Bounds for Resource-bounded Information Gathering for Correlation Clustering. Pallika Kanani, Andrew McCallum, Ramesh Sitaraman. UMass TechReport UM-CS-2009-027 (TR), 2009.
- Generalized Expectation Criteria with application to Semi-Supervised Classification and Sequence Modeling. Gideon Mann and Andrew McCallum. Journal of Machine Learning Research (JMLR). To appear.
2008
- Reinforcement Learning for MAP Inference in Large Factor Graphs. Khashayar Rohanimanesh, Michael Wick, Sameer Singh, and Andrew McCallum. University of Massachusetts Technical Report #UM-CS-2008-040 (TR), 2008.
- Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors. David Mimno, Hanna Wallach and Andrew McCallum. NIPS Workshop on Analyzing Graphs (NIPS WS), 2008, Whistler, BC.
- FACTORIE: Efficient Probabilistic Programming for Relational Factor Graphs via Imperative Declarations of Structure, Inference and Learning. Andrew McCallum, Khashayar Rohanimanesh, Michael Wick, Karl Schultz, Sameer Singh. NIPS Workshop on Probabilistic Programming (NIPS WS), 2008. (Discriminatively trained undirected graphical models, or conditional random fields, have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. Although there has been much historic interest in the combination of logic and probability, we argue that in this mixture 'logic' is largely a red herring. The power in relational models is in their repeated structure and tied parameters; and logic is not necessarily the best way to define these structures. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an object-oriented imperative language to express various aspects of model structure, inference and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we allow the user to mix declarative and procedural domain knowledge, and also gain significant efficiencies. We have implemented our ideas in a system we call FACTORIE, a software library for an object-oriented, strongly-typed, functional JVM language named Scala.)
- A Discriminative Approach to Ontology Alignment. Michael Wick, Khashayar Rohanimanesh, Andrew McCallum, and AnHai Doan. In the International Workshop on New Trends in Information Integration (NTII) at the conference for Very Large Databases (VLDB WS), Auckland, New Zealand, 2008. (New state-of-the-art results on ontology alignment using graph-shaped conditional random fields, joint inference, and parameter estimation by Rank-Based Training.)
- A Unified Approach for
Schema Matching, Coreference, and Canonicalization. Michael Wick,
Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum. In Conference on
Knowledge Discovery and Data Mining (KDD). 2008. (Information
integration, performing joint inference over schema matching, entity
resolution and canonicalization, using conditional random fields,
features encoding clauses in first-order logic, and efficient inference
by Metropolis-Hastings. Positive experimental results on multiple data
sets.)
- Unsupervised Deduplication
using Cross-field Dependencies. Robert Hall, Charles Sutton, Andrew
McCallum. In Conference on Knowledge Discovery and Data Mining (KDD).
2008. (Hierarchical
Dirichlet process model that jointly clusters citation venue strings
based on both string-edit distance and title information.)
- Bayesian Modeling of
Dependency Trees Using Hierarchical Pitman-Yor Priors.
Hanna Wallach, Charles Sutton, Andrew McCallum. In International
Conference on Machine Learning, Workshop on Prior Knowledge for Text
and Language Processing. (ICML WS), 2008. (Two
Bayesian dependency parsing models: 1. Model with Pitman-Yor prior that
significantly improves Eisner's classic model; 2. Latent-variable model
that learns "syntactic" topics.)
- Learning from Labeled
Features using Generalized Expectation Criteria. Gregory Druck,
Gideon Mann and Andrew McCallum. Proceedings of ACM Special Interest
Group on Information Retrieval, (SIGIR), 2008. (Learn
classifiers by labeling features rather than instances. Extensive
evaluation on many text data sets, showing substantial improvement over
other methods of semi-supervised learning.)
- Learning to Predict
the Quality of Contributions to Wikipedia. Gregory Druck, Gerome
Miklau and Andrew McCallum. AAAI Workshop on Wikipedia and AI, (AAAI
WS), 2008. (Predict
the longevity of an edit to Wikipedia, using textual features of the
edit as well as features of the editor. Could be part of a tool to
prioritize verification of changes to Wikipedia.)
- Topic Models Conditioned on
Arbitrary Features with Dirichlet-multinomial Regression. David
Mimno and Andrew McCallum. (Plenary presentation.) Conference on
Uncertainty in Artificial Intelligence, (UAI), 2008. (Text
documents are usually accompanied by metadata, such as the authors, the
publication venue, the date, and any references. Work in topic modeling
that has taken such information into account, such as Author-Topic,
Citation-Topic, and Topic-over-Time models, has generally focused on
constructing specific models that are suited only for one particular
type of metadata. This paper presents a simple, unified model for
learning topics from documents given arbitrary non-textual features,
which can be discrete, categorical, or continuous.)
- Generalized Expectation
Criteria for Semi-Supervised Learning of Conditional Random Fields.
Gideon Mann and Andrew McCallum. Proceedings of Association of
Computational Linguistics, (ACL), 2008. (Generalized expectation for semi-supervised learning
of linear-chain conditional random fields.)
- Piecewise Training for
Structured Prediction. Charles Sutton and Andrew McCallum. Accepted
to the Machine Learning Journal, (MLJ), 2008. (Efficiently train CRFs in parts. It works well even
though full joint inference is used at test time.)
- Pachinko Allocation:
Scalable Mixture Models of Topic Correlations. Wei Li and Andrew
McCallum. Submitted to the Journal of Machine Learning Research, (JMLR),
2008. (The
pachinko allocation model represents nested correlations among topics
using a DAG. This paper's work is in efficiently fitting these
models (as well as plain old LDA) by creating and leveraging sparsity
in the distribution over topics to be sampled for each document.)
2007
- Unsupervised
Coreference of Publication Venues. Robert Hall, Charles Sutton
and Andrew McCallum. University of Massachusetts Amherst Technical
Report, (TR), 2007. (A
generative non-parametric mixture model for entity resolution of
publication venues that leverages both the venue titles as well as
distributions over words in paper titles.)
- Generalized Expectation
Criteria. Andrew McCallum, Gideon Mann and Gregory Druck.
University of Massachusetts Amherst Technical Report #2007-60, (TR),
2007. (This note introduces and motivates Generalized
Expectation
(GE) criteria. GE criteria are terms in a parameter-estimation
objective function that express preferences about model expectations.
In certain simple cases, GE falls into the same equivalence class as
moment matching, maximum likelihood and maximum entropy estimation.
However, our work focuses on leveraging GE's special flexibility in
three non-traditional ways: (1) GE criteria can be specified independently
of the model parameterization. In factor graphs, we break the
traditional one-to-one mapping between (a) subsets of variables
participating in parametered model factors and (b) subsets of variables
over which the objective function's expectations are calculated. (2)
Within the same objective function, multiple GE terms that are
conditional expectations can be conditioned on multiple different data
sets. This is useful for semi-supervised learning and transfer
learning. (3) A target expectation (or more generally the expectation
preference function) can come from any source, including other tasks or
human domain knowledge. GE is the successor to Expectation
Regularization, which is described in our ICML 2007 paper below.)
- Reducing Annotation
Effort using Generalized Expectation Criteria--DRAFT. Gregory
Druck, Gideon Mann and Andrew McCallum. University of Massachusetts
Amherst Technical Report #2007-62, (TR), 2007. (A version of Generalized Expectation (GE) in which the
supervision is provided by labeling features instead of instances.
Dramatically faster wall-clock labeling to achieve high accuracy.
Experiments on document classification.)
- Community-based Link
Prediction with Text.
David Mimno, Hanna M. Wallach and Andrew McCallum. In Proceedings of
the NIPS 2007 Workshop on Statistical Network Modeling (NIPS WS), 2007.
(New state-of-the-art results in
link-prediction
using a latent-variable topic model, in which "community" variables are
associated with topic distributions and author distributions. Thus the
model combines the use of language/topics and co-authorships to
discover communities.)
- Leveraging
Existing Resources using Generalized Expectation Criteria. Gregory
Druck, Gideon Mann and Andrew McCallum. NIPS Workshop on Learning
Problem Design, (NIPS WS), 2007. (Generalized
Expectation applied in situations in which there is no labeled data.
All supervision is obtained from existing auxiliary resources such as
lexicons. Experiments on information extraction.)
- Lightly-Supervised
Attribute Extraction for Web Search.
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando
Pereira, Mark Liberman, Andrew McCallum and Mark Dredze. NIPS Workshop
on Machine Learning for Web Search, (NIPS WS), 2007. (Extract
a large number of attributes of different entities from natural
language text. Methods based on co-training and maximum entropy
classifiers.)
- People-LDA:
Anchoring Topics to People Using Face Recognition. Vidit Jain, Erik
Learned-Miller, and Andrew McCallum. International Conference on
Computer Vision (ICCV), 2007. (Jointly
model people's identity, face appearance in an image, and surrounding
text in the image captions with an LDA-style topic model. Improved
results in identifying coherent sets of person "mentions"---that is,
improved co-reference by using both text and image features.)
- Joint Group and Topic
Discovery from Relations and Text.
Andrew McCallum, Xuerui Wang and Natasha Mohanty, Statistical Network
Analysis: Models, Issues and New Directions, Lecture Notes in Computer
Science 4503, pp. 28-44, (Book chapter), 2007. (Book
chapter version of NIPS 2006 conference paper. Social network analysis
that simultaneously discovers groups of entities and also clusters
attributes of their relations, such that clustering in each dimension
informs the other. Applied to the voting records and corresponding
text of resolutions from the U.S. Senate and the U.N., showing that
incorporating the votes results in more salient topic clusters, and
that different groupings of legislators emerge from different topics.)
- Topical N-grams: Phrase
and Topic Discovery, with an Application to Information Retrieval.
Xuerui Wang, Andrew McCallum and Xing Wei, Proceedings of the 7th IEEE
International Conference on Data Mining (ICDM), 2007. (A topic model in the LDA style that uses a Markov
model to automatically discover topically-relevant arbitrary-length phrases,
not just lists of single words. The phrase discovery is not simply a
post-processing step, but an intrinsic part of the model that helps it
discover better topics. Experiments on document retrieval tasks.)
- Canonicalization of
Database Records using Adaptive Similarity Measures.
Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli and Andrew
McCallum. Conference on Knowledge Discovery and Data Mining (KDD),
2007. (Defines
and explores the problem of "canonicalization"---selecting the best
field values for a single, standard record formed from a set of
consolidated, co-resolved information sources, such as arise from
merging databases, or combining multiple sources of information
extraction.)
- Generalized Component
Analysis for Text with Heterogeneous Attributes. Xuerui Wang, Chris
Pal and Andrew McCallum. Conference on Knowledge Discovery and Data
Mining (KDD), 2007. (A topic
model based on an undirected graphical model, which makes it easier to
incorporate multiple modalities.)
- Semi-Supervised
Classification with Hybrid Generative/Discriminative Methods.
Greg Druck, Chris Pal, Xiaojin Zhu and Andrew McCallum. Conference on
Knowledge Discovery and Data Mining (KDD), 2007. (Leverage
unlabeled data for text classification by using an objective function
that combines (1) joint probability of labels and words and (2)
conditional probability of labels given words.)
- Expertise Modeling for
Matching Papers with Reviewers. David Mimno and Andrew McCallum.
Conference on Knowledge Discovery and Data Mining (KDD),
2007. (The
Author-Persona-Topic model is an LDA-style topic model especially
designed to represent expertise as a mixture of topical intersections.
We show positive results in matching reviewers to conference papers, as
assessed by human judgements.)
- Learning Extractors
from Unlabeled Text using Relevant Databases. Kedar Bellare and
Andrew McCallum. Sixth International Workshop on Information
Integration on the Web (IIWeb), collocated with AAAI,
2007. (Use
conditional random fields to learn information extractors both from DB
fields and from alignments of DB in free text. Uses an Alignment CRF,
similar to our UAI 2005 paper.)
- Efficient Strategies
for Improving Partitioning-Based Author Coreference by Incorporating
Web Pages as Graph Nodes. Pallika Kanani and Andrew McCallum. Sixth
International Workshop on Information Integration on the Web (IIWeb),
collocated with AAAI, 2007. (Improve
entity resolution by adding web pages as new "mentions" to the
graph-partitioning problem, and do so efficiently by selecting a subset
of the possible queries and a subset of the returned pages.)
- Probabilistic
Representations for Integrating Unreliable Data Sources. David
Mimno and Andrew McCallum. Sixth International Workshop on Information
Integration on the Web (IIWeb), collocated with AAAI,
2007. (Probabilistic representation of field
values used in merging and augmenting information from DBLP and
research paper PDFs.)
- Author Disambiguation
using Error-Driven Machine Learning With a Ranking Loss Function.
Aron Culotta, Pallika Kanani, Robert Hall, Michael Wick, and Andrew
McCallum. Sixth International Workshop on Information Integration on
the Web (IIWeb), collocated with AAAI, 2007. (Entity
resolution of people using high-order features, made efficient with
Metropolis-Hastings and SampleRank, a learning method based on ranking.)
- Nonparametric Bayes
Pachinko Allocation. Wei Li, David Blei and Andrew McCallum.
Conference on Uncertainty in Artificial Intelligence (UAI),
2007. (A
version of pachinko allocation that automatically determines the number
of topics (and super-topics), and its sparse connectivity structure by
Dirichlet process priors. Positive results in rediscovering known
structure in synthetic data, and in held-out likelihood versus PAM,
hLDA and HDP.)
- Improved Dynamic Schedules
for Belief Propagation. Charles Sutton and Andrew McCallum.
Conference on Uncertainty in Artificial Intelligence (UAI),
2007. (Significantly
faster inference in graphical models by selecting which BP messages to
send based on an approximation to their residual.)
- Simple, Robust, Scalable
Semi-supervised Learning via Expectation Regularization. Gideon
Mann and Andrew McCallum. International Conference on Machine Learning (ICML),
2007. (Semi-supervised
learning is seldom used in real applications because it is often
complicated to implement, fragile in tuning or inefficient for large
data. We introduce a new highly usable approach to semi-supervised
learning, augmenting traditional label log-likelihood with an
additional term that encourages model predictions on unlabeled data to
match certain expectations. Positive results on 5 data sets versus EM,
transductive SVM, entropy regularization and a graph-based method.)
- Piecewise
Pseudolikelihood for Efficient Training of Conditional Random Fields.
Charles Sutton and Andrew McCallum. ICML, 2007. (Train
a large CRF five times faster by dividing it into separate pieces
and reducing numbers of predicted variable combinations with
pseudolikelihood. Analysis in terms of belief propagation and Bethe
energy.)
- Mixtures of Hierarchical
Topics with Pachinko Allocation. David Mimno, Wei Li and Andrew
McCallum. ICML, 2007. (From
a large document collection automatically discover topic hierarchies,
where documents may be flexibly represented as mixtures across multiple
leaves, not just mixtures up and down a single leaf-root path. Thus,
for example, we can represent a document about instructing a robot
in natural language,
where those two topics are leaves. This new model, hPAM, combines the
best of pachinko allocation (PAM) and hierarchical LDA (hLDA). Dramatic
improvements in held-out data likelihood and mutual information between
discovered topics and human-assigned categories.)
- Transfer Learning for
Enhancing Information Flow in Organizations and Social Networks.
Chris Pal, Xuerui Wang and Andrew McCallum. Submitted to Conference on
Email and Spam (CEAS), 2007. Technical Note. (Continuous
hidden variable conditional random field for CC prediction/suggestion in
email.)
- Topic and Role Discovery in
Social Networks with Experiments on Enron and Academic Email.
Andrew McCallum, Xuerui Wang and Andres Corrada-Emmanuel. Journal of
Artificial Intelligence Research (JAIR), 2007. (Journal paper version of IJCAI conference paper on
Author-Recipient-Topic (ART) model.)
- Efficient
Computation of Entropy Gradient for Semi-Supervised Conditional Random
Fields. Gideon Mann and Andrew McCallum. NAACL/HLT,
(short paper) 2007. (A new, faster dynamic
program for calculating the entropy of a finite-state subsequence and
its gradient.)
- First-Order
Probabilistic Models for Coreference Resolution. Aron Culotta,
Michael Wick, Robert Hall and Andrew McCallum. NAACL/HLT,
2007. (Traditional
coreference uses features only over pairs of mentions. Here we present
a conditional random field with first-order logic for expressing
features, enabling features over sets of mentions. The result
is a new state-of-the-art result on ACE 2004 coref, jumping from 69 to
79---a 45% reduction in error. The advance depends crucially on a new
method of parameter estimation for such "weighted logic" models based
on learning rankings and error-driven training.)
- Sparse Message Passing
Algorithms for Weighted Maximum Satisfiability. Aron Culotta,
Andrew McCallum, Bart Selman, Ashish Sabharwal. New England Student
Symposium on Artificial Intelligence (NESCAI), 2007. (A
new algorithm for solving weighted maximum satisfiability (WMAX-SAT)
problems that divides a large problem into sub-problems, and
coordinates the global solution by message passing with sparse
messages. Inspired by the desire to do joint-inference in (a) large
weighted logics à la Markov Logic Networks, (b) large NLP pipelines, in
which there are efficient pre-existing (dynamic programming) solutions
to sub-parts of the pipeline. Positive results versus WalkSAT!)
- Cryptogram Decoding for
OCR using Numerization Strings. Gary Huang, Erik Learned-Miller and
Andrew McCallum. ICDAR, 2007. (Robust
OCR without font appearance models by incorporating language modeling.)
- Penn/UMass/CHOP
BiocreativeII Systems.
Kuzman Ganchev, Koby Crammer, Fernando Pereira, Gideon Mann, Kedar
Bellare, Andrew McCallum, Steven Carroll, Yang Jin, and Peter White. BiocreativeII
Evaluation Workshop. 2007. (Description of our
high-ranking entry in the competition for extraction and linkage from
bioinformatics text.)
- Resource-bounded
Information Gathering for Correlation Clustering. Pallika Kanani and
Andrew McCallum. Conference on Computational Learning Theory (COLT)
Open Problems Track, 2007. (We
present a new class of problems in which the goal is to perform
correlational clustering under circumstances in which accuracy can be
improved by augmenting the given graph with additional information.)
- Organizing the OCA:
Learning faceted subjects from a library of digital books. David
Mimno and Andrew McCallum. Joint Conference on Digital Libraries (JCDL),
2007. (Introduces
the DCM-LDA topic model, which represents topics by a
Dirichlet-compound-multinomial rather than a multinomial. In addition
to obtaining interesting information about the different variances of
the topics, this model lends itself to efficient parallelization with
very coarse-grained synchronization. The result is a topic model that
can run on over 1 billion words in just a few hours.)
- Mining a digital
library for influential authors. David Mimno and Andrew McCallum.
Joint Conference on Digital Libraries (JCDL), 2007. (A
probabilistic model that ranks authors based on their influence on
particular areas of scientific research. Integrates topics with
citation patterns.)
- Improving Author
Coreference by Resource-bounded Information Gathering from the Web.
Pallika Kanani, Andrew McCallum and Chris Pal. International Joint
Conference on Artificial Intelligence (IJCAI), 2007. (Sometimes
there is simply insufficient information to make an accurate entity
resolution decision, and we must gather additional evidence. This paper
describes the use of web queries to improve research paper author
coreference, exploring two methods of augmenting a graph partitioning
problem: using the web to obtain new features on existing edges, and
using the web to obtain new nodes in the graph. We then go on to describe
decision-theoretic approaches for maximizing accuracy gain with a
limited budget of web queries, and demonstrate our methods on three
large data sets.)
- Dynamic Conditional
Random Fields. Charles Sutton, Andrew McCallum and Khashayar
Rohanimanesh. Journal of Machine Learning Research (JMLR),
Vol. 8(Mar), pages 693-723, 2007. (Journal paper
version of ICML paper by the same authors, with new experiments on
marginal likelihood training.)
2006
- On Discriminative
and Semi-Supervised Dimensionality Reduction.
Chris Pal, Michael Kelm, Xuerui Wang, Greg Druck and Andrew McCallum.
Advances in Neural Information Processing Systems, Workshop on Novel
Applications of Dimensionality Reduction, (NIPS Workshop),
2006. (Using
Multi-Conditional Learning, learn to distribute mixture components just
where needed to address some discriminative task. See compelling figure
on synthetic overlapping spiral data.)
- Learning Field
Compatibilities to Extract Database Records from Unstructured Text.
Michael Wick, Aron Culotta and Andrew McCallum. Empirical Methods in
Natural Language Processing (EMNLP), 2006. (Record extraction, jointly accounting for multi-field
compatibility by content and layout features.)
- Tractable Learning
and Inference with Higher-Order Representations. Aron Culotta and
Andrew McCallum. ICML Workshop on Open Problems in
Statistical Relational Learning, 2006. (When
working with CRFs having features based on first-order logic, the
"unrolled" graphical model would be far too large to fully instantiate.
This paper describes a method leveraging MCMC to perform inference and
learning while only partially instantiating the model. Positive results
on entity resolution (of research paper authors) are described.)
- Corrective Feedback
and Persistent Learning for Information Extraction. Aron Culotta,
Trausti Kristjansson, Andrew McCallum, Paul Viola. Artificial
Intelligence Journal (AIJ), volume 170, pages
1101-1122, 2006. (Help
a user interactively correct the results of extraction by providing
uncertainty cues in the UI, and by using constrained Viterbi to
automatically make additional corrections after the first human
correction. Journal paper version of AAAI paper by the same authors
below. Adds experiments with active learning.)
- CC Prediction with
Graphical Models. Chris Pal and Andrew McCallum. Conference on
Email and Anti-Spam (CEAS), 2006. (Help
keep an organization coordinated by suggesting who to carbon-copy on
your outgoing email message.)
- Practical Markov
Logic Containing First-order Quantifiers with Application to Identity
Uncertainty. Aron Culotta, Andrew McCallum. HLT Workshop
on Computationally Hard Problems and Joint Inference in Speech and
Language Processing, 2006. (Markov
Logic Networks are Conditional Random Fields that use first-order logic
to define features and parameter tying patterns. Making such models
scale to non-trivial data set sizes is a challenge because the size of
the full instantiation of the model is exponential in the arity of the
formulae. Here we describe a method of partial instantiation that
allows such models to scale to entity resolution problems with millions of
entity mentions. On both citation and author entity resolution problems
we show that including such first-order features provides increases in
accuracy.)
- A Continuous-Time
Model of Topic Co-occurrence Trends. Xuerui Wang, Wei Li, and
Andrew McCallum. AAAI Workshop on Event Detection,
2006. (Capture
the time distributions not only of topics, but also of their
co-occurrences. For example, notice that while NLP and ML have both
been around for a long time, their co-occurrence has been rising
recently. The model is effectively a combination of the Pachinko
Allocation Model (PAM) and Topics-Over-Time (TOT).)
- Combining Generative
and Discriminative Methods for Pixel Classification with
Multi-Conditional Learning. Michael Kelm, Chris Pal, and Andrew
McCallum. Draft accepted to the International Conference on Pattern
Recognition (ICPR), 2006. (Multi-conditional
learning explored in the context of computer vision.)
- Multi-Conditional Learning:
Generative/Discriminative Training for Clustering and Classification.
Andrew McCallum, Chris Pal, Greg Druck, Xuerui Wang. AAAI,
2006. (Estimate
parameters of an undirected graphical model not by joint likelihood, or
conditional likelihood, but by a product of multiple conditional
likelihoods. Can act as an improved regularizer. With latent variables,
can cluster structured, relational data, like Latent Dirichlet
Allocation and its successors, but with undirected graphical models and
(cross-cutting) conditional-training. Improved results on document
classification, Jebara-inspired synthetic data, and over the Harmonium
as tested on an information retrieval task.)
- Pachinko Allocation:
DAG-structured Mixture Models of Topic Correlations. Wei Li, and
Andrew McCallum. ICML, 2006. (An
LDA-style topic model that captures correlations between topics,
enabling discovery of finer-grained topics. Similar motivations to Blei
and Lafferty's Correlated Topic Model (CTM), but uses a DAG to capture
arbitrary, nested and possibly sparse correlations among topics.
Interior nodes of the DAG have a Dirichlet distribution over their
children; words are in the leaves. Provides improved interpretability
and held-out data likelihood.)
- Topics over Time: A
Non-Markov Continuous-Time Model of Topical Trends. Xuerui Wang and
Andrew McCallum. Conference on Knowledge Discovery and Data Mining (KDD)
2006. (A
new LDA-style topic model that models trends over time. The meaning of
a topic remains fixed and reliable, but its prevalence over time is
captured, and topics may thus focus in on co-occurrence patterns that
are time-sensitive. Unlike other work that relies on Markov assumptions
or discretization of time, here each topic is associated with a
continuous distribution over timestamps. Improvements in topic saliency
and the ability to predict time given words.)
- Exploring the
Use of Conditional Random Field Models and HMMs for Historical
Handwritten Document Recognition. Shaolei L. Feng, R. Manmatha and
Andrew McCallum. IEEE International Conference on Document Image
Analysis for Libraries (DIAL 06), pp. 30-37. 2006. (Mixed results on CRFs applied to handwritten word
recognition.)
- Reducing Weight
Undertraining in Structured Discriminative Learning. Charles
Sutton, Michael Sindelar, and Andrew McCallum. HLT-NAACL,
2006. (Train
CRFs separately with different subsets of the features, then integrate
them at test time---four different variations on the method. Especially
make more reliable use of lexicon features and other highly-predictable
but brittle features.)
- Integrating
Probabilistic Extraction Models and Relational Data Mining to Discover
Relations and Patterns in Text. Aron Culotta, Andrew McCallum and
Jonathan Betz. HLT-NAACL, 2006. (Extract
relations from Wikipedia articles. Run data mining on the relational
graph to obtain patterns that are predictive of relations---such as
"opponent of my opponent is my ally" and "a person is likely to have
the same religion as their parents." Then use features derived from
these patterns in a second run of extraction that improves accuracy.)
- Bibliometric Impact
Measures Leveraging Topic Analysis. Gideon Mann, David Mimno and
Andrew McCallum. Joint Conference on Digital Libraries (JCDL)
2006. (Use
a new topic model that leverages n-grams to discover interpretable,
fine-grained topics in over a million research papers. Use these topic
divisions as well as automated citation analysis to extend three
existing bibliometric impact measures, and create three new ones:
Topical Diversity, Topical Transfer, Topical Precedence.)
- An Introduction to
Conditional Random Fields for Relational Learning. Charles Sutton
and Andrew McCallum. Book chapter in Introduction
to Statistical Relational Learning. Edited by Lise Getoor and Ben
Taskar. MIT Press. 2006. (An
overview and introduction to conditional random fields for beginners
and experts alike---motivation, background, mathematical foundations,
linear-chain form, general-structure form, inference, parameter
estimation, tips and tricks, an example application to information
extraction with a skip-chain structure.)
- Sparse Forward-Backward
using Minimum Divergence Beams for Fast Training of Conditional Random
Fields. Chris Pal, Charles Sutton, and Andrew McCallum. In
International Conference on Acoustics, Speech, and Signal Processing (ICASSP),
2006.
(An alternative method for beam-search based on variational principles.
Enables not only faster test-time performance of large-state-space
CRFs, but this method makes beam search robust enough to be used at
training time, enabling dramatically faster learning of discriminative
finite-state methods for speech, IE and other applications.)
- Table extraction
for answer retrieval. Xing Wei, Bruce Croft and Andrew McCallum.
Information Retrieval Journal (IRJ), volume 9, issue
5, pages 589-611, November 2006. (Information
extraction from tables, using conditional random fields with language
and layout features, with application to question answering. Journal
paper version of our SIGIR 2003 paper.)
- Semi-supervised Text
Classification Using EM. Kamal Nigam, Andrew McCallum and Tom
Mitchell. Book chapter in Chapelle, O., Zien, A., and
Scholkopf, B. (Eds.) Semi-Supervised Learning. MIT Press:
Boston. 2006.
(Overview, description, experiments on using expectation maximization
with naive Bayes text classifiers for learning from labeled and
unlabeled data. A chapter in a book about various methods of
semi-supervised learning.)
- Group and Topic Discovery
from Relations and Their Attributes. Xuerui Wang, Natasha Mohanty
and Andrew McCallum. Neural Information Processing Systems (NIPS),
2006.
(Social network analysis that simultaneously discovers groups of
entities and also clusters attributes of their relations, such that
clustering in each dimension informs the other. Applied to the voting
records and corresponding text of resolutions from the U.S. Senate and
the U.N., showing that incorporating the votes results in more salient
topic clusters, and that different groupings of legislators emerge from
different topics.)
2005
- A Note on Topical N-grams.
Xuerui Wang and Andrew McCallum. University of Massachusetts Technical
Report UM-CS-2005-071, 2005. (Discover topics
like Latent Dirichlet Allocation, but model phrases
in addition to single words on a per-topic basis. For example, in the
Politics topic, "white house" has special meaning as a collocation,
while in the RealEstate topic, modeling the individual words is
sufficient. Our TNG model produces much cleaner, more interpretable
topics.)
- Pachinko allocation: A Directed Acyclic Graph for Topic
Correlations. Wei Li and Andrew McCallum. NIPS Workshop on
Nonparametric Bayesian Methods, 2005. (Similar
motivations to Blei and Lafferty's Correlated Topic Model (CTM), but
uses a DAG to capture arbitrary and possibly sparse correlations among
topics. Interior nodes of the DAG have a Dirichlet distribution over
their children; words are in the leaves. Provides improved
interpretability and classification, as well as improved held-out
likelihood over CTM. See ICML 2006 paper above.)
- Direct Maximization
of Rank-Based Metrics for Information Retrieval. Don Metzler, W.
Bruce Croft and Andrew McCallum. CIIR Technical Report IR-429, 2005.
- Information Extraction:
Distilling Structured Data from Unstructured Text. Andrew
McCallum. ACM Queue, volume 3, Number 9, November 2005. (An
overview of information extraction by machine learning methods, written
for people not familiar with machine learning, especially CTOs and
other people in business.)
- Learning Clusterwise
Similarity with First-order Features. Aron Culotta and Andrew
McCallum. NIPS Workshop on the Theoretical Foundations of Clustering.
2005. (Discriminatively-trained
graph-partitioning methods for clustering, with features over entire
clusters, including existential and universal quantifiers. Efficiently
instantiate these features only on demand.)
- Composition of
Conditional Random Fields for Transfer Learning.
Charles Sutton and Andrew McCallum. Proceedings of Human Language
Technologies / Empirical Methods in Natural Language Processing
(HLT/EMNLP) 2005. (Improve information extraction
from email data by using the output of another extractor that was
trained on large quantities of newswire. Improve accuracy further by
using joint inference between the two tasks---so that the final target
task can actually affect the output of the intermediate task.)
- Feature Bagging: Preventing
Weight Undertraining in Structured Discriminative Learning.
Charles Sutton, Michael Sindelar, and Andrew McCallum. Center for
Intelligent Information Retrieval, University of Massachusetts
Technical Report IR-402. 2005. (Avoid a common
under-appreciated problem: overly heavy reliance on a few
discriminative features which may not be as reliably present in the
testing data. Discusses four methods of separate training and
combination, and presents statistically-significant
improvements---including new best results on CoNLL-2000 NP Chunking.)
- Fast, Piecewise Training
for Discriminative Finite-state and Parsing Models. Charles Sutton
and Andrew McCallum. Center for Intelligent Information Retrieval
Technical Report IR-403. 2005. (Further results
with "piecewise training", a method also described in a UAI'05 paper
below.)
- Practical Markov
Logic Containing First-order Quantifiers with Application to Identity
Uncertainty. Aron Culotta and Andrew McCallum. Technical Report
IR-430, University of Massachusetts, September 2005. (Use
existential and universal quantifiers in Markov Logic, doing so
practically and efficiently by incrementally instantiating these terms
as needed. Applied to object correspondence, this model combines the
expressivity of BLOG with the predictive accuracy advantages of
conditional probability training. Experiments on citation matching and
author disambiguation.)
- Joint Deduplication of
Multiple Record Types in Relational Data. Aron Culotta and Andrew
McCallum. Fourteenth Conference on Information and Knowledge Management
(CIKM), 2005.
(Longer Tech Report version: A Conditional Model of
Deduplication for Multi-type Relational Data. Technical Report
IR-443, University of Massachusetts, September 2005.) (Leverage
relations among multiple entity types to perform coreference
collectively among all types. Uses CRF-style graph partitioning with a
learned distance metric. Experimental results on joint coreference of
both citations and their venues showing that accuracy on both improves.)
- Collective
Multi-Label Classification. Nadia Ghamrawi and Andrew McCallum.
Fourteenth Conference on Information and Knowledge Management (CIKM),
2005. (Multi-label
document classification with a conditional maximum entropy model that
captures not only the traditional dependences between words and the
class labels, but also the co-occurrence dependencies between
the class labels. Performs joint inference among all class labels.)
- Predictive
Random Fields: Latent Variable Models Fit by Multiway Conditional
Probability with Applications to Document Analysis. Andrew
McCallum, Xuerui Wang and Chris Pal. UMass Technical Report
UM-CS-2005-053, version 2.1. 2005. (Cluster
structured, relational data, like Latent Dirichlet Allocation and its
successors, but with undirected graphical models that are
conditionally-trained. Improved results over Jebara-inspired synthetic
data, and over the Harmonium as tested on an information retrieval
task. This is an evolving Tech Report, which needs to be updated---in
particular we are now referring to this method as "Multi-Conditional
Learning" or "Multi-Conditional Mixtures".)
- Group and Topic
Discovery from Relations and Text.
Xuerui Wang, Natasha Mohanty and Andrew McCallum. KDD Workshop on Link
Discovery: Issues, Approaches and Applications (LinkKDD) 2005. (Social
network analysis that simultaneously discovers groups of entities and
also clusters attributes of their relations, such that clustering in
each dimension informs the other. Applied to the voting records and
corresponding text of resolutions from the U.S. Senate and the U.N.,
showing that incorporating the votes results in more salient topic
clusters, and that different groupings of legislators emerge from
different topics.)
- Detecting Anomalies in
Network Traffic Using Maximum Entropy Estimation. Yu Gu, Andrew
McCallum and Don Towsley. Internet Measurement Conference, 2005. (Build
a density model of normal Internet traffic with Maximum Entropy and
feature induction. Detect network attacks by density threshold.)
- A Conditional Random
Field for Discriminatively-trained Finite-state String Edit Distance.
Andrew McCallum, Kedar Bellare and Fernando Pereira. Conference on
Uncertainty in AI (UAI), 2005. (Train
a string edit distance function from both positive and negative
examples of string pairs (matching and mismatching). Significantly, the
model designer is free to use arbitrary, fancy features of both
strings, and also very flexible edit operations. This model is an
example of an increasingly popular and interesting
class---conditionally-trained models with latent variables. Positive
results on citations, addresses and names.)
- Joint Parsing and
Semantic Role Labeling. Charles Sutton and Andrew McCallum. CoNLL
(Shared Task), 2005. (Attempt
to improve accuracy by performing joint inference over parsing and
semantic role labeling---preserving uncertainty and multiple hypotheses
in Dan Bikel's parser. Unfortunately the effort yielded negative
results, most likely because the components needed to produce better
calibrated probabilities.)
- Gene Prediction with
Conditional Random Fields.
Aron Culotta, David Kulp, and Andrew McCallum. Technical Report
UM-CS-2005-028, University of Massachusetts, Amherst, April 2005. (Use
finite-state CRFs to locate introns and exons in DNA sequences. Shows
the advantages of CRFs' ability to straightforwardly incorporate
homology evidence from protein databases.)
- Semi-Supervised
Sequence Modeling with Syntactic Topic Models. Wei Li and Andrew
McCallum. AAAI, 2005. (Learn
a low-dimensional manifold from large quantities of unlabeled text data,
then use components of the manifold as additional features when
training a linear-chain CRF with limited labeled data. The manifold is
learned using HMM-LDA [Griffiths, Steyvers, Blei, Tenenbaum 2004], an
unsupervised model with special structure suitable for sequences and
topics. Experiments with English part-of-speech tagging and Chinese word
segmentation.)
- Reducing Labeling
Effort for Structured Prediction Tasks. Aron Culotta and Andrew
McCallum. AAAI, 2005. (A
step toward bringing trainable information extraction to the masses!
Make it easier for end-users to train IE by providing multiple-choice
labeling options, and propagating any constraints their labels provide
on portions of the record-labeling task.)
- Topic and Role Discovery
in Social Networks. Andrew McCallum, Andres Corrada-Emmanuel and
Xuerui Wang. IJCAI, 2005. (Conference
paper version of tech report by same authors in 2004 below. Also
includes new results with Role-Author-Recipient-Topic model. Discover
roles by social network analysis with a Bayesian network that models
both links and text messages exchanged on those links. Experiments with
Enron email and academic email.)
- Piecewise Training for
Undirected Models. Charles Sutton and Andrew McCallum. UAI, 2005. (Efficiently
train a large graphical model in separately normalized pieces, and
amazingly often obtain higher accuracy than without this approximation.
This paper also shows that this piecewise objective is a lower bound on
the exact likelihood, and gives results with three different graphical
model structures.)
- Constrained Kronecker
Deltas for Fast Approximate Inference and Estimation. Chris Pal,
Charles Sutton, Andrew McCallum. Submitted to UAI, 2005. (Sometimes
the graph of the graphical model is not large and complex, but the
cardinality of the variables is large. This paper describes a new and
generalized method for beam search on graphical models, showing
positive experimental results for both inference and training.
Experiments on NetTalk.)
- Multi-Way Distributional
Clustering via Pairwise Interactions. Ron Bekkerman, Ran El-Yaniv
and Andrew McCallum. ICML 2005. (Distributional
clustering in multiple feature dimensions or modalities at once--made
efficient by a factored representation as used in graphical models, and
by a combination of top-down and bottom-up clustering. Results on email
clustering, and new best results on 20 Newsgroups.)
- Disambiguating Web Appearances
of People in a Social Network. Ron Bekkerman and Andrew McCallum.
WWW Conference, 2005. (Find
homepages and other Web pages mentioning particular people. Do a better
job by leveraging a collection of related people.)
2004
- Piecewise Training with
Parameter Independence Diagrams: Comparing Globally- and
Locally-trained Linear-chain CRFs.
Andrew McCallum and Charles Sutton. Center for Intelligent Information
Retrieval, University of Massachusetts Technical Report IR-383.
2004.
(Also presented at NIPS 2004 Workshop on Learning with Structured
Outputs.) (Large
undirected graphical models are expensive to train because they require
global inference to calculate the gradient of the parameters. We
describe a new method for fast training in locally-normalized pieces.
Amazingly the resulting models also give higher accuracy than their
globally-trained counterparts.)
- Automatic
Categorization of Email into Folders: Benchmark Experiments on Enron
and SRI Corpora. Ron Bekkerman, Andrew McCallum and Gary Huang.
UMass CIIR Technical Report IR-418, 2004. (Extensive
experiments on real-world email foldering.)
- The Author-Recipient-Topic
Model for Topic and Role Discovery in Social Networks: Experiments with
Enron and Academic Email.
Andrew McCallum, Andres Corrada-Emmanuel, Xuerui Wang. Technical Report
UM-CS-2004-096, 2004. (Also presented at the NIPS'04 Workshop on
"Structured Data and Representations in Probabilistic Models for
Categorization".) (Social network analysis that
not only models links between people, but the word content of the
messages exchanged between them. Discovers salient topics guided by the
sender-recipient structure in data, and provides improved ability to
measure role-similarity between people. A generative model in the style
of Latent Dirichlet Allocation.)
- Conditional Models of
Identity Uncertainty with Application to Noun Coreference. Andrew
McCallum and Ben Wellner. Neural Information Processing Systems (NIPS),
2004. (A
model of object consolidation, based on graph partitioning with learned
edge weights. Conference paper version of 2003 work in KDD Workshop on
Data Cleaning.)
- An Integrated,
Conditional Model of Information Extraction and Coreference with
Application to Citation Matching. Ben Wellner, Andrew McCallum,
Fuchun Peng, Michael Hay. Conference on Uncertainty in Artificial
Intelligence (UAI), 2004. (A
conditionally-trained graphical model for identity uncertainty in
relational domains, representing mentions, entities and their
attributes. Also a first example of joint inference for extraction and
identity uncertainty--coreference decisions actually integrate out
uncertainty about information extraction.)
- Collective
Segmentation and Labeling of Distant Entities in Information Extraction.
Charles Sutton and Andrew McCallum. ICML workshop on Statistical
Relational Learning, 2004. (Makes
the boundaries and types of distant segments inter-dependent by
augmenting a linear-chain CRF with additional long, arching edges.
Approximate inference by Tree-Reparameterization.)
- An Exploration of Entity
Models, Collective Classification and Relation Description. Hema
Raghavan, James Allan and Andrew McCallum. KDD Workshop on Link
Analysis and Group Detection, August 2004. (Part
of a student synthesis project: includes an application of RMNs to
classifying people in newswire.)
- Sign Detection in
Natural Images with Conditional Random Fields. Jerod Weinman, Al
Hansen and Andrew McCallum. IEEE International Workshop on Machine
Learning for Signal Processing, 2004. (Part of a
student synthesis project: a grid-shaped CRF with inference by
belief-propagation with Tree-Reparameterization.)
- Extracting Social Networks and
Contact Information from Email and the Web. Aron Culotta, Ron
Bekkerman and Andrew McCallum. Conference on Email and Spam (CEAS)
2004. (Describes
an early version of an end-to-end system that automatically populates
your email address book with a large social network, including
"friends-of-friends," and information about people's expertise.)
- Dynamic Conditional Random
Fields: Factorized Probabilistic Models for Labeling and Segmenting
Sequence Data. Charles Sutton, Khashayar Rohanimanesh and Andrew
McCallum. ICML 2004. (Joint
inference over two traditionally-separate layers of NLP processing:
POS-tagging and NP-chunking. Introduces the CRF analogue of Factorial
HMMs. Compares several approximate inference procedures.)
- Interactive Information
Extraction with Constrained Conditional Random Fields.
Trausti Kristjansson, Aron Culotta, Paul Viola and Andrew McCallum.
Nineteenth National Conference on Artificial Intelligence (AAAI 2004).
San Jose, CA. (Winner of Honorable Mention Award.) (Help
a user interactively correct the results of extraction by providing
uncertainty cues in the UI, and by using constrained Viterbi to
automatically make additional corrections after the first human
correction.)
- Accurate Information
Extraction from Research Papers using Conditional Random Fields.
Fuchun Peng and Andrew McCallum. Proceedings of Human Language
Technology Conference and North American Chapter of the Association for
Computational Linguistics (HLT-NAACL), 2004. (Applies
CRFs to extraction from research paper headers and reference sections,
to obtain current best-in-the-world accuracy. Also compares some simple
regularization methods.)
- Chinese Segmentation and
New Word Detection using Conditional Random Fields.
Fuchun Peng, Fangfang Feng, and Andrew McCallum. Proceedings of The
20th International Conference on Computational Linguistics (COLING
2004), August 23-27, 2004, Geneva, Switzerland. (State-of-the-art
Chinese word segmentation with CRFs, with rich features and many
lexicons; also using confidence estimation to add new words to the
lexicon.)
- Confidence Estimation
for Information Extraction.
Aron Culotta and Andrew McCallum. Proceedings of Human Language
Technology Conference and North American Chapter of the Association for
Computational Linguistics (HLT-NAACL), 2004, short paper. (How to provide not only an answer, but a
formally-justified confidence in that answer--using constrained
forward-backward.)
- A Note on Semi-supervised
Learning using Markov Random Fields. Wei Li and Andrew McCallum.
Technical Note, February 3, 2004. (A
general framework for semi-supervised learning in Conditional Random
Fields, with a focus on learning the distance metric between instances.
Experimental results with collective classification of documents.)
2003
- Dynamic Conditional
Random Fields for Jointly Labeling Multiple Sequences. Andrew
McCallum, Khashayar Rohanimanesh and Charles Sutton. NIPS*2003 Workshop
on Syntax, Semantics, Statistics, 2003. (Workshop
version of ICML 2004 paper.)
- Classification with
Hybrid Generative/Conditional Models. Rajat Raina, Yirong Shen,
Andrew Y. Ng, Andrew McCallum. Proceedings of Neural Information
Processing Systems (NIPS), 2003. (Train some
parameters generatively, some parameters conditionally.)
- Rapid Development of
Hindi Named Entity Recognition Using Conditional Random Fields and
Feature Induction. Wei Li and Andrew McCallum. ACM Transactions on
Asian Language Information Processing, 2003. (How
we developed a named entity recognition system for Hindi in just a few
weeks.)
- A Note on the
Unification of Information Extraction and Data Mining using
Conditional-Probability, Relational Models. Andrew McCallum and
David Jensen. IJCAI'03 Workshop on Learning Statistical Models from
Relational Data, 2003. (Describes
big-picture motivation and approach for research that performs
information extraction and data mining in an integrated fashion, rather
than in two separate serial steps. Lays out a major thrust of my
current research over a multi-year span.)
- Efficiently Inducing
Features of Conditional Random Fields. Andrew McCallum. Conference
on Uncertainty in Artificial Intelligence (UAI), 2003. (CRFs
give you the great power to include the kitchen sink worth of features.
How do you decide which ones to include to avoid over-fitting and
running out of memory? A formal, information-theoretic approach, with
carefully-chosen approximations to make it efficient with millions of
candidate features. This technique was key to the success in Hindi above, as
well as to work by Pereira's group at UPenn.)
- Early Results for
Named Entity Recognition with Conditional Random Fields, Feature
Induction and Web-Enhanced Lexicons. Andrew McCallum and Wei Li.
Seventh Conference on Natural Language Learning (CoNLL), 2003. (This is the first publication about named entity
extraction with CRFs.)
- Table Extraction
Using Conditional Random Fields. David Pinto, Andrew McCallum, Xing
Wei and W. Bruce Croft. Proceedings of the ACM SIGIR, 2003. (Application of CRFs to finding tables in government
reports. Uses both language and layout features.)
- Object Consolidation
by Graph Partitioning with a Conditionally-trained Distance Metric.
Andrew McCallum and Ben Wellner. KDD Workshop on Data Cleaning, Record
Linkage and Object Consolidation, 2003. (Later,
improved version of workshop paper immediately below.)
- Toward Conditional
Models of Identity Uncertainty with Application to Proper Noun
Coreference. Andrew McCallum and Ben Wellner. IJCAI Workshop on
Information Integration on the Web, 2003. (A
conditionally-trained model of object consolidation, based on graph
partitioning with learned edge weights.)
- Challenges
in information retrieval and language modeling:
report of a workshop held at the Center for Intelligent Information
Retrieval, University of Massachusetts Amherst. James Allan et al. ACM
SIGIR Forum, Volume 37 Issue 1, April 2003. (A
report about fruitful areas for future work in IR over a five-year time
scale.)
2002
2001
2000
- Learning
to Understand the Web. William Cohen, Andrew McCallum, Dallan
Quass. IEEE
Data Engineering Bulletin. September 2000, Vol. 23, No. 3. Pages
17-24.
- Automating the
Construction of Internet Portals with Machine Learning.
Andrew McCallum, Kamal Nigam, Jason Rennie, Kristie Seymore.
Information Retrieval Journal, volume 3, pages 127-163. Kluwer. 2000.
- Maximum Entropy Markov
Models for Information Extraction and Segmentation. Andrew
McCallum, Dayne Freitag and Fernando Pereira. ICML-2000.
- Efficient Clustering
of High-Dimensional Data Sets with Application to Reference Matching.
Andrew McCallum, Kamal Nigam and Lyle Ungar. KDD-2000.
- Information
Extraction with HMM Structures Learned by Stochastic Optimization.
Dayne Freitag and Andrew McCallum. AAAI-2000.
- Creating Customized
Authority Lists. Huan Chang, David Cohn and Andrew McCallum.
ICML-2000.
- Semi-supervised
Clustering with User Feedback. David Cohn, Rich Caruana and Andrew
McCallum. Unpublished manuscript. (Submitted to AAAI 2000)
1999
- Multi-Label Text
Classification with a Mixture Model Trained by EM. Andrew McCallum.
Revised version of paper appearing in AAAI'99 Workshop on Text
Learning.
- A Hierarchical
Probabilistic Model for Novelty Detection in Text. Doug Baker,
Thomas Hofmann, Andrew McCallum and Yiming Yang. Unpublished
manuscript. (Submitted to NIPS'99.)
- Using Maximum
Entropy for Text Classification. Kamal Nigam, John Lafferty, Andrew
McCallum. IJCAI'99 Workshop on Information Filtering.
- Information
Extraction with HMMs and Shrinkage. Dayne Freitag and Andrew
McCallum. AAAI'99 Workshop on Machine Learning for Information
Extraction.
- Learning Hidden
Markov Model Structure for Information Extraction. Kristie Seymore,
Andrew McCallum, Roni Rosenfeld. AAAI'99 Workshop on Machine Learning
for Information Extraction.
- Building
Domain-Specific Search Engines with Machine Learning Techniques.
Andrew McCallum, Kamal Nigam, Jason Rennie and Kristie Seymore. AAAI-99
Spring Symposium. A related paper
was also accepted to IJCAI'99.
- Using Reinforcement
Learning to Spider the Web Efficiently. Jason Rennie and Andrew
McCallum. ICML'99.
- Bootstrapping for
Text Learning Tasks.
Rosie Jones, Andrew McCallum, Kamal Nigam and Ellen Riloff. IJCAI-99
Workshop on Text Mining: Foundations, Techniques and Applications.
1998
- A Comparison of
Event Models for Naive Bayes Text Classification. Andrew McCallum
and Kamal Nigam. AAAI-98 Workshop on "Learning for Text
Categorization".
- Improving Text
Classification by Shrinkage in a Hierarchy of Classes. Andrew
McCallum, Ronald Rosenfeld, Tom Mitchell and Andrew Ng. ICML-98.
- Employing EM in
Pool-Based Active Learning for Text Classification. Andrew McCallum
and Kamal Nigam. ICML-98.
- Distributional
Clustering of Words for Text Classification. Doug Baker, Andrew
McCallum. SIGIR-98.
- Text Classification
from Labeled and Unlabeled Documents using EM. Kamal Nigam, Andrew
McCallum, Sebastian Thrun and Tom Mitchell. Machine Learning, 39(2/3).
pp. 103-134. 2000.
- Learning to Classify
Text from Labeled and Unlabeled Documents. Kamal Nigam, Andrew
McCallum, Sebastian Thrun and Tom Mitchell. AAAI-98.
- Learning
to Extract Knowledge from the World Wide Web. Mark Craven, Dan
DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam,
Sean Slattery. AAAI-98.
1997
1996
- McCallum, R. Andrew,
Hidden State and Reinforcement Learning with Instance-Based State
Identification, IEEE Transactions on Systems, Man and Cybernetics
(Special issue on Robot Learning), 26(3):464--473, 1996.
- McCallum, R. Andrew, Learning
to Use Selective Attention and Short-Term Memory in Sequential Tasks,
in From Animals to Animats, Fourth International Conference on
Simulation of Adaptive Behavior, (SAB'96). Cape Cod, Massachusetts.
September, 1996.
1995
- McCallum, Andrew K.,
Reinforcement Learning with Selective Perception and Hidden State,
Ph.D. thesis. December, 1995.
- McCallum, R. Andrew,
Instance-Based Utile Distinctions for Reinforcement Learning, The
Proceedings of the Twelfth International Machine Learning Conference
(ML'95), Lake Tahoe, CA, 1995.
- McCallum, R. Andrew,
Instance-Based State Identification for Reinforcement Learning,
Advances in Neural Information Processing Systems (NIPS 7), 1995.
1994
- McCallum, R. Andrew,
First Results with Instance-Based State Identification for
Reinforcement Learning, URCS Tech Report 502, 1994.
- McCallum, R. Andrew,
Reduced Training Time for Reinforcement Learning with Hidden State,
The Proceedings of the Eleventh International Machine Learning Workshop
(Robot Learning), New Brunswick, NJ, 1994.
- McCallum, R. Andrew,
Short-Term Memory in Visual Routines for `Off-Road Car Chasing',
Working Notes of AAAI Spring Symposium Series, "Toward Physical
Interaction and Manipulation", Stanford University, March 21-23, 1994.
1993 and earlier
- McCallum, R. Andrew,
Overcoming Incomplete Perception with Utile Distinction Memory, The
Proceedings of the Tenth International Machine Learning Conference
(ML'93), Amherst, MA, 1993.
- McCallum, R. Andrew,
Learning with Incomplete Selective Perception, Thesis Proposal,
URCS Tech Report 453, 1993.
- Garrett, Scott, Bianchini, Kontothanassis, McCallum,
Thomas, Wisniewski and Luk,
Linking Shared Segments, Winter USENIX, San Diego, CA, 1993.
- McCallum, R. Andrew,
First Results with Utile Distinction Memory for Reinforcement Learning,
URCS Tech Report 446, 1992.
- McCallum, R. Andrew,
Using Transitional Proximity for Faster Reinforcement Learning, The
Proceedings of the Ninth International Machine Learning Conference
(ML'92), Aberdeen, Scotland, 1992.
- Garrett, Bianchini, Kontothanassis, McCallum, Thomas,
Wisniewski and Scott,
Dynamic Sharing and Backward Compatibility on 64-Bit Machines, URCS
Tech Report 418, 1992.
- McCallum, R. Andrew, and Spackman, Kent A.,
Using Genetic Algorithms to Learn Disjunctive Rules from Examples,
The Proceedings of the Seventh International Machine Learning
Conference (ML'90), Austin, Texas, 1990.
|