UMass Machine Learning and Friends Lunch: Exploring Patterns in Topical Co-Occurrence

Simple Dirichlet-based statistical topic models make strong assumptions about the co-occurrence of topics. Clearly, however, some combinations of topics are more likely than others. In this talk, we examine two methods of learning models that represent such co-occurrence patterns. The first, hPAM, is a member of the Pachinko Allocation family. In this model, each word is generated by choosing a path through a directed acyclic graph (DAG) in which every node has a distribution over words. The model thus explicitly represents clusters of topics and the words that specifically accompany those clusters. The second model, Author-Persona-Topic (APT), takes a different approach to learning topic combinations: it assumes that each author writes under multiple "personas", where every document in a given persona shares a single topic distribution. We show that this model is effective in expert retrieval applications, such as recommending reviewers for conference papers.
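
The path-through-a-DAG idea behind hPAM can be illustrated with a small generative sketch. This is a toy illustration under stated assumptions, not the model presented in the talk: the three-level DAG (root, super-topics, sub-topics), the per-document distribution over levels, the symmetric Dirichlet parameter `alpha`, and all function names are hypothetical choices made for the example.

```python
import random


def sample_dirichlet(rng, alpha, size):
    """Draw from a symmetric Dirichlet by normalizing Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def make_node_word_distributions(rng, vocab, n_super, n_sub, alpha=1.0):
    """Give every node in the toy DAG its own distribution over words."""
    nodes = ["root"]
    nodes += [("super", s) for s in range(n_super)]
    nodes += [("sub", t) for t in range(n_sub)]
    return {node: sample_dirichlet(rng, alpha, len(vocab)) for node in nodes}


def generate_document(rng, vocab, node_words, n_super, n_sub, n_words, alpha=1.0):
    """Generate one document; each word comes from a node on a sampled path."""
    # Assumed per-document parameters: how likely each edge and each level is.
    root_to_super = sample_dirichlet(rng, alpha, n_super)
    super_to_sub = [sample_dirichlet(rng, alpha, n_sub) for _ in range(n_super)]
    level_probs = sample_dirichlet(rng, alpha, 3)

    doc = []
    for _ in range(n_words):
        # Choose a path root -> super-topic -> sub-topic through the DAG.
        s = rng.choices(range(n_super), weights=root_to_super)[0]
        t = rng.choices(range(n_sub), weights=super_to_sub[s])[0]
        path = ["root", ("super", s), ("sub", t)]
        # Choose which node on the path emits the word, then emit it.
        node = path[rng.choices(range(3), weights=level_probs)[0]]
        word = rng.choices(vocab, weights=node_words[node])[0]
        doc.append((word, path, node))
    return doc


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["topic", "model", "graph", "author", "review", "paper"]
    node_words = make_node_word_distributions(rng, vocab, n_super=2, n_sub=3)
    for word, path, node in generate_document(
        rng, vocab, node_words, n_super=2, n_sub=3, n_words=5
    ):
        print(word, "emitted by", node, "on path", path)
```

Because every sub-topic is reachable from every super-topic, the graph is a DAG rather than a tree, and a super-topic's word distribution can hold the words that accompany a particular cluster of sub-topics.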