Machine Learning and Friends Lunch

Hierarchical Dirichlet processes: A Bayesian approach to sharing clusters among related groups


Yee Whye Teh
UC Berkeley

Abstract


We consider problems involving groups of data, where each observation within a group is drawn from a mixture model and where it is desirable to share mixture components both within and between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. Such problems arise often in practice, e.g. in topic discovery in document corpora: each document is treated as a group of data items (a bag of words), and each topic corresponds to a mixture component. In this setting, sharing components between groups simply means that a topic can occur across a number of documents, allowing dependencies across documents (groups) to be modeled effectively while also conferring generalization to new documents (groups).

The hierarchical Dirichlet process (HDP) is a Bayesian solution to this problem that combines hierarchical and nonparametric modeling ideas. Each group is modeled using a Dirichlet process (DP) mixture, which provides a nonparametric prior over the number of components within that group. To allow components to be shared between groups, we consider a hierarchical extension in which the common base distribution for the group-specific DPs is itself distributed according to a DP. Because a draw from a DP is discrete with probability one, the group-specific DPs necessarily share atoms, and hence the mixtures for different groups necessarily share components.
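
As a concrete illustration, the following minimal Python sketch (not from the talk) draws grouped data from a truncated stick-breaking construction of the HDP. The truncation level K, the Gaussian base measure H, and the concentration parameters gamma and alpha0 are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(concentration, size):
    # Truncated GEM(concentration) weights: beta_k = v_k * prod_{l<k} (1 - v_l)
    fractions = rng.beta(1.0, concentration, size)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - fractions[:-1])))
    return fractions * remaining

K, gamma, alpha0 = 50, 1.0, 1.0      # truncation level and concentrations (illustrative)

# Global measure G0: shared atoms phi_k ~ H (here H = Normal(0, 10^2))
# with top-level weights beta ~ GEM(gamma)
atoms = rng.normal(0.0, 10.0, K)
beta = stick_breaking(gamma, K)
beta = beta / beta.sum()             # renormalize after truncation

# Each group's measure Gj ~ DP(alpha0, G0) reuses the SAME atoms; since G0
# is discrete, the group weights are pi_j ~ Dirichlet(alpha0 * beta)
for j in range(3):
    pi_j = rng.dirichlet(alpha0 * beta)
    z = rng.choice(K, size=100, p=pi_j)     # component indicators
    x = rng.normal(atoms[z], 1.0)           # observations from a Gaussian mixture
    print(f"group {j} uses {np.unique(z).size} distinct components")

Because every group draws its mixing weights over the same atom set, components are shared across groups automatically, while each group keeps its own group-specific weights.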

We discuss a variety of representations of the HDP, as well as Markov chain Monte Carlo algorithms for posterior inference. We report experimental results on three text corpora demonstrating the effectiveness of the HDP and its superior performance over previous models.
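
One well-known representation of the HDP is the Chinese restaurant franchise: each group is a restaurant whose customers (observations) are seated at tables by a Chinese restaurant process, and every table orders a dish (mixture component) from a menu shared across all restaurants. The sketch below forward-samples dish assignments under this representation with illustrative concentration parameters; it is a toy generative sampler, not the posterior MCMC algorithm reported in the talk.

import random

random.seed(0)

def chinese_restaurant_franchise(group_sizes, alpha0, gamma):
    dish_counts = []       # m_k: number of tables, franchise-wide, serving dish k
    assignments = []
    for n_j in group_sizes:
        table_sizes = []   # customers seated at each table in this restaurant
        table_dish = []    # dish served at each table
        dishes = []
        for _ in range(n_j):
            # Sit at an existing table w.p. proportional to its occupancy,
            # or open a new table w.p. proportional to alpha0.
            weights = table_sizes + [alpha0]
            t = random.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_sizes):          # new table: order its dish
                # Reuse dish k w.p. proportional to m_k,
                # or order a new dish w.p. proportional to gamma.
                dweights = dish_counts + [gamma]
                k = random.choices(range(len(dweights)), weights=dweights)[0]
                if k == len(dish_counts):
                    dish_counts.append(0)
                dish_counts[k] += 1
                table_sizes.append(0)
                table_dish.append(k)
            table_sizes[t] += 1
            dishes.append(table_dish[t])
        assignments.append(dishes)
    return assignments

for j, d in enumerate(chinese_restaurant_franchise([50, 50, 50], alpha0=1.0, gamma=1.0)):
    print(f"group {j} uses dishes {sorted(set(d))}")

Popular dishes accumulate tables across restaurants, so the same components recur in many groups; seating schemes of this kind underlie the Gibbs samplers used for posterior inference in the HDP.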
