Latent Dirichlet allocation: Gibbs sampling
Mixtures vs admixtures
Mixture model: every document is assigned to a single component,
$$p(w_d \mid \pi, \Phi) = \sum_{t=1}^{T} \pi_t \prod_{n=1}^{N_d} \phi_{t, w_{dn}},$$
where $\pi$ is a single corpus-wide vector of mixture weights and $\phi_t$ is the word distribution of component $t$.
Admixture model: every token in a document can come from a different component,
$$p(w_d \mid \theta_d, \Phi) = \prod_{n=1}^{N_d} \sum_{t=1}^{T} \theta_{dt}\, \phi_{t, w_{dn}},$$
where $\theta_d$ is a document-specific vector of component weights and $\phi_t$ is the word distribution of component $t$.
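To make the difference concrete, here is a minimal NumPy sketch that evaluates one document's likelihood under both views; the array names (`pi`, `theta_d`, `Phi`) and toy sizes are illustrative, not from the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
T, V = 3, 10                                 # number of components, vocabulary size
Phi = rng.dirichlet(np.ones(V), size=T)      # rows: word distributions phi_t
doc = rng.integers(0, V, size=8)             # one document as a list of word ids

# Mixture: the whole document is generated by one component t, mixed with weights pi.
pi = rng.dirichlet(np.ones(T))
mixture_lik = sum(pi[t] * np.prod(Phi[t, doc]) for t in range(T))

# Admixture (LDA): each token picks its own component using document weights theta_d.
theta_d = rng.dirichlet(np.ones(T))
admixture_lik = np.prod([theta_d @ Phi[:, w] for w in doc])

print(mixture_lik, admixture_lik)
```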
Model
- $W = \{w_{dn}\}$ — tokens
- $\Phi = \{\phi_t\}_{t=1}^{T}$ — model components (distributions over the vocabulary)
- $Z = \{z_{dn}\}$ — latent assignments: any token can be drawn from any model component
Generative process
- For each component $t = 1, \dots, T$: draw $\phi_t \sim \mathrm{Dirichlet}(\beta)$.
- For each document $d = 1, \dots, D$: draw $\theta_d \sim \mathrm{Dirichlet}(\alpha)$.
- For each token position $n = 1, \dots, N_d$ in document $d$: draw $z_{dn} \sim \mathrm{Categorical}(\theta_d)$, then $w_{dn} \sim \mathrm{Categorical}(\phi_{z_{dn}})$.
Note that the document lengths $N_d$ can be generated from Poisson distributions.
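A minimal sketch of this generative process, assuming symmetric Dirichlet priors and a Poisson prior on the document lengths; the variable names and toy sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, V, D = 4, 20, 5                 # components (topics), vocabulary size, documents
alpha, beta, mean_len = 0.5, 0.1, 12

Phi = rng.dirichlet(beta * np.ones(V), size=T)       # phi_t ~ Dir(beta), one row per component

corpus, assignments = [], []
for d in range(D):
    theta_d = rng.dirichlet(alpha * np.ones(T))      # theta_d ~ Dir(alpha)
    N_d = rng.poisson(mean_len)                      # document length from a Poisson
    z_d = rng.choice(T, size=N_d, p=theta_d)         # z_dn ~ Categorical(theta_d)
    w_d = np.array([rng.choice(V, p=Phi[z]) for z in z_d])  # w_dn ~ Categorical(phi_{z_dn})
    corpus.append(w_d)
    assignments.append(z_d)
```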
Model explanation
- prior: $p(\Theta, \Phi) = \prod_{d=1}^{D} \mathrm{Dir}(\theta_d \mid \alpha)\, \prod_{t=1}^{T} \mathrm{Dir}(\phi_t \mid \beta)$
- evidence: $p(W) = \int p(W \mid \Theta, \Phi)\, p(\Theta, \Phi)\, d\Theta\, d\Phi$ (intractable to compute exactly)
- posterior: $p(\Theta, \Phi, Z \mid W) = \dfrac{p(W, Z \mid \Theta, \Phi)\, p(\Theta, \Phi)}{p(W)}$
Goal: approximate the posterior $p(\Theta, \Phi, Z \mid W)$, which has no closed form because the evidence $p(W)$ is intractable.
Gibbs sampling
Use Gibbs sampling to draw samples from $p(\Theta, \Phi, Z \mid W)$ by iterating over the full conditionals:
- $p(z_{dn} = t \mid \Theta, \Phi, W) = \dfrac{\theta_{dt}\, \phi_{t, w_{dn}}}{\sum_{t'} \theta_{dt'}\, \phi_{t', w_{dn}}}$
- $p(\theta_d \mid Z, \Phi, W) = \mathrm{Dir}\!\left(\alpha + (n_{d1}, \dots, n_{dT})\right)$, where $n_{dt}$ is the number of tokens in document $d$ assigned to component $t$
- $p(\phi_t \mid Z, \Theta, W) = \mathrm{Dir}\!\left(\beta + (n_{t1}, \dots, n_{tV})\right)$, where $n_{tv}$ is the number of times word $v$ is assigned to component $t$
The denominator in the conditional for $z_{dn}$ does not depend on $t$, so it is a constant that can be dropped in the innermost loop and the probabilities normalized once at the end.
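A minimal sketch of one (non-collapsed) Gibbs sweep over the conditionals above, assuming the corpus is stored as a list of NumPy arrays of word ids; the function and variable names are illustrative, not from the original.

```python
import numpy as np

def gibbs_sweep(corpus, Z, Theta, Phi, alpha, beta, rng):
    """One full Gibbs sweep: resample every z_dn, then every theta_d and phi_t."""
    T, V = Phi.shape
    # Resample assignments z_dn given Theta and Phi.
    for d, doc in enumerate(corpus):
        for n, w in enumerate(doc):
            p = Theta[d] * Phi[:, w]                 # unnormalized theta_dt * phi_{t,w}
            Z[d][n] = rng.choice(T, p=p / p.sum())   # normalize once, just before sampling
    # Resample theta_d ~ Dir(alpha + per-document component counts).
    for d, doc in enumerate(corpus):
        counts = np.bincount(Z[d], minlength=T)
        Theta[d] = rng.dirichlet(alpha + counts)
    # Resample phi_t ~ Dir(beta + per-component word counts).
    for t in range(T):
        counts = np.zeros(V)
        for d, doc in enumerate(corpus):
            np.add.at(counts, doc[Z[d] == t], 1)
        Phi[t] = rng.dirichlet(beta + counts)
    return Z, Theta, Phi
```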
Exploration
The posterior expectations can in principle be estimated by averaging the Gibbs samples: $\mathbb{E}[\Phi \mid W] \approx \frac{1}{S} \sum_{s=1}^{S} \Phi^{(s)}$; likewise, $\mathbb{E}[\Theta \mid W] \approx \frac{1}{S} \sum_{s=1}^{S} \Theta^{(s)}$.
However, due to the label switching problem (the components can be permuted from sample to sample, so naive averaging mixes them up), only the single sample with the highest probability should be used to estimate the expectation.
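One way to pick that single sample is to rank the sweeps by an unnormalized log joint probability; the helper below is a sketch under that assumption (the name `log_joint` and its exact form are illustrative).

```python
import numpy as np

def log_joint(corpus, Z, Theta, Phi, alpha, beta):
    """Unnormalized log p(W, Z, Theta, Phi); constants dropped, sufficient for ranking samples."""
    eps = 1e-300
    lp = sum(np.sum(np.log(np.maximum(Theta[d][Z[d]], eps))) +     # log theta_{d, z_dn}
             np.sum(np.log(np.maximum(Phi[Z[d], doc], eps)))       # log phi_{z_dn, w_dn}
             for d, doc in enumerate(corpus))
    lp += np.sum((alpha - 1) * np.log(np.maximum(Theta, eps)))     # Dirichlet prior on Theta
    lp += np.sum((beta - 1) * np.log(np.maximum(Phi, eps)))        # Dirichlet prior on Phi
    return lp

# Keep the sweep (Z, Theta, Phi) with the largest log_joint value instead of
# averaging across sweeps, to avoid the label switching problem.
```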
Prediction
For a token $w$ in a new (test) document $d$, the predictive distribution conditions on the training corpus $W$:
$$p(w = v \mid W) = \int\!\!\int \sum_{t} \theta_{dt}\, \phi_{tv}\; p(\theta_d)\, p(\Phi \mid W)\; d\theta_d\, d\Phi = \sum_{t} \frac{\alpha_t}{\sum_{t'} \alpha_{t'}}\; \mathbb{E}\!\left[\phi_{tv} \mid W\right],$$
where $t$ denotes the value of the assignment $z$ of the token $w$ and $d$ denotes the document that $w$ is in. In addition, $p(\theta_d \mid W) = p(\theta_d)$ and $p(z \mid \theta_d, W) = p(z \mid \theta_d)$ because test documents do not exist in the training set.
The Dirichlet mean formula is applied in the derivation:
$$\int \theta_{dt}\, \mathrm{Dir}(\theta_d \mid \alpha)\, d\theta_d = \mathbb{E}[\theta_{dt}] = \frac{\alpha_t}{\sum_{t'} \alpha_{t'}}.$$
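A small sketch of the resulting predictive rule, assuming the single best Gibbs sample `Phi_best` stands in for $\mathbb{E}[\Phi \mid W]$; the names are illustrative.

```python
import numpy as np

def predict_word_prob(v, Phi_best, alpha):
    """p(w = v | W) ~= sum_t (alpha_t / sum(alpha)) * phi_{t,v} for a token in an unseen document."""
    alpha = np.asarray(alpha, dtype=float)
    return float((alpha / alpha.sum()) @ Phi_best[:, v])

# Example: probability of word id 3 under a 4-component model with symmetric alpha.
# Phi_best = ...  (T x V matrix taken from the highest-probability Gibbs sample)
# print(predict_word_prob(3, Phi_best, alpha=np.full(4, 0.5)))
```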