Latent Dirichlet allocation: Gibbs sampling
Mixtures vs admixtures
Mixture model

$$p(\mathbf{w}_d) = \sum_{k=1}^{K} \pi_k \prod_{i=1}^{N_d} \phi_{k,\,w_{d,i}},$$

where $\pi_k$ is the weight of component $k$ and $\phi_k$ is the word distribution of component $k$: every token in document $d$ is drawn from the same component.
Admixture model

$$p(\mathbf{w}_d) = \prod_{i=1}^{N_d} \sum_{k=1}^{K} \theta_{d,k}\, \phi_{k,\,w_{d,i}},$$

where $\theta_d$ is a document-specific distribution over components and $\phi_k$ is the word distribution of component $k$: each token picks its own component.
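To make the contrast concrete, here is a small numeric sketch comparing the two likelihoods on a toy document; the parameter values and the names `pi`, `theta_d`, `phi` are made up for illustration, not taken from these notes.

```python
import numpy as np

# Toy setup: K = 2 components over a 3-word vocabulary (numbers are made up).
phi = np.array([[0.7, 0.2, 0.1],    # word distribution of component 1
                [0.1, 0.2, 0.7]])   # word distribution of component 2
pi = np.array([0.5, 0.5])           # mixture weights (shared across documents)
theta_d = np.array([0.5, 0.5])      # admixture proportions of one document

doc = [0, 2, 2]                     # a document as a list of word ids

# Mixture: the whole document comes from a single component.
p_mixture = sum(pi[k] * np.prod(phi[k, doc]) for k in range(2))

# Admixture: each token picks its own component.
p_admixture = np.prod([theta_d @ phi[:, w] for w in doc])

print(p_mixture, p_admixture)
```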
Model
- $w_{d,i}$ ($d = 1, \dots, D$; $i = 1, \dots, N_d$): observed tokens
- $\phi_1, \dots, \phi_K$: model components (topics), each a distribution over the vocabulary
- any token can be drawn from any model component
Generative process
- For each topic $k = 1, \dots, K$: draw $\phi_k \sim \mathrm{Dirichlet}(\beta)$.
- For each document $d = 1, \dots, D$: draw $\theta_d \sim \mathrm{Dirichlet}(\alpha)$; then for each token $i = 1, \dots, N_d$, draw $z_{d,i} \sim \mathrm{Categorical}(\theta_d)$ and $w_{d,i} \sim \mathrm{Categorical}(\phi_{z_{d,i}})$.

Note that the document lengths $N_d$ can be generated from Poisson distributions.
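As a rough illustration, the generative process can be simulated as below; the corpus sizes, hyperparameter values, and the Poisson mean are arbitrary choices, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, D = 3, 20, 5            # topics, vocabulary size, documents (arbitrary)
alpha, beta = 0.5, 0.1        # symmetric Dirichlet hyperparameters (arbitrary)

phi = rng.dirichlet(np.full(V, beta), size=K)     # topic-word distributions
theta = rng.dirichlet(np.full(K, alpha), size=D)  # document-topic proportions

docs = []
for d in range(D):
    N_d = rng.poisson(30)                         # document length ~ Poisson
    z = rng.choice(K, size=N_d, p=theta[d])       # topic of each token
    w = np.array([rng.choice(V, p=phi[k]) for k in z])  # word of each token
    docs.append(w)
```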
Model explanation
prior: $p(\theta, \phi) = \prod_{d=1}^{D} \mathrm{Dir}(\theta_d \mid \alpha) \prod_{k=1}^{K} \mathrm{Dir}(\phi_k \mid \beta)$

evidence: $p(\mathbf{w} \mid \alpha, \beta) = \int\!\!\int \sum_{\mathbf{z}} p(\mathbf{w}, \mathbf{z} \mid \theta, \phi)\, p(\theta, \phi \mid \alpha, \beta)\, d\theta\, d\phi$

posterior: $p(\mathbf{z}, \theta, \phi \mid \mathbf{w}) = \dfrac{p(\mathbf{w}, \mathbf{z}, \theta, \phi)}{p(\mathbf{w})}$

Goal: infer the posterior over topic assignments $p(\mathbf{z} \mid \mathbf{w})$ and, from it, estimate $\theta$ and $\phi$.
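To connect this goal to the sampler in the next section, $\theta$ and $\phi$ can be integrated out of the joint in closed form. The following is a sketch in standard LDA notation, not a line taken from the original derivation:

$$p(\mathbf{w}, \mathbf{z} \mid \alpha, \beta) = \prod_{d=1}^{D} \frac{B(\mathbf{n}_{d,\cdot} + \alpha)}{B(\alpha)} \prod_{k=1}^{K} \frac{B(\mathbf{n}_{k,\cdot} + \beta)}{B(\beta)},$$

where $\mathbf{n}_{d,\cdot}$ is the vector of topic counts in document $d$, $\mathbf{n}_{k,\cdot}$ is the vector of per-word counts assigned to topic $k$, and $B(\cdot)$ is the multivariate beta function. The Gibbs conditional below follows from ratios of this collapsed joint.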
Gibbs sampling
Use Gibbs sampling to draw samples from $p(\mathbf{z} \mid \mathbf{w})$:

$$p(z_{d,i} = k \mid \mathbf{z}_{-(d,i)}, \mathbf{w}) = \frac{p(\mathbf{w}, \mathbf{z})}{p(\mathbf{w}, \mathbf{z}_{-(d,i)})} \propto \left(n_{d,k}^{-(d,i)} + \alpha_k\right) \frac{n_{k,w_{d,i}}^{-(d,i)} + \beta_{w_{d,i}}}{n_{k,\cdot}^{-(d,i)} + \sum_{v=1}^{V} \beta_v},$$

where $n_{d,k}$ counts the tokens in document $d$ assigned to topic $k$, $n_{k,v}$ counts the occurrences of word $v$ assigned to topic $k$, $n_{k,\cdot} = \sum_v n_{k,v}$, and the superscript $-(d,i)$ means token $(d,i)$ is excluded from the counts. The denominator $p(\mathbf{w}, \mathbf{z}_{-(d,i)})$ is a constant (it does not depend on $k$), so it can be dropped in the innermost loop.
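Below is a minimal sketch of one collapsed Gibbs sweep implementing this update, assuming symmetric hyperparameters; the array names (`n_dk`, `n_kv`, `n_k`) are illustrative and not taken from these notes.

```python
import numpy as np

def gibbs_sweep(docs, z, n_dk, n_kv, n_k, alpha, beta, rng):
    """One sweep of collapsed Gibbs sampling over all tokens.

    docs[d][i] is the word id of token i in document d and z[d][i] its
    current topic; n_dk, n_kv, n_k are the count arrays described above.
    alpha and beta are symmetric Dirichlet hyperparameters.
    """
    K, V = n_kv.shape
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k_old = z[d][i]
            # Remove the token's current assignment from the counts.
            n_dk[d, k_old] -= 1
            n_kv[k_old, w] -= 1
            n_k[k_old] -= 1
            # Full conditional p(z_{d,i} = k | z_-, w); the constant
            # denominator is dropped, we only normalize over k.
            p = (n_dk[d] + alpha) * (n_kv[:, w] + beta) / (n_k + V * beta)
            p /= p.sum()
            k_new = rng.choice(K, p=p)
            # Add the token back with its new assignment.
            z[d][i] = k_new
            n_dk[d, k_new] += 1
            n_kv[k_new, w] += 1
            n_k[k_new] += 1
    return z, n_dk, n_kv, n_k
```

In practice $\mathbf{z}$ is initialized uniformly at random, the count arrays are built from that initialization, and many sweeps are run before collecting the sample used in the next section.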
Exploration
Given a sample $\mathbf{z}$, estimate the document-topic proportions as

$$\hat{\theta}_{d,k} = \frac{n_{d,k} + \alpha_k}{N_d + \sum_{k'=1}^{K} \alpha_{k'}};$$

likewise,

$$\hat{\phi}_{k,v} = \frac{n_{k,v} + \beta_v}{n_{k,\cdot} + \sum_{v'=1}^{V} \beta_{v'}}.$$

However, due to the label switching problem (topic indices may be permuted from one sample to the next, so averaging across samples mixes up topics), only the single sample with the highest probability should be used to estimate the expectation.
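Given the chosen sample's count arrays, the two estimates above can be computed directly; this is a sketch using the same illustrative array names as before.

```python
import numpy as np

def estimate_theta_phi(n_dk, n_kv, alpha, beta):
    """Point estimates of theta and phi from one Gibbs sample's counts."""
    K, V = n_kv.shape
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + K * alpha)
    phi = (n_kv + beta) / (n_kv.sum(axis=1, keepdims=True) + V * beta)
    return theta, phi
```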
Prediction
For an unseen (test) document $d'$, topic assignments are sampled conditioned on the training sample $(\mathbf{w}, \mathbf{z})$:

$$p(z_{d',i} = k \mid \mathbf{z}_{d',-i}, \mathbf{w}_{d'}, \mathbf{z}, \mathbf{w}) \propto \left(n_{d',k}^{-i} + \alpha_k\right) \frac{n_{k,w} + n_{k,w}^{d',-i} + \beta_{w}}{n_{k,\cdot} + n_{k,\cdot}^{d',-i} + \sum_{v=1}^{V} \beta_v},$$

where $w$ denotes the value of $w_{d',i}$ and $d'$ denotes the document containing token $i$; counts without a document superscript come from the training sample, while counts with superscript $d',-i$ come from the test document's own current assignments, excluding token $i$. In addition, $n_{d',k} = 0$ and $n_{d',\cdot} = 0$ in the training counts because test documents do not exist in the training set, so the document-topic term uses only the test document's own assignments. The formula below is applied in the derivation:

$$\int \mathrm{Dir}(\theta \mid \alpha) \prod_{k=1}^{K} \theta_k^{n_k}\, d\theta = \frac{B(\alpha + \mathbf{n})}{B(\alpha)} = \frac{\Gamma\!\left(\sum_{k} \alpha_k\right)}{\prod_{k} \Gamma(\alpha_k)} \cdot \frac{\prod_{k} \Gamma(\alpha_k + n_k)}{\Gamma\!\left(\sum_{k} (\alpha_k + n_k)\right)}.$$
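A sketch of the resulting "fold-in" sampler for a test document follows: the training counts (`n_kv_train`, `n_k_train`) stay fixed, and only the test document's own counts change during sampling. The function and variable names are illustrative, not from these notes.

```python
import numpy as np

def sample_test_doc(doc, n_kv_train, n_k_train, alpha, beta, rng, n_sweeps=50):
    """Sample topic assignments z for an unseen document (fold-in).

    n_kv_train / n_k_train are training counts and are held fixed; the test
    document contributes only its own counts (n_kv_doc, n_k_doc).
    """
    K, V = n_kv_train.shape
    z = rng.integers(K, size=len(doc))            # random initial assignments
    n_k_doc = np.bincount(z, minlength=K)         # topic counts in the test doc
    n_kv_doc = np.zeros((K, V))
    for i, w in enumerate(doc):
        n_kv_doc[z[i], w] += 1
    for _ in range(n_sweeps):
        for i, w in enumerate(doc):
            k_old = z[i]
            n_k_doc[k_old] -= 1
            n_kv_doc[k_old, w] -= 1
            # Conditional from the Prediction section: training counts plus
            # the test document's own counts (current token excluded).
            p = (n_k_doc + alpha) \
                * (n_kv_train[:, w] + n_kv_doc[:, w] + beta) \
                / (n_k_train + n_k_doc + V * beta)
            p /= p.sum()
            z[i] = rng.choice(K, p=p)
            n_k_doc[z[i]] += 1
            n_kv_doc[z[i], w] += 1
    theta_test = (n_k_doc + alpha) / (len(doc) + K * alpha)
    return z, theta_test
```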