

University of Massachusetts


Learning Undirected Topic Models

Probabilistic topic models are often used to analyze and extract semantic
topics from large text collections. In this talk I will first introduce a
two-layer undirected graphical model, called a Replicated Softmax, that
can be used to model and automatically extract low-dimensional latent
semantic representations from a large unstructured collection of
documents. I will present efficient learning and inference algorithms for
this model, and show how a Monte Carlo-based method, Annealed Importance
Sampling, can be used to produce an accurate estimate of the
log-probability the model assigns to test data. I will further demonstrate
that the proposed model generalizes much better than Latent Dirichlet
Allocation in terms of both the log-probability of held-out documents and
retrieval accuracy.
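
To make the "low-dimensional latent semantic representation" concrete, here is a minimal sketch of the inference step in a Replicated Softmax-style model: computing the probability that each binary hidden topic unit is on, given a document's word counts. It assumes the commonly published form of the model, in which the hidden biases are scaled by the document length; the abstract itself does not give these equations. The function names and the toy parameters are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hidden_posterior(counts, weights, hidden_bias):
    """Return p(h_j = 1 | document) for every hidden unit j.

    counts      -- word-count vector of one document (length K)
    weights     -- weights[j][k] links hidden unit j to word k (F x K)
    hidden_bias -- bias of each hidden unit (length F)

    Assumption: the hidden bias is multiplied by the document length D,
    as in the usual description of the Replicated Softmax.
    """
    doc_len = sum(counts)  # D: total number of words in the document
    probs = []
    for w_row, a_j in zip(weights, hidden_bias):
        activation = doc_len * a_j + sum(w * c for w, c in zip(w_row, counts))
        probs.append(sigmoid(activation))
    return probs


# Toy example: 3-word vocabulary, 2 hidden units (all values are made up).
toy_counts = [2, 0, 1]
toy_weights = [[0.0, 0.0, 0.0],
               [1.0, 0.0, -2.0]]
toy_bias = [0.0, 1.0]
toy_probs = hidden_posterior(toy_counts, toy_weights, toy_bias)
```

The resulting vector of probabilities is the kind of fixed-length latent representation that could then be compared across documents for retrieval. Learning the weights and estimating held-out log-probability with Annealed Importance Sampling are not shown here.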

In the second part of the talk I will introduce a class of probabilistic
generative models called Deep Belief Networks that contain many layers of
latent variables, with the bottom layer forming a Replicated Softmax model.
I will then show that the resulting deep graphical model both discovers
meaningful semantic topics and learns latent representations that work
much better for document retrieval.
Page last modified on March 22, 2010, at 11:08 AM