
University of Massachusetts

MLFL Wiki

Introduction To Deep Belief Nets

In the 1980s, new learning algorithms for neural networks promised to solve difficult classification tasks, such as speech or object recognition, by learning many layers of non-linear features. The results were disappointing for two reasons: there was never enough labeled data to learn millions of complicated features, and learning was far too slow in deep neural networks with many layers of features. These problems can now be overcome by learning one layer of features at a time and by changing the goal of learning. Instead of trying to predict the labels, the learning algorithm tries to create a generative model that produces data which looks just like the unlabeled training data. These new neural networks outperform other machine learning methods when labeled data is scarce but unlabeled data is plentiful. An analogy will be drawn to the way the brain extracts multiple layers of representation. Applications to vision and document retrieval will also be described.
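The layer-by-layer generative learning described above is typically realized by stacking Restricted Boltzmann Machines, each trained with contrastive divergence on the previous layer's features. The following is a minimal sketch of that idea; the layer sizes, learning rate, and single-step CD (CD-1) are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine trained with one step of
    contrastive divergence (CD-1). Hyperparameters here are
    illustrative, not taken from the talk."""

    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back to a reconstruction.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Approximate log-likelihood gradient (data term - model term).
        n = len(v0)
        self.W += self.lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b_v += self.lr * (v0 - pv1).mean(axis=0)
        self.b_h += self.lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - pv1) ** 2)  # reconstruction error

# Greedy layer-wise stacking: each RBM models the hidden
# activations of the layer below it; no labels are needed.
data = (rng.random((64, 16)) < 0.5).astype(float)  # toy binary data
layers = [RBM(16, 8), RBM(8, 4)]
x = data
for rbm in layers:
    for _ in range(50):
        err = rbm.cd1_step(x)
    x = rbm.hidden_probs(x)  # features feed the next layer
```

After this unsupervised pretraining, the stacked weights can initialize a deep network that is fine-tuned on whatever labeled data is available.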

Page last modified on November 13, 2008, at 09:19 AM