MLFL Wiki
Visual Recognition and Inference Using Dynamic Sparse Learning

We present a hierarchical architecture and learning algorithm for visual recognition and other inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Guided by properties of biological vision, we develop a generative model that includes feedforward, lateral, and feedback connections and enforces sparse coding at all layers. Recent work in machine learning has shown the utility of the generative approach (Hinton et al. 2006) and certain advantages of deep networks over shallow ones such as SVMs (Bengio and LeCun 2007). I will give a brief history of attempts to develop multilayer probabilistic generative models, such as the Boltzmann machine and its later variants. In our work, inference on a probabilistic model is approximated by a discrete-time dynamic network. Learning is done with a variant of the backpropagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient we exploit the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.

Joseph Murray
Joint work with Ken Kreutz-Delgado
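The two computational ideas the abstract names — approximating inference with a fixed number of discrete-time iterations over sparsely coded layers, and exploiting that sparsity so each weight update touches only a small submatrix — can be sketched roughly as below. This is a minimal illustration, not the authors' algorithm: the function names, the top-k thresholding nonlinearity, and the Hebbian-style outer-product update are all assumptions.

```python
import numpy as np

def sparsify(v, k):
    """Enforce sparse coding: keep the k largest-magnitude entries
    of v and zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def dynamic_inference(W, x_in, k=5, n_iters=10):
    """Approximate inference with a discrete-time dynamic network:
    iterate a fixed number of steps, enforcing sparsity at each one."""
    h = np.zeros(W.shape[0])
    for _ in range(n_iters):
        # Simplified recurrence: feedforward drive plus the previous
        # state, passed through a nonlinearity and then sparsified.
        h = sparsify(np.tanh(W @ x_in + h), k)
    return h

def sparse_weight_update(W, pre, post, lr=0.01):
    """Because both layers are sparse, the outer-product gradient is
    nonzero only on the submatrix indexed by the active units, so only
    those entries of the large weight matrix need to be updated."""
    i = np.flatnonzero(post)   # active post-synaptic units
    j = np.flatnonzero(pre)    # active pre-synaptic units
    if i.size and j.size:
        W[np.ix_(i, j)] += lr * np.outer(post[i], pre[j])
    return W
```

With sparsity level k much smaller than the layer width, each update touches O(k^2) weights instead of the full matrix, which is the efficiency gain the abstract attributes to sparse learning.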