UMass Machine Learning and Friends Lunch | Representation, Reasoning and Never-Ending Learning for Scene Understanding

Abstract

One of the long-term goals of my research is to build vision systems that develop a deep understanding of the visual world from images and videos. In this talk, I will discuss systems that can not only categorize objects but also extract their physical and functional meaning. Specifically, I will first talk about a new mid-level representation that is based on visual primitives that are both discriminative and geometrically/semantically informative. I will show how these primitives can be automatically discovered from large amounts of visual data. Next, I will talk about reasoning approaches that use these primitives to build a global and coherent understanding of an image. These reasoning approaches are based on physical, functional and causal relationships between the different elements in the scene. Finally, I will give a brief demo of our new work on NEIL (Never Ending Image Learner). NEIL is a computer program that runs 24 hours per day and 7 days per week to automatically extract visual knowledge from images downloaded from the Internet. Since its birth, NEIL has been running for 400K CPU hours (4 months). During this period, NEIL has labeled more than half a million images and learned 1800 common sense relationships.

Bio

Abhinav Gupta is an Assistant Research Professor at the Robotics Institute, Carnegie Mellon University. Prior to this, he was a postdoctoral fellow at CMU working with Alexei Efros and Martial Hebert. His research is in the area of computer vision and its applications to robotics and computer graphics. He is particularly interested in building vision systems that develop a deep understanding of the visual world from images and videos. His research has focused on exploiting big visual data for developing visual representations, reasoning, and learning common sense knowledge. His other research interests include exploiting the relationship between language and vision, semantic image parsing, and exemplar-based models for recognition. Abhinav received his PhD in 2009 from the University of Maryland under Prof. Larry Davis. His dissertation was also nominated for the ACM Doctoral Dissertation Award by the University of Maryland. Abhinav is a recipient of the University of Maryland Dean's Fellowship Award (2004), Google Faculty Research Award (2012), Bosch Faculty Award (2013) and the ECCV Best Paper Runner-up Award (2010).