A Fuller Understanding Of Fully Convolutional Networks
Fully convolutional networks are end-to-end, pixels-to-pixels architectures for image-to-image tasks such as semantic segmentation, surface normal estimation, and colorization. By addressing the whole task these networks can do image-to-image mapping with efficient inference and learning. Casting classification networks into fully convolutional form brings transfer learning to pixelwise problems for accuracy and data efficiency. Fusing features across layers yields a multi-resolution, joint model of what and where to refine the output. Fully convolutional networks take a step in the direction of a unified, learned approach to pixel prediction. In this talk I'll cover the definition and operation of fully convolutional networks, examine new experiments that inform the care and feeding of these models, and highlight extensions to weak supervision, structured output, and video.
Evan Shelhamer is a PhD student at UC Berkeley advised by Trevor Darrell as a member of the Berkeley AI Research. His research is on deep learning and end-to-end optimization for vision. Before switching coasts, he studied computer science (AI concentration) and psychology at University of Massachusetts Amherst. He is the lead developer of the Caffe deep learning framework and takes his coffee black.