**GANORCON: Are Generative Models Useful for Few-shot Segmentation?** [Oindrila Saha](http://oindrilasaha.github.io), [Zezhou Cheng](http://sites.google.com/site/zezhoucheng/), [Subhransu Maji](http://people.cs.umass.edu/~smaji/) _University of Massachusetts - Amherst_ !!! ==> **Leaderboard Coming Soon!!** 
![Figure [ganorcon_splash]: **GANorCON - A comparison of GAN and contrastive learning (CON) approaches for few-shot part segmentation.** A) Both approaches consist of three steps. **Step I:** Train a GAN decoder for generating images or a contrastive learning based image encoder. **Step II:** Train a projector given a few labeled examples using hypercolumn representations from the decoder or encoder. **Step III:** For efficient inference, distill the GAN sampled images and their labels to an off-the-shelf feed-forward segmentation model. This step is optional for CON as the model is feed-forward. B) CON representations achieve better performance while being nearly an order of magnitude faster to train than GAN representations on several datasets. C) CON representations are significantly worse when trained on GAN-generated images. D) CON representations are effective at fine-grained part segmentation tasks - the figure shows the output of our method trained with 16 labeled faces.](./ganvscon-v7-7.png)
Advances in generative modeling based on GANs have motivated the community to find uses for them beyond image generation and editing tasks. In particular, several recent works have shown that GAN representations can be re-purposed for discriminative tasks such as part segmentation, especially when training data is limited. But how do these improvements stack up against recent advances in self-supervised learning? Motivated by this, we present an alternative approach based on contrastive learning and compare its performance against GAN-based approaches on standard few-shot part segmentation benchmarks. Our experiments reveal that not only do GAN-based approaches offer no significant performance advantage, but their multi-step training is complex, nearly an order of magnitude slower, and can introduce additional bias. These experiments suggest that the inductive biases of generative models, such as their ability to disentangle shape and texture, are well captured by standard feed-forward networks trained using contrastive learning.
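The projector step of the pipeline operates on hypercolumn representations: per-pixel descriptors formed by upsampling feature maps from several layers of a pretrained encoder to a common resolution and concatenating them along the channel axis, after which a small head predicts part labels per pixel. A minimal numpy sketch of this idea, with random arrays standing in for real encoder features and a linear head standing in for the trained projector (all shapes and names are illustrative, not the paper's implementation):

```python
import numpy as np

def upsample_nearest(feat, size):
    # feat: (C, h, w) -> (C, size, size) by nearest-neighbor repetition
    C, h, w = feat.shape
    assert size % h == 0 and size % w == 0
    return feat.repeat(size // h, axis=1).repeat(size // w, axis=2)

def hypercolumns(feature_maps, size):
    # Upsample each layer's features to `size` and stack along channels,
    # giving one descriptor per pixel.
    ups = [upsample_nearest(f, size) for f in feature_maps]
    return np.concatenate(ups, axis=0)  # (sum of C_i, size, size)

# Toy multi-scale features, e.g. from a contrastively pretrained encoder
rng = np.random.default_rng(0)
feats = [rng.standard_normal((8, 16, 16)),
         rng.standard_normal((16, 8, 8)),
         rng.standard_normal((32, 4, 4))]
hc = hypercolumns(feats, 32)            # (8 + 16 + 32, 32, 32)

# Linear "projector" head: per-pixel logits over K part classes
K = 8
W = rng.standard_normal((K, hc.shape[0])) * 0.01
logits = np.einsum('kc,chw->khw', W, hc)
pred = logits.argmax(axis=0)            # (32, 32) map of part labels
```

In practice the head is trained on the few labeled examples; the sketch only shows how a per-pixel prediction falls out of the concatenated multi-scale features.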
PUBLICATION
==========================================================================================
**GANORCON: Are Generative Models Useful for Few-shot Segmentation?**
Oindrila Saha, Zezhou Cheng, Subhransu Maji
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[[arXiv](https://arxiv.org/pdf/2112.00854.pdf)] 
CODE
===============================================================================
The code for reproducing our results, along with pretrained models, is available [here](https://github.com/oindrilasaha/GANorCON).
RESULTS
===============================================================================
![Figure [qualitative results]: Visualization of some results on the faces and cars datasets](./viz_for_webpage.jpeg)

**Table 1**: *Comparison of our method with previous approaches. (Please refer to the paper for more details.)*

       Method         |  Face34  | Face34 weighted |  Face8  |   Car20   |   Cat16
:--------------------:|:--------:|:---------------:|:-------:|:---------:|:----------:
 Transfer Learning    |  45.77   |        -        |  62.83  |   33.91   |   22.52
 Semi-Supervised      |  48.17   |        -        |  63.36  |   44.51   |   30.15
 DatasetGAN           |  53.65   |      79.59      |  69.71  |   68.40   |   30.83
 Ours - CONV+distill  |**54.06** |    **82.41**    |**70.48**| **69.56** | **31.01**
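The distillation step ("+distill" above) trains a feed-forward student on images pseudo-labeled by the projector, so that inference no longer requires the multi-step pipeline. A minimal numpy sketch of distillation on pseudo-labels, with random features standing in for real per-pixel hypercolumns and a linear softmax classifier standing in for the segmentation network (all sizes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
C, K, N = 16, 4, 512            # feature dim, part classes, "sampled" pixels

# Pretend teacher: per-pixel features and the pseudo-labels it assigns
X = rng.standard_normal((N, C))
W_teacher = rng.standard_normal((K, C))
y = (X @ W_teacher.T).argmax(axis=1)     # teacher pseudo-labels

# Student: linear softmax classifier fit to the pseudo-labels by
# gradient descent on the cross-entropy loss
W = np.zeros((K, C))
lr = 0.5
for _ in range(200):
    logits = X @ W.T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(N), y] -= 1.0                     # dL/dlogits for cross-entropy
    W -= lr * (p.T @ X) / N

acc = ((X @ W.T).argmax(axis=1) == y).mean()      # agreement with the teacher
```

The real student in the paper is an off-the-shelf segmentation network trained on whole images, but the mechanics are the same: the teacher's predictions become the training targets.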
ACKNOWLEDGEMENTS
===============================================================================
The project is supported in part by Grant #1749833 from the National Science Foundation of the United States. Our experiments were performed on the University of Massachusetts Amherst GPU cluster obtained under the Collaborative Fund managed by the Mass. Technology Collaborative.

**Cite us:**

(embed bib.txt height=115px here)