Chenyun Wu

PhD student in Computer Vision

CICS, UMass Amherst

Hello! I’m Chenyun Wu (吴晨韵)

I’m a sixth-year PhD student in the Computer Vision Lab at CICS, UMass Amherst, advised by Prof. Subhransu Maji. Before joining UMass, I received my bachelor’s degree from Peking University in 2015 with a double major in physics and computer software.

I’m interested in a broad range of topics in computer vision, especially the combination of vision and natural language. I study the joint modeling of visual and language signals, and I use language supervision to better understand various visual domains, including fine-grained categories, objects/stuff in images, visual textures, and videos.

I expect to graduate in Summer 2021, and I’m looking for research positions in industry.

More details are covered in my resume and research statement.

Recent Updates

  • [07/2020] My paper “Describing Textures using Natural Language” was accepted to ECCV 2020 as an oral paper.
  • [06/2020] I had a research internship at ByteDance in Mountain View working on video and language.
  • [05/2020] I formed my PhD dissertation committee and passed my dissertation proposal. I aim to graduate in 05/2021.
  • [03/2020] My paper “PhraseCut: Language-based Image Segmentation in the Wild” was accepted to CVPR 2020. It is a collaboration with Adobe Research.

Work Experience


Computer Vision Research Intern


Jun 2020 – Sep 2020 Mountain View, CA, US

I worked with Xiaohui Shen, Xiaojie Jin, and Longyin Wen on localizing clips in videos with natural language descriptions.

  • Reproduced and visualized results from state-of-the-art models (DRN, LGI, CMINS). Analyzed and compared datasets (ActivityNet-Captions, Charades-STA, TACoS).
  • Implemented a graph convolutional network to reason about the target clip using the syntactic structure of the language description.
  • Designed an attention mechanism to leverage object detection results to associate frames with nouns in sentences.

Computer Vision Research Intern


May 2018 – Mar 2019 San Jose, CA, US

I worked with Zhe Lin, Scott Cohen, and Trung Bui on large-scale visual grounding. Our work was published at CVPR 2020 as “PhraseCut: Language-based Image Segmentation in the Wild”.

Software Engineering Intern


Jun 2017 – Sep 2017 Mountain View, CA, US

I worked with Nick Johnston, George Toderici, David Minnen, and Michele Covell on deep image compression.

  • Implemented a U-Net image compression model with quantizers and binarizers on skip connections at different levels.
  • Designed and tuned the training procedure to analyze the effectiveness of each skip connection.
  • Enabled different trade-offs between compressed size and quality by turning individual skip connections on or off.