About
My name is Simeng Sun (孙思萌). I am a Ph.D. candidate in Computer Science advised by Prof. Mohit Iyyer. My thesis is about "Understanding the Role of Context in Neural Language Models". I am broadly interested in language modeling, machine translation, and text generation in general. In the summers of 2021 and 2022, I interned at FAIR Accel, where I mainly worked on multilingual machine translation. In summer 2020, I interned at Adobe DIL, working on AI-assisted writing. Before coming to UMass, I worked closely with Prof. Ani Nenkova while I was a master's student at UPenn.
(CV), (Twitter), (LinkedIn)
Education
Ph.D. student in Computer Science, UMass Amherst Aug. 2019 - now
M.S.E in Computer and Information Science, UPenn Aug. 2017 - May 2019
B.E. in Computer Science and Technology, Beihang University Sep. 2013 - Jun. 2017
Exchange student, Trinity College Dublin Sep. 2015 - Jan. 2016
Research
Publications (Google Scholar) (Semantic Scholar)
Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages
Simeng Sun, Maha Elbayad, Anna Sun, James Cross
Conference of the European Chapter of the Association for Computational Linguistics (EACL), long, 2023
ChapterBreak: A Challenge Dataset for Long-Range Language Models
Simeng Sun, Katherine Thai, Mohit Iyyer
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2022
How Much Do Modifications to Transformer Language Models Affect Their Ability to Learn Linguistic Knowledge?
Simeng Sun, Brian Dillon, Mohit Iyyer
Workshop on Insights from Negative Results in NLP @ ACL 2022
Alternative Input Signals Ease Transfer in Multilingual Machine Translation
Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2022
Do Long-Range Language Models Actually Use Long-Range Context?
Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer
Empirical Methods in Natural Language Processing (EMNLP), long, 2021
IGA: An Intent-Guided Authoring Assistant
Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer
Empirical Methods in Natural Language Processing (EMNLP), long, 2021
Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models
Sumanta Bhattacharyya, Pedram Rooshenas, Subhajit Naskar, Simeng Sun, Mohit Iyyer, Andrew McCallum
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2021
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun, Mohit Iyyer
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2021
Hard-Coded Gaussian Attention for Neural Machine Translation
Weiqiu You*, Simeng Sun*, Mohit Iyyer
Annual Meeting of the Association for Computational Linguistics (ACL), long, 2020
The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization
Simeng Sun, Ani Nenkova
Empirical Methods in Natural Language Processing (EMNLP), short, 2019
How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature
Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova
North American Chapter of the Association for Computational Linguistics (NAACL-HLT), NeuralGen Workshop, 2019
Name Disambiguation for Chinese Scientific Authors with Multi-Level Clustering
Simeng Sun, Hui Zhang, Ning Li, Yong Chen
IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), 2017
Manuscripts
How Does In-Context Learning Help Prompt Tuning?
Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer
Feb. 2023
Teaching
CS685: Advanced Natural Language Processing @ UMass Amherst, Fall 2020 (TA)
CS585: Introduction to Natural Language Processing @ UMass Amherst, Fall 2019 (TA)
CIS520: Machine Learning @ UPenn, Spring 2018/2019 (TA)
CIS421/521: Artificial Intelligence @ UPenn, Fall 2018 (TA)
Contact
Email: simengsun AT umass DOT edu
Address: CS266, UMass Amherst
Misc.
I enjoy listening to Lexicon Valley in my spare time. I am a tech volunteer for the OTW Open Doors Project.