Juan Zhai

Assistant Professor

Email: juanzhai at umass dot edu

Office: A211D LGRC

Manning College of Information & Computer Sciences
University of Massachusetts, Amherst
140 Governors Drive, Amherst, MA 01003-9264, USA

Juan Zhai is an Assistant Professor in the Manning College of Information & Computer Sciences (CICS) at the University of Massachusetts, Amherst (UMass). She co-directs the Laboratory for Advanced Software Engineering Research (LASER). She is also a member of the UMass NLP group. She is broadly interested in Software Engineering and Programming Languages, Natural Language Processing and Software Text Analytics, Software Reliability and Security, and Deep Learning.

I am always looking for students to work with. Drop me an email if you are interested in working with me!


Research Areas

Formal Specification Synthesis

The fundamental challenge in developing high-quality software is ensuring that its behavior aligns with the intended specifications. At the heart of this challenge is the persistent absence of formal specifications, which are precise, unambiguous definitions of expected behavior. Formal specifications are essential not only for traditional activities like debugging, testing, and verification, but also for enabling large language model agents to generate correct and reliable code. Despite their importance, formal specifications remain rare in practice because they are complex to author and maintain, especially in fast-evolving, real-world systems.

My research addresses this gap by developing techniques to automatically synthesize formal specifications. Our tool, C2S, successfully translates natural language comments into formal specifications, improving test oracle generation, reducing false positives in automated testing, and enhancing static taint analysis. We have also generated LTL formulas for IoT systems from natural language commands, and are now exploring LLM-driven specification synthesis.

Specification Inference, LLM-driven Synthesis, IoT Systems

Comment Generation and Maintenance

Comments often convey high-level semantic intent, but they are frequently outdated, incomplete, or inconsistent with code. Our research focuses on inferring and maintaining behaviorally accurate comments to support understanding, maintenance, and reasoning.

We developed CPC, a novel software reasoning method that enables bidirectional analysis across comments and code implementation. To keep comments updated as software evolves, we developed LLMCup, a framework that automatically updates comments using large language models with a ranking-based refinement strategy.

Code-Comment Analysis, Automatic Updates, LLM Integration

Trustworthy AI Analysis and Improvement

As AI-driven software systems increasingly impact critical aspects of society, ensuring their trustworthiness has become both a technical and moral imperative. Our research spans the AI stack, with a focus on improving correctness, robustness, and fairness through practical, scalable tools and techniques.

Deep Learning Frameworks Testing

We develop tools to test and enhance the reliability of core AI infrastructure, including ModelMeta, DevMuT, DLJSFuzzer, and Citadel, addressing bugs and inefficiencies in popular deep learning frameworks.

Training Diagnosis and Repair

To prevent wasteful or faulty training runs, we create tools such as AutoTrainer and Dream that proactively detect and repair training issues before models are fully trained.

Bias Detection and Mitigation in ML Systems

We design automated methods to detect, explain, and mitigate bias across the ML lifecycle, from training and pruning to fine-tuning. Our work addresses biases in linguistic style, prompt language, and provider preferences.

Bias Mitigation, Framework Testing, Training Repair, AI Safety

Current PhD Students

Gehao Zhang

PhD Student in Computer Science

Research Focus: Software Engineering, AI Safety

1 PhD Student, Active Research, Recruiting

Join Our Research Group

For Prospective PhD Students

I am actively seeking motivated PhD students interested in Software Engineering, AI Safety, and Trustworthy AI research. A strong background in programming and an interest in research are essential.

Contact: Please email me with your CV, transcript, and research interests.

Application: Apply through the UMass CS PhD program and mention my name in your application.

For Undergraduate and Master's Students

I welcome undergraduate and Master's students interested in research opportunities, independent studies, thesis projects, visiting opportunities, or internships throughout the year.

Contact: Please email me with your CV, transcript, and research interests.

Research Areas Available

  • Formal Specification Synthesis
  • Comment Generation and Maintenance
  • AI/ML System Testing and Debugging
  • Bias Detection and Mitigation
  • Software Engineering for AI

💡 I am always looking for students to work with!

Drop me an email at juanzhai at umass dot edu if you are interested in working with me!

PhD Positions Available, Undergraduate Research, Master's Thesis

Course History

Graduate Level Courses

CS692P: Hot Topics in Software Engineering Research (Fall 2024)
CS520: Theory and Practice of Software Engineering (Fall 2024, Spring 2024)

Undergraduate Level Courses

CS431: Software Engineering (Spring 2023, Spring 2020)
CS112: Data Structure (Multiple Semesters)
CS111: Introduction to Computer Science (Multiple Semesters)

Complete Teaching Timeline

Hot Topics in Software Engineering Research (CS692P) Fall 2024
Theory and Practice of Software Engineering (CS520) Fall 2024
Theory and Practice of Software Engineering (CS520) Spring 2024
Software Engineering (CS431) Spring 2023
Introduction to Computer Science (CS111) Spring 2023
Data Structure (CS112) ×2 Fall 2022
Introduction to Computer Science (CS111) Fall 2022
Data Structure (CS112) ×2 Spring 2022
Data Structure (CS112) ×2 Spring 2021
Introduction to Computer Science (CS111) Spring 2021
Data Structure (CS112) Fall 2020
Introduction to Computer Science (CS111) Fall 2020
Software Engineering (CS431) Spring 2020
Data Structure (CS112) Spring 2020
Introduction to Computer Science (CS111) Spring 2020
Data Structure (CS112) Fall 2019
Introduction to Computer Science (CS111) Fall 2019

Note: ×2 indicates that I taught two separate sections of the course that term.

5+ Years Teaching, 5 Different Courses, 20 Course Offerings

Conference Program Committee & Reviewing

Software Engineering Conferences

ICSE: International Conference on Software Engineering (2026, 2025)
FSE: Symposium on the Foundations of Software Engineering (2025, 2024)
ISSTA: International Symposium on Software Testing and Analysis (2025, 2024)
ASE: International Conference on Automated Software Engineering (2025, 2024, 2023)
ASE NIER: New Ideas and Emerging Results Track (2025)
PROMISE: Predictive Models and Data Analytics in Software Engineering (2025)
AST: International Conference on Automation of Software Test (2023)
ICSE-Demo: ICSE Demo Track (2022)

AI/ML Conferences

CVPR: Conference on Computer Vision and Pattern Recognition (2024, 2023)
ICLR: International Conference on Learning Representations (2025, 2024)
NeurIPS: Conference on Neural Information Processing Systems (2024, 2023)
ICML: International Conference on Machine Learning (2022)

12+ SE Conferences, 4 AI/ML Conferences, 30+ Reviews

Journal Editorial & Reviewing

TOSEM: Transactions on Software Engineering and Methodology (2024, 2023, 2022, 2021)
TSE: Transactions on Software Engineering (2020)
JSS: Journal of Systems and Software (2024, 2023, 2019)
JOS: Journal of Software (2020)
EMSE: Empirical Software Engineering (2019)

5 Major Journals, 10+ Reviews, Multi-year Commitment

Service Impact Summary

16+ Total Conferences
5 Journals
6 Years Active
40+ Total Reviews

Top Venues, Consistent Service, Cross-Disciplinary