Juan Zhai - Academic Homepage

Juan Zhai is an Assistant Professor in the Manning College of Information & Computer Sciences (CICS) at the University of Massachusetts Amherst (UMass). She co-directs the Laboratory for Advanced Software Engineering Research (LASER) and is a member of the UMass NLP group. She is broadly interested in Software Engineering and Programming Languages, Natural Language Processing and Software Text Analytics, Software Reliability and Security, and Deep Learning.

I am always looking for students to work with. Drop me an email if you are interested in working with me!

Formal Specification Synthesis

The fundamental challenge in developing high-quality software is ensuring that its behavior aligns with the intended specifications. At the heart of this challenge is the persistent absence of formal specifications, which are precise, unambiguous definitions of expected behavior. Formal specifications are essential not only for traditional activities like debugging, testing, and verification, but also for enabling large language model agents to generate correct and reliable code. Despite their importance, formal specifications remain rare in practice because they are complex to author and maintain, especially in fast-evolving, real-world systems.

My research addresses this gap by developing techniques to automatically synthesize formal specifications. Our tool, C2S, translates natural language comments into formal specifications, improving test oracle generation, reducing false positives in automated testing, and enhancing static taint analysis. We have also generated LTL formulas for IoT systems from natural language commands, and are now exploring LLM-driven specification synthesis.

Comment Generation and Maintenance

Comments often convey high-level semantic intent, but they are frequently outdated, incomplete, or inconsistent with code. Our research focuses on inferring and maintaining behaviorally accurate comments to support understanding, maintenance, and reasoning.

We developed CPC, a novel software reasoning method that enables bidirectional analysis across comments and code implementation. To keep comments updated as software evolves, we developed LLMCup, a framework that automatically updates comments using large language models with a ranking-based refinement strategy.

Trustworthy AI Analysis and Improvement

As AI-driven software systems increasingly impact critical aspects of society, ensuring their trustworthiness has become both a technical and moral imperative. Our research spans the AI stack, with a focus on improving correctness, robustness, and fairness through practical, scalable tools and techniques.

Deep Learning Frameworks Testing

We develop tools to test and enhance the reliability of core AI infrastructure, including ModelMeta, DevMuT, DLJSFuzzer, and Citadel, addressing bugs and inefficiencies in popular deep learning frameworks.

Training Diagnosis and Repair

To prevent wasteful or faulty training runs, we create tools such as AutoTrainer and Dream that proactively detect and repair training issues before models are fully trained.

Bias Detection and Mitigation in ML Systems

We design automated methods to detect, explain, and mitigate bias across the ML lifecycle, from training and pruning to fine-tuning. Our work addresses biases in linguistic style, prompt language, and provider preferences.

Current Students

PhD Students (Advisor)

Gehao Zhang

2nd year PhD student

PhD Students (Committee Member)

Juan Altmayer Pizzorno

5th year PhD student

Zhanna Kaufman

5th year PhD student

Master's Students

Po-Hsiang Wang

1st year Master's student

Undergraduate Students

Benson Zheng

Senior undergraduate student

Sheyan Yu

Senior undergraduate student

Join Our Research Group

For Prospective PhD Students

I am actively seeking motivated PhD students interested in Software Engineering, AI Safety, and Trustworthy AI research. A strong background in programming and an interest in research are essential.

Application: Apply through the UMass CS PhD program and mention my name in your application.

For Undergraduate and Master's Students

I welcome motivated undergraduate and Master's students to join our research activities. Research experience provides valuable skills and insights for your academic and professional development.

Opportunities: Independent study courses, research assistant positions, and co-authorship on publications.

📧 Contact: Please email me with your CV, transcript, and research interests.

Graduate Level Courses

CS692P

Hot Topics in Software Engineering Research

Fall 2024

CS520

Theory and Practice of Software Engineering

Fall 2025, Fall 2024, Spring 2024

Undergraduate Level Courses

CS431

Software Engineering

Spring 2023, Spring 2020

CS112

Data Structure

Multiple Semesters

CS111

Introduction to Computer Science

Multiple Semesters

Complete Teaching Timeline

Theory and Practice of Software Engineering (CS520) Fall 2025

Hot Topics in Software Engineering Research (CS692P) Fall 2024

Theory and Practice of Software Engineering (CS520) Fall 2024

Theory and Practice of Software Engineering (CS520) Spring 2024

Software Engineering (CS431) Spring 2023

Introduction to Computer Science (CS111) Spring 2023

Data Structure (CS112) ×2 Fall 2022

Introduction to Computer Science (CS111) Fall 2022

Data Structure (CS112) ×2 Spring 2022

Data Structure (CS112) ×2 Spring 2021

Introduction to Computer Science (CS111) Spring 2021

Data Structure (CS112) Fall 2020

Introduction to Computer Science (CS111) Fall 2020

Software Engineering (CS431) Spring 2020

Data Structure (CS112) Spring 2020

Introduction to Computer Science (CS111) Spring 2020

Data Structure (CS112) Fall 2019

Introduction to Computer Science (CS111) Fall 2019

Note: ×2 indicates that I taught two separate sections of the course that term.

Conference Program Committee & Reviewing

Software Engineering Conferences

ICSE (International Conference on Software Engineering) 2026, 2025

FSE (Symposium on the Foundations of Software Engineering) 2025, 2024

ISSTA (International Symposium on Software Testing and Analysis) 2026, 2025, 2024

ASE (International Conference on Automated Software Engineering) 2025, 2024, 2023

FSE IVR (Ideas, Visions, and Reflections Track) 2026

ASE NIER (New Ideas and Emerging Results Track) 2025

PROMISE (Predictive Models and Data Analytics in Software Engineering) 2025

LLM4Code (International Workshop on Large Language Models for Code) 2026

AST (International Conference on Automation of Software Test) 2023

ICSE-Demo (ICSE Demo Track) 2022

AI/ML Conferences

CVPR (Conference on Computer Vision and Pattern Recognition) 2024, 2023

ICLR (International Conference on Learning Representations) 2026, 2025, 2024

NeurIPS (Conference on Neural Information Processing Systems) 2024, 2023

ICML (International Conference on Machine Learning) 2022

Journal Editorial & Reviewing

TOSEM (Transactions on Software Engineering and Methodology) 2024, 2023, 2022, 2021

TSE (Transactions on Software Engineering) 2020

JSS (Journal of Systems and Software) 2024, 2023, 2019

JOS (Journal of Software) 2020

EMSE (Empirical Software Engineering) 2019