About
Juan Zhai is an Assistant Professor in the Manning College of Information & Computer Sciences (CICS) at the University of Massachusetts Amherst (UMass). She co-directs the Laboratory for Advanced Software Engineering Research (LASER). She is also a member of the UMass NLP group. She is broadly interested in Software Engineering and Programming Languages, Natural Language Processing and Software Text Analytics, Software Reliability and Security, and Deep Learning.
Contact Information
juanzhai at umass dot edu
Office
A211D LGRC
Research
- Formal specification synthesis: One of the main challenges in developing high-quality software is ensuring that its behavior aligns with the intended specifications. Formal specifications can significantly enhance debugging, testing, and verification. However, their adoption is limited because writing them is error-prone and requires extensive manual effort and specialized expertise. A promising way to overcome this challenge is to automatically generate formal specifications from the rich natural language artifacts that commonly describe the expected behavior of software systems. Our work C2S translates natural language comments into formal specifications, which have proven helpful in generating new test oracles, reducing false alarms in automated testing, and improving static taint analysis. In addition to program specifications, we also generate Linear Temporal Logic (LTL) formulas for IoT systems from user-defined commands in natural language. We are currently exploring the use of large language models to generate formal specifications.
- Comment generation and maintenance: Natural language artifacts, such as comments, user manuals, and API documentation, provide human-interpretable descriptions of a program. These artifacts facilitate code comprehension by supplying abundant information that can be leveraged for software development and maintenance. However, developers are less motivated to write and update comments than to write functional code, which makes comment-driven analysis challenging or even infeasible. We developed a novel software reasoning method, CPC, that enables bidirectional analysis across comments and code implementation for the first time: (1) program analysis propagates and updates comments; and (2) comments provide additional semantic hints to enrich program analysis. The fine-grained comment taxonomy proposed in this work has been widely adopted in the community. Building on advances in large language models, we are now using these models to generate comments.
- Trustworthy AI Ecosystems: In an era where artificial intelligence systems increasingly influence every aspect of society, enhancing the trustworthiness of AI models has become crucial. This is not just a technical challenge but a moral one. Our work spans the entire AI stack, from testing general frameworks and execution environments to testing and automatically repairing model bugs. The work has also been extended to domain-specific areas, e.g., machine translation. We focus on key properties, e.g., robustness and fairness, across training, pruning, and fine-tuning.
Students
- Current PhD Students: Gehao Zhang, Jiasheng Gu
- Drop me an email if you are interested in working with me!
Services
- TPC/Reviewer, International Conference on Software Engineering (ICSE), 2025
- TPC/Reviewer, Symposium on the Foundations of Software Engineering (FSE), 2025, 2024
- TPC/Reviewer, International Symposium on Software Testing and Analysis (ISSTA), 2025, 2024
- TPC/Reviewer, International Conference on Automated Software Engineering (ASE), 2024, 2023
- TPC/Reviewer, Conference on Computer Vision and Pattern Recognition (CVPR), 2024, 2023
- TPC/Reviewer, International Conference on Learning Representations (ICLR), 2025, 2024
- TPC/Reviewer, Conference on Neural Information Processing Systems (NeurIPS), 2024, 2023
- TPC/Reviewer, International Conference on Automation of Software Test (AST), 2023
- TPC/Reviewer, International Conference on Software Engineering, Demo Track (ICSE-Demo), 2022
- TPC/Reviewer, International Conference on Machine Learning (ICML), 2022
- TPC/Reviewer, Transactions on Software Engineering and Methodology (TOSEM), 2024, 2023, 2022, 2021
- TPC/Reviewer, Transactions on Software Engineering (TSE), 2020
- TPC/Reviewer, Journal of Systems and Software (JSS), 2024, 2023, 2019
- TPC/Reviewer, Journal of Software (JOS), 2020
- TPC/Reviewer, Empirical Software Engineering (EMSE), 2019
Teaching
- Fall 24: Hot Topics in Software Engineering Research (CS692P)
- Fall 24: Theory and Practice of Software Engineering (CS520)
- Spring 24: Theory and Practice of Software Engineering (CS520)
- Spring 23: Software Engineering (CS431)
- Spring 23: Introduction to Computer Science (CS111)
- Fall 22: Data Structure (CS112)
- Fall 22: Introduction to Computer Science (CS111)
- Spring 22: Data Structure (CS112)
- Spring 21: Data Structure (CS112)
- Spring 21: Introduction to Computer Science (CS111)
- Fall 20: Data Structure (CS112)
- Fall 20: Introduction to Computer Science (CS111)
- Spring 20: Software Engineering (CS431)
- Spring 20: Data Structure (CS112)
- Spring 20: Introduction to Computer Science (CS111)
- Fall 19: Data Structure (CS112)
- Fall 19: Introduction to Computer Science (CS111)