Chenghao Lyu

Ph.D. Candidate

UMass Amherst

About Me

My name is Chenghao Lyu. I am a fifth-year Ph.D. student in the College of Information and Computer Sciences at the University of Massachusetts Amherst, where I am part of the DREAM Lab. I am advised by Prof. Yanlei Diao and co-advised by Prof. Prashant Shenoy. I am currently visiting the CEDAR project-team of Inria and LIX at Ecole Polytechnique (X). Before joining UMass Amherst, I received my BS in EE and MS in CS from Fudan University, where I was advised by Prof. X. Sean Wang.

My research interests lie in big data analytics systems, with a focus on job performance modeling and tuning to optimize multiple system goals (e.g., total running time and monetary cost).

Email: {first-name}

Interests

  • Big Data Analytics Systems
  • Machine Learning
  • Multi-objective Optimization

Education

  • MS/PhD in Computer Science, 2018 - Present

    UMass Amherst, MA, USA

  • MSc in Electronic Engineering, 2018

    Fudan University, Shanghai, China

  • BSc in Electronic Engineering, 2015

    Fudan University, Shanghai, China


[2022.07] Our paper with Alibaba was accepted to VLDB 2022!

[2021.10] I am working as a scientific collaborator in the CEDAR project-team of Inria and LIX at Ecole Polytechnique.

[2020.10] Our paper was accepted to ICDE 2021!

[2020.02] I started my internship at DAMO Academy.


(2022). Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing. In PVLDB, 15(11), 2022.


(2021). Spark-based Cloud Data Analytics using Multi-Objective Optimization. In ICDE, 2021.


(2021). Neural-based Modeling for Performance Tuning of Spark Data Analytics. arXiv, 2021.


(2019). UDAO: A Next-Generation Unified Data Analytics Optimizer. In PVLDB, 12(12), 2019.



The CEDAR project-team, LIX, Ecole Polytechnique
Scientific Collaborator
Oct 2021 – Present, Paris
Developing a Unified Data Analytics Optimizer (UDAO) system.

DAMO Academy, Alibaba
Research Intern
Feb 2020 – Dec 2021, Remote & Hangzhou
Designed a new resource optimizer architecture for big data systems. Reduced latency by 36-37% and cost by 37-75%, evaluated on production workloads of 0.6M jobs and a simulator of the extended Alibaba MaxCompute environment.