UDAO: A Next-Generation Unified Data Analytics Optimizer


Big data analytics systems today still lack the ability to take user performance goals and budgetary constraints, collectively referred to as “objectives”, and automatically configure an analytic job to achieve the objectives. This paper presents UDAO, a unified data analytics optimizer that can automatically determine the parameters of the runtime system, collectively called a job configuration, for general dataflow programs based on user objectives. UDAO embodies key techniques including in-situ modeling, which learns a model for each user objective in the same computing environment as the job is run, and multi-objective optimization, which computes a Pareto optimal set of job configurations to reveal tradeoffs between different objectives. Using benchmarks developed based on industry needs, our demonstration will allow the user to explore (1) learned models to gain insights into how various parameters affect user objectives; (2) Pareto frontiers to understand interesting tradeoffs between different objectives and how a configuration recommended by the optimizer explores these tradeoffs; (3) end-to-end benefits that UDAO can provide over default configurations or those manually tuned by engineers.
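As a rough illustration of what a Pareto optimal set of job configurations means, the sketch below filters a handful of candidate configurations down to those not dominated on two objectives. The parameter names (`executors`, `cores`) and the latency and cost figures are hypothetical, invented for this example. In UDAO the objective values would come from learned models, and this brute-force filter is not the optimizer's actual algorithm.

```python
from typing import Dict, List, Tuple

Config = Dict[str, int]
# Hypothetical objectives: (latency in seconds, cost in dollars), both minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(
    candidates: List[Tuple[Config, Objectives]]
) -> List[Tuple[Config, Objectives]]:
    """Keep only the configurations that no other candidate dominates."""
    return [
        (cfg, obj)
        for cfg, obj in candidates
        if not any(dominates(other, obj) for _, other in candidates)
    ]


# Made-up predictions for five job configurations (illustrative only).
candidates = [
    ({"executors": 2, "cores": 2}, (120.0, 1.0)),
    ({"executors": 4, "cores": 2}, (70.0, 1.6)),
    ({"executors": 4, "cores": 4}, (65.0, 2.9)),
    ({"executors": 8, "cores": 2}, (72.0, 3.0)),  # slower and costlier than 4x2
    ({"executors": 8, "cores": 4}, (40.0, 5.5)),
]

front = pareto_set(candidates)
for cfg, (latency, cost) in front:
    print(cfg, latency, cost)
```

Each configuration that survives the filter represents a different tradeoff: lowering latency further requires accepting a higher cost, and vice versa. A recommended configuration would be one point chosen from this set.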

In PVLDB 12, 12 (2019)
Chenghao Lyu
fourth-year Ph.D. student

My research interests include big data analytics systems, machine learning, and multi-objective optimization.