by Xiaolan Wang, Alexandra Meliou, Eugene Wu
Abstract:
An increasing number of applications in all aspects of society rely on data. Despite the long line of research in data cleaning and repairs, data correctness has been an elusive goal. Errors in the data can be extremely disruptive, and are detrimental to the effectiveness and proper function of data-driven applications. Even when data is cleaned, new errors can be introduced by applications and users who interact with the data. Subsequent valid updates can obscure these errors and propagate them through the dataset causing more discrepancies. Any discovered errors tend to be corrected superficially, on a case-by-case basis, further obscuring the true underlying cause, and making detection of the remaining errors harder. In this demo proposal, we outline the design of QFix, a query-centric framework that derives explanations and repairs for discrepancies in relational data based on potential errors in the queries that operated on the data. This is a marked departure from traditional data-centric techniques that directly fix the data. We then describe how users will use QFix in a demonstration scenario. Participants will be able to select from a number of transactional benchmarks, introduce errors into the queries that are executed, and compare the fixes to the queries proposed by QFix as well as existing alternative algorithms such as decision trees.
Citation:
Xiaolan Wang, Alexandra Meliou, and Eugene Wu, QFix: Demonstrating error diagnosis in query histories, in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 2016, pp. 2177–2180 (Demonstration paper).
Bibtex:
@inproceedings{WangMW2016,
author = {Xiaolan Wang and Alexandra Meliou and Eugene Wu},
title = {\href{http://people.cs.umass.edu/ameli/projects/queryProvenance/papers/QFix-demo.pdf}{{QFix}: Demonstrating error diagnosis in query histories}},
abstract = {An increasing number of applications in all aspects of society rely on data. Despite the long line of research in data cleaning and repairs, data correctness has been an elusive goal. Errors in the data can be extremely disruptive, and are detrimental to the effectiveness and proper function of data-driven applications. Even when data is cleaned, new errors can be introduced by applications and users who interact with the data. Subsequent valid updates can obscure these errors and propagate them through the dataset causing more discrepancies. Any discovered errors tend to be corrected superficially, on a case-by-case basis, further obscuring the true underlying cause, and making detection of the remaining errors harder.
In this demo proposal, we outline the design of QFix, a query-centric framework that derives explanations and repairs for discrepancies in relational data based on potential errors in the queries that operated on the data. This is a marked departure from traditional data-centric techniques that directly fix the data. We then describe how users will use QFix in a demonstration scenario. Participants will be able to select from a number of transactional benchmarks, introduce errors into the queries that are executed, and compare the fixes to the queries proposed by QFix as well as existing alternative algorithms such as decision trees.},
booktitle = {Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD)},
venue = {SIGMOD},
year = {2016},
pages = {2177--2180},
comment = {Demonstration paper},
doi = {10.1145/2882903.2899388},
}