Why So? or Why No? Functional Causality for Explaining Query Answers
by Alexandra Meliou, Wolfgang Gatterbauer, Katherine F. Moore, Dan Suciu
Abstract:
In this paper, we propose causality as a unified framework to explain query answers and non-answers, thus generalizing and extending several previously proposed definitions of provenance and missing query result explanations. Starting from the established definition of actual causes by Halpern and Pearl [12], we propose functional causes as a refined definition of causality with several desirable properties. These properties allow us to apply our notion of causality in a database context and apply it uniformly to define the causes of query results and their individual contributions in several ways: (i) we can model both provenance as well as non-answers, (ii) we can define explanations as either data in the input relations or relational operations in a query plan, and (iii) we can give graded degrees of responsibility to individual causes, thus allowing us to rank causes. In particular, our approach allows us to explain contributions to relational aggregate functions and to rank causes according to their respective responsibilities, aiding users in identifying errors in uncertain or untrusted data. Throughout the paper, we illustrate the applicability of our framework with several examples. This is the first work that treats "positive" and "negative" provenance under the same framework, and establishes the theoretical foundations of causality theory in a database context.
Citation:
Alexandra Meliou, Wolfgang Gatterbauer, Katherine F. Moore, and Dan Suciu, Why So? or Why No? Functional Causality for Explaining Query Answers, in Proceedings of the 4th International VLDB Workshop on Management of Uncertain Data (MUD) in conjunction with VLDB, 2010, pp. 3–17.
Bibtex:
@inproceedings{DBLP:conf/mud/MeliouGMS10,
    Abstract = {In this paper, we propose causality as a unified framework to
    explain query answers and non-answers, thus generalizing and extending
    several previously proposed definitions of provenance and missing query
    result explanations. Starting from the established definition of actual
    causes by Halpern and Pearl [12], we propose functional causes as a
    refined definition of causality with several desirable properties. These
    properties allow us to apply our notion of causality in a database context
    and apply it uniformly to define the causes of query results and their
    individual contributions in several ways: (i) we can model both provenance
    as well as non-answers, (ii) we can define explanations as either data in
    the input relations or relational operations in a query plan, and (iii) we
    can give graded degrees of responsibility to individual causes, thus
    allowing us to rank causes. In particular, our approach allows us to
    explain contributions to relational aggregate functions and to rank causes
    according to their respective responsibilities, aiding users in
    identifying errors in uncertain or untrusted data. Throughout the paper,
    we illustrate the applicability of our framework with several examples.
    This is the first work that treats "positive" and "negative" provenance
    under the same framework, and establishes the theoretical foundations of
    causality theory in a database context.},
    Author = {Alexandra Meliou and Wolfgang Gatterbauer and Katherine F. Moore and Dan Suciu},
    Booktitle = {Proceedings of the 4th International VLDB Workshop on Management of Uncertain Data (MUD) in conjunction with VLDB},
    Pages = {3--17},
    Title = {\href{http://people.cs.umass.edu/ameli/projects/causality/papers/MUD2010.pdf}{{Why So}? or {Why No}? Functional Causality for Explaining Query Answers}},
    Venue = {MUD},
    address = {Singapore},
    month = {September},
    Year = {2010}
}