Reverse Data Management
by Alexandra Meliou, Wolfgang Gatterbauer, Dan Suciu
Abstract:
Database research mainly focuses on forward-moving data flows: source data is subjected to transformations and evolves through queries, aggregations, and view definitions to form a new target instance, possibly with a different schema. This Forward Paradigm underpins most data management tasks today, such as querying, data integration, data mining, etc. We contrast this forward processing with Reverse Data Management (RDM), where the action needs to be performed on the input data, on behalf of desired outcomes in the output data. Some data management tasks already fall under this paradigm, for example updates through views, data generation, data cleaning and repair. RDM is, by necessity, conceptually more difficult to define, and computationally harder to achieve. Today, however, as increasingly more of the available data is derived from other data, there is an increased need to be able to modify the input in order to achieve a desired effect on the output, motivating a systematic study of RDM. We define the Reverse Data Management problem, and classify RDM problems into four categories. We illustrate known examples of RDM problems and classify them under these categories. Finally, we introduce a new type of RDM problem, How-To Queries.
Citation:
Alexandra Meliou, Wolfgang Gatterbauer, and Dan Suciu, Reverse Data Management, PVLDB, vol. 4, no. 11, 2011, pp. 1490–1493.
Bibtex:
@article{DBLP:journals/pvldb/MeliouGS11,
    Abstract = {Database research mainly focuses on forward-moving data flows:
    source data is subjected to transformations and evolves through queries,
    aggregations, and view definitions to form a new target instance, possibly
    with a different schema. This Forward Paradigm underpins most data
    management tasks today, such as querying, data integration, data mining,
    etc. We contrast this forward processing with Reverse Data Management
    (RDM), where the action needs to be performed on the input data, on behalf
    of desired outcomes in the output data. Some data management tasks already
    fall under this paradigm, for example updates through views, data
    generation, data cleaning and repair. RDM is, by necessity, conceptually
    more difficult to define, and computationally harder to achieve. Today,
    however, as increasingly more of the available data is derived from other
    data, there is an increased need to be able to modify the input in order
    to achieve a desired effect on the output, motivating a systematic study
    of RDM. We define the Reverse Data Management problem, and classify RDM
    problems into four categories. We illustrate known examples of RDM
    problems and classify them under these categories. Finally, we introduce a
    new type of RDM problem, How-To Queries.},
    Author = {Alexandra Meliou and Wolfgang Gatterbauer and Dan Suciu},
    Journal = {PVLDB},
    Number = {11},
    Pages = {1490--1493},
    Title = {Reverse Data Management},
    Url = {http://www.vldb.org/pvldb/vol4/p1490-meliou.pdf},
    Venue = {PVLDB},
    Volume = {4},
    Year = {2011}
}