Pythia: Unsupervised generation of ambiguous textual claims from relational data

Veltri, Enzo; Santoro, Donatello; Badaro, Gilbert; Saeed, Mohammed; Papotti, Paolo

SIGMOD 2022, ACM International Conference on Management of Data, June 12-17, 2022, Philadelphia, PA, USA

Best Demo Paper Award

Applications such as computational fact checking and data-to-text generation exploit the relationship between relational data and natural language text. Despite promising results in these areas, state-of-the-art solutions fail to handle “data ambiguity”, i.e., the case in which there are multiple interpretations of the relationship between a textual sentence and the relational data. To tackle this problem, we introduce Pythia, a system that, given a relational table, generates textual sentences that contain factual ambiguities w.r.t. the data in that table. Such sentences can then be used to train target applications to handle data ambiguity. In this demonstration, we first show how our system generates data-ambiguous sentences for a given table in an unsupervised fashion through data profiling and query generation. We then demonstrate how two existing applications benefit from Pythia’s generated sentences, improving on state-of-the-art results. The audience will interact with Pythia by changing input parameters interactively, including uploading their own dataset to see which data-ambiguous sentences are generated for it.
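To make the notion of multiple interpretations concrete, the following is a minimal sketch and not Pythia's actual method. The toy table, its column names, and the helper functions `interpretations` and `is_data_ambiguous` are all hypothetical. The sketch evaluates a claim of the form "A has higher shooting than B" under each column that the vague word "shooting" could plausibly refer to. As a simplifying assumption, it flags the claim as ambiguous when those readings disagree on its truth value.

```python
# Hypothetical toy table: both numeric columns could be read as "shooting".
TABLE = [
    {"player": "A", "field_goal_pct": 0.50, "three_point_pct": 0.30},
    {"player": "B", "field_goal_pct": 0.45, "three_point_pct": 0.40},
]


def interpretations(table, key_col, left, right, candidate_cols):
    """Evaluate 'left has a higher <vague term> than right' once per candidate column."""
    rows = {row[key_col]: row for row in table}
    return {col: rows[left][col] > rows[right][col] for col in candidate_cols}


def is_data_ambiguous(results):
    """Simplifying assumption: the claim is ambiguous if the readings disagree."""
    return len(set(results.values())) > 1


results = interpretations(
    TABLE, "player", "A", "B", ["field_goal_pct", "three_point_pct"]
)
for column, holds in results.items():
    print(f"reading 'shooting' as {column}: claim holds = {holds}")
print("ambiguous:", is_data_ambiguous(results))
```

In this toy setting the claim is true under one reading and false under the other, which is the kind of sentence a fact-checking or data-to-text model would need to learn to handle.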

Type: Poster / Demo

City: Philadelphia

Date: 2022-06-12

Department: Data Science

Eurecom Ref: 6838

© ACM, 2022. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in SIGMOD 2022, ACM International Conference on Management of Data, June 12-17, 2022, Philadelphia, PA, USA https://doi.org/10.1145/3514221.3520164