Ambiguity detection and textual claims generation from relational data

Veltri, Enzo; Santoro, Donatello; Badaro, Gilbert; Saeed, Mohammed; Papotti, Paolo
SEBD 2022, 30th Symposium on Advanced Database Systems, 19-22 June 2022, Tirrenia (Pisa), Italy

Computational fact checking (given a textual claim and a table, verify whether the claim holds w.r.t. the given data) and data-to-text generation (given a subset of cells, produce a sentence describing them) exploit the relationship between relational data and natural language text. Despite promising results in these areas, state-of-the-art solutions fail to handle “data-ambiguity”, i.e., the case in which there are multiple interpretations of the relationship between the textual sentence and the relational data. To tackle this problem, we present Pythia, a system that, given a relational table, generates textual sentences that contain factual ambiguities w.r.t. the data in that table. Such sentences can then be used to train target applications to handle data-ambiguity. In this paper, we discuss how Pythia generates data-ambiguous sentences for a given table in an unsupervised fashion using data profiling and query generation. We then show how two existing downstream applications, namely data-to-text generation and computational fact checking, benefit from Pythia’s generated sentences, improving on state-of-the-art results without manual user effort.
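
To make the notion of data-ambiguity concrete, the following minimal Python sketch checks one claim against a toy table under two possible readings of an underspecified attribute. It is an illustration only, not Pythia's implementation: the table values, the column names (fg_pct, three_fg_pct), the claim, and the helper functions are all hypothetical.

```python
from typing import Dict, List

# Toy table with made-up values, for illustration only.
TABLE: List[Dict[str, object]] = [
    {"player": "Carter", "fg_pct": 56, "three_fg_pct": 47},
    {"player": "Smith", "fg_pct": 55, "three_fg_pct": 50},
]


def interpretations(
    table: List[Dict[str, object]],
    candidate_columns: List[str],
    subject: str,
    other: str,
    key: str = "player",
) -> Dict[str, bool]:
    """Evaluate 'subject is higher than other' once per candidate column."""
    rows = {row[key]: row for row in table}
    return {
        col: rows[subject][col] > rows[other][col]
        for col in candidate_columns
    }


def is_ambiguous(results: Dict[str, bool]) -> bool:
    """In this sketch, a claim is ambiguous if its readings disagree."""
    return len(set(results.values())) > 1


# Hypothetical claim: "Carter has a higher shooting percentage than Smith."
# "Shooting percentage" could refer to either of the two columns below.
results = interpretations(TABLE, ["fg_pct", "three_fg_pct"], "Carter", "Smith")
print(results)
print(is_ambiguous(results))
```

Here the claim holds under one reading of "shooting percentage" and fails under the other, so a fact checker cannot label it true or false without first resolving which column is meant.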


Type:
Conference
City:
Tirrenia
Date:
2022-06-19
Department:
Data Science
Eurecom Ref:
6939
Copyright:
CEUR

PERMALINK : https://www.eurecom.fr/publication/6939