Scrutinizer: A mixed-initiative approach to large-scale, data-driven claim verification

Karagiannis, Georgios; Saeed, Mohammed; Papotti, Paolo; Trummer, Immanuel
VLDB 2020, 46th International Conference on Very Large Data Bases, 31 August-4 September 2020, Tokyo, Japan (Virtual Conference) / To be published in PVLDB 2020, Proceedings of the VLDB Endowment, Vol.13, N°11, August 2020

Organizations spend significant amounts of time and money to manually fact check text documents summarizing data. The goal of the Scrutinizer system is to reduce verification overheads by supporting human fact checkers in translating text claims into SQL queries on a database. Scrutinizer coordinates teams of human fact checkers. It reduces verification time by proposing queries or query fragments to the users. Those proposals are based on claim text classifiers that gradually improve during the verification of a large document. In addition, Scrutinizer uses tentative execution of query candidates to narrow down the set of alternatives. The verification process is controlled by a cost-based optimizer. It optimizes the interaction with users and prioritizes claim verifications. For the latter, it considers expected verification overheads as well as the expected claim utility as training samples for the classifiers. We evaluate the Scrutinizer system using simulations and a user study with professional fact checkers, based on actual claims and data. Our experiments consistently demonstrate significant savings in verification time, without reducing result accuracy.
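
The idea of tentatively executing query candidates to narrow down alternatives can be illustrated with a minimal sketch. It is not the Scrutinizer implementation: the function `narrow_candidates`, the `energy` table, its rows, the candidate queries, and the 5% tolerance are all hypothetical, chosen only to show how running each candidate and comparing its result with the value stated in a claim can discard implausible translations.

```python
import sqlite3


def narrow_candidates(conn, candidates, claimed_value, rel_tol=0.05):
    """Return the candidate queries whose single-value result is close to the claim.

    Hypothetical helper for illustration; the tolerance is an assumption.
    """
    kept = []
    for sql in candidates:
        try:
            row = conn.execute(sql).fetchone()
        except sqlite3.Error:
            continue  # a candidate that fails to execute is discarded
        if row is None or row[0] is None:
            continue  # no result to compare against the claim
        if abs(row[0] - claimed_value) <= rel_tol * abs(claimed_value):
            kept.append(sql)
    return kept


# Toy data, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE energy (region TEXT, year INTEGER, demand REAL)")
conn.executemany(
    "INSERT INTO energy VALUES (?, ?, ?)",
    [("EU", 2016, 100.0), ("EU", 2017, 103.0),
     ("US", 2016, 200.0), ("US", 2017, 198.0)],
)

# Candidate translations of a claim such as "EU demand grew by 3% in 2017".
candidates = [
    "SELECT demand FROM energy WHERE region = 'EU' AND year = 2017",
    "SELECT demand FROM energy WHERE region = 'US' AND year = 2017",
    "SELECT 100.0 * (b.demand - a.demand) / a.demand "
    "FROM energy a JOIN energy b ON a.region = b.region "
    "WHERE a.region = 'EU' AND a.year = 2016 AND b.year = 2017",
    "SELECT demand FROM missing_table",
]

surviving = narrow_candidates(conn, candidates, claimed_value=3.0)
```

In this toy setting only the growth-rate query survives, so a fact checker would be shown one proposal instead of four. A candidate surviving does not prove the claim correct; it only means the query is a plausible translation that a human can then confirm or reject.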


Type:
Conference
City:
Tokyo
Date:
2020-08-31
Department:
Data Science
Eurecom Ref:
6216
Copyright:
© ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in VLDB 2020, 46th International Conference on Very Large Data Bases, 31 August-4 September 2020, Tokyo, Japan (Virtual Conference) / To be published in PVLDB 2020, Proceedings of the VLDB Endowment, Vol.13, N°11, August 2020 https://doi.org/10.14778/3407790.3407841

PERMALINK : https://www.eurecom.fr/publication/6216