DAGOBAH: Enhanced scoring algorithms for scalable annotations of tabular data

Huynh, Viet-Phi; Liu, Jixiong; Chabot, Yoan; Labbé, Thomas; Monnin, Pierre; Troncy, Raphaël
SemTab 2020, Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, co-located with the 19th International Semantic Web Conference (ISWC 2020), 5 November 2020, Athens, Greece (Virtual Conference)

We present new approaches used in the DAGOBAH system to perform automatic semantic table interpretation. DAGOBAH semantically annotates tables with Wikidata entities and relations to perform three tasks: Columns-Property Annotation (CPA), Cell-Entity Annotation (CEA) and Column-Type Annotation (CTA). In our system, the initial scores from entity disambiguation influence the CPA output, which, in turn, influences the output of the CEA. Finally, the CTA is computed using the type hierarchy available in the knowledge graph in order to annotate columns with the most suitable fine-grained types. By leveraging these mutual influences between annotations, DAGOBAH obtains very competitive results on all tasks of the SemTab 2020 challenge.
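The flow described in the abstract (initial disambiguation scores feed the CPA, the CPA feeds back into the CEA, and the CTA is then read off a type hierarchy) can be illustrated with a minimal sketch. This is not DAGOBAH's actual scoring. The toy knowledge graph, the entity and property names, the `annotate` function, and the `coverage` and `boost` parameters are all assumptions made up for illustration, and the type hierarchy is simplified to a per-entity list ordered from most specific to most generic.

```python
from collections import defaultdict

# Toy knowledge graph (hypothetical data, not real Wikidata content).
# FACTS maps a subject entity to the (property, object) pairs it appears in.
FACTS = {
    "paris_fr": {("capital_of", "france")},
    "paris_tx": {("located_in", "usa")},
    "berlin_de": {("capital_of", "germany")},
}
# TYPES lists each entity's types from most specific to most generic.
TYPES = {
    "paris_fr": ["capital", "city", "place"],
    "paris_tx": ["city", "place"],
    "berlin_de": ["capital", "city", "place"],
}


def annotate(rows, coverage=0.6, boost=1.0):
    """rows: list of (subject_candidates, object_candidates) pairs,
    each a dict mapping a candidate entity to its initial score."""
    # CPA: each candidate pair linked by a property votes for that property,
    # weighted by the product of the two initial disambiguation scores.
    votes = defaultdict(float)
    for subj, obj in rows:
        for s, s_score in subj.items():
            for prop, o in FACTS.get(s, ()):
                if o in obj:
                    votes[prop] += s_score * obj[o]
    cpa = max(votes, key=votes.get) if votes else None

    # CEA: candidates that support the chosen property get a score boost.
    cea = []
    for subj, obj in rows:
        def rescored(s):
            supports = any(p == cpa and o in obj for p, o in FACTS.get(s, ()))
            return subj[s] + (boost if supports else 0.0)
        cea.append(max(subj, key=rescored))

    # CTA: keep types covering enough annotated cells, pick the most specific.
    counts = defaultdict(int)
    depth = {}
    for entity in cea:
        for rank, t in enumerate(TYPES.get(entity, [])):
            counts[t] += 1
            depth[t] = min(depth.get(t, rank), rank)
    eligible = [t for t in counts if counts[t] / len(cea) >= coverage]
    cta = min(eligible, key=lambda t: depth[t]) if eligible else None
    return cpa, cea, cta


rows = [
    ({"paris_tx": 0.6, "paris_fr": 0.5}, {"france": 0.9}),
    ({"berlin_de": 0.8}, {"germany": 0.9}),
]
cpa, cea, cta = annotate(rows)
```

In this toy table the first cell's top initial candidate is `paris_tx`, but once the column pair is annotated with `capital_of` the rescoring prefers `paris_fr`, and the column type then resolves to the fine-grained `capital` rather than the more generic `city`.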


Type: Conference
City: Athens
Date: 2020-11-04
Department: Data Science
Eurecom Ref: 6442
Copyright: CEUR
PERMALINK : https://www.eurecom.fr/publication/6442