Radar Station: Using KG embeddings for semantic table interpretation and entity disambiguation

Liu, Jixiong; Huynh, Viet-Phi; Chabot, Yoan; Troncy, Raphaël
ISWC 2022, 21st International Semantic Web Conference, 23-27 October 2022, Hangzhou, China (Virtual Event) / Part of the Lecture Notes in Computer Science book series (LNCS, Vol. 13489)

Relational tables are widely used to store information about entities and their attributes, and they are the de facto format for training AI algorithms. Numerous Semantic Table Interpretation approaches have been proposed, in particular for the so-called cell-entity annotation task, which aims to disambiguate the values of table cells against reference knowledge graphs (KGs). Among these methods, heuristic-based ones have been shown to reach the best performance, often relying on column types and on inter-column relationships aggregated by voting strategies. However, they often ignore other column-wise semantic similarities and are very sensitive to error propagation (e.g., if the type annotation is incorrect, such systems often propagate the error to the entity annotations in the target column). In this paper, we propose Radar Station, a hybrid system that adds a semantic disambiguation step after a previously identified cell-entity annotation. Radar Station takes the entire column into account as context and uses graph embeddings to capture latent relationships between entities in order to improve their disambiguation. We evaluate Radar Station using several graph embedding models belonging to different families, on Web tables as well as on synthetic datasets. We demonstrate that our approach can lead to an accuracy improvement of 3% compared to heuristic-based systems. Furthermore, we empirically observe that, among the various graph embedding families, the ones relying on fine-tuned translation distance show superior performance compared to other models.
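The idea of using the whole column as context can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 2-D embeddings, the entity identifiers, and the additive combination of a prior score with a mean cosine similarity are all assumptions made for illustration. Each cell has candidate entities with a score from an upstream (heuristic) annotator, and each candidate is re-scored by how close its embedding lies to the top candidates of the other cells in the same column.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank_column(candidates, embeddings):
    """Pick one entity per cell using the rest of the column as context.

    candidates: one list per cell of (entity_id, prior_score) pairs.
    embeddings: dict mapping entity_id to its embedding vector.
    The scoring rule (prior + mean cosine to context) is an assumption.
    """
    chosen = []
    for i, cell in enumerate(candidates):
        if not cell:
            chosen.append(None)  # no candidate to choose from
            continue
        # Context: the highest-prior candidate of every other non-empty cell.
        context = [
            max(other, key=lambda c: c[1])[0]
            for j, other in enumerate(candidates)
            if j != i and other
        ]

        def score(cand):
            entity, prior = cand
            sims = [cosine(embeddings[entity], embeddings[c]) for c in context]
            return prior + (sum(sims) / len(sims) if sims else 0.0)

        chosen.append(max(cell, key=score)[0])
    return chosen


# Hypothetical toy data: a column of cities where the upstream annotator
# slightly prefers a person for the ambiguous cell "Paris".
EMBEDDINGS = {
    "Paris_city": [0.9, 0.1],
    "Paris_Hilton": [0.1, 0.9],
    "Berlin_city": [1.0, 0.0],
    "Rome_city": [0.8, 0.2],
}
CANDIDATES = [
    [("Paris_Hilton", 0.6), ("Paris_city", 0.5)],
    [("Berlin_city", 0.9)],
    [("Rome_city", 0.9)],
]

RESULT = rerank_column(CANDIDATES, EMBEDDINGS)
print(RESULT)
```

In this toy column the city reading of "Paris" wins because its embedding is close to those of the other cells, even though its prior score is lower. In practice the embeddings would come from a trained KG embedding model rather than hand-written vectors.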

DOI: https://doi.org/10.1007/978-3-031-19433-7_29
Type: Conference
City: Hangzhou
Date: 2022-10-23
Department: Data Science
Eurecom Ref: 6983
Copyright: © Springer. Personal use of this material is permitted. The definitive version of this paper was published in ISWC 2022, 21st International Semantic Web Conference, 23-27 October 2022, Hangzhou, China (Virtual Event) / Part of the Lecture Notes in Computer Science book series (LNCS, Vol. 13489) and is available at: https://doi.org/10.1007/978-3-031-19433-7_29

PERMALINK : https://www.eurecom.fr/publication/6983