NERD: evaluating named entity recognition tools in the web of data

Rizzo, Giuseppe; Troncy, Raphaël
ISWC 2011, Workshop on Web Scale Knowledge Extraction (WEKEX'11), October 23-27, 2011, Bonn, Germany

The Web of data promotes the idea that more and more data are interconnected. A step towards this goal is to bring more structured annotations to existing documents using common vocabularies or ontologies. Semi-structured texts such as scientific, medical or news articles, as well as forum and archived mailing list threads or (micro-)blog posts, can hence be semantically annotated. Named Entity (NE) extractors play a key role in extracting structured information by identifying features, also called entities, and by linking them to other web resources by means of typed inferences. In this article, we propose a thorough evaluation of five popular Linked Data entity extractors which expose APIs: AlchemyAPI, DBPedia Spotlight, Extractiv, OpenCalais and Zemanta. We present NERD, an evaluation framework we have developed, and the results of a controlled evaluation in which human evaluators assigned a Boolean value to each of three criteria: entity detection, entity type and entity disambiguation.


Type:
Conference
City:
Bonn
Date:
2011-10-23
Department:
Data Science
Eurecom Ref:
3517
Copyright:
© Springer. Personal use of this material is permitted. The definitive version of this paper was published in ISWC 2011, Workshop on Web Scale Knowledge Extraction (WEKEX'11), October 23-27, 2011, Bonn, Germany and is available at:

PERMALINK: https://www.eurecom.fr/publication/3517