Engineering school and research center in digital science

Benchmarking the extraction and disambiguation of named entities on the semantic web

Rizzo, Giuseppe; van Erp, Marieke; Troncy, Raphaël

LREC 2014, 9th International Conference on Language Resources and Evaluation, May 26-31, 2014, Reykjavik, Iceland

Named entity recognition and disambiguation are of primary importance for extracting information and for populating knowledge bases. Detecting and classifying named entities has traditionally been taken on by the natural language processing community, whilst linking entities to external resources, such as those in DBpedia, has been tackled by the Semantic Web community. Because these tasks are treated in different communities, there is as yet no overview of how well they perform when combined. We present an approach that combines the state of the art in named entity recognition from the natural language processing domain with named entity linking from the Semantic Web community. We report on experiments and results to gain more insight into the strengths and limitations of current approaches to these tasks. Our approach relies on the numerous web extractors supported by the NERD framework, which we combine with a machine learning algorithm to optimize the recognition and linking of named entities. We test our approach on four standard data sets composed of two diverse text types, namely newswire and microposts.


Title: Benchmarking the extraction and disambiguation of named entities on the semantic web
Keywords: Named Entity Recognition, Named Entity Linking, Machine Learning, Newswire, Microposts
Type: Conference
Language: English
City: Reykjavik
Country: ICELAND
Date:
Department: Data Science
Eurecom ref: 4249
Copyright: ELRA
Bibtex:
@inproceedings{EURECOM+4249,
  year = {2014},
  title = {{B}enchmarking the extraction and disambiguation of named entities on the semantic web},
  author = {{R}izzo, {G}iuseppe and van {E}rp, {M}arieke and {T}roncy, {R}apha{\"e}l},
  booktitle = {{LREC} 2014, 9th {I}nternational {C}onference on {L}anguage {R}esources and {E}valuation, {M}ay 26-31, 2014, {R}eykjavik, {I}celand},
  address = {{R}eykjavik, {ISLANDE}},
  month = {05},
  url = {http://www.eurecom.fr/publication/4249}
}