
Analysis of named entity recognition and linking for tweets

Derczynski, Leon; Maynard, Diana; Rizzo, Giuseppe; van Erp, Marieke; Aswani, Niraj; Troncy, Raphaël; Bontcheva, Kalina

Information Processing and Management, Volume 51, N°2, March 2015, Elsevier

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.
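The staged pipeline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation nor any of the systems they evaluate: every function name, the stopword list, the capitalisation heuristic, and the two-entry gazetteer below are hypothetical stand-ins chosen only to show how each stage consumes the output of the previous one, which is also why an early mistake can propagate to later stages.

```python
import re
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]

# Keeps URLs, @mentions and #hashtags as single tokens (toy tokeniser rule).
TOKEN_RE = re.compile(r"https?://\S+|[@#]\w+|\w+(?:'\w+)?|[^\w\s]")

# Hypothetical stand-ins for real models and a real knowledge base lookup.
ENGLISH_STOPWORDS = {"the", "a", "is", "in", "to", "and", "of"}
GAZETTEER = {
    "London": "http://dbpedia.org/resource/London",
    "Barack Obama": "http://dbpedia.org/resource/Barack_Obama",
}


def identify_language(doc: Doc) -> Doc:
    # Toy heuristic: call the text English if it contains a common stopword.
    words = set(re.findall(r"[a-z]+", doc["text"].lower()))
    doc["lang"] = "en" if words & ENGLISH_STOPWORDS else "unknown"
    return doc


def tokenise(doc: Doc) -> Doc:
    doc["tokens"] = TOKEN_RE.findall(doc["text"])
    return doc


def tag_pos(doc: Doc) -> Doc:
    # Toy tagger: capitalised tokens are treated as proper nouns.
    doc["pos"] = ["PROPN" if t[:1].isupper() else "OTHER" for t in doc["tokens"]]
    return doc


def recognise_entities(doc: Doc) -> Doc:
    # Group runs of consecutive proper nouns into candidate mentions.
    mentions: List[str] = []
    current: List[str] = []
    for token, tag in zip(doc["tokens"], doc["pos"]):
        if tag == "PROPN":
            current.append(token)
        elif current:
            mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    doc["mentions"] = mentions
    return doc


def link_entities(doc: Doc) -> Doc:
    # Map each mention to a DBpedia URI, or None when the gazetteer has no entry.
    doc["links"] = {m: GAZETTEER.get(m) for m in doc["mentions"]}
    return doc


PIPELINE: List[Callable[[Doc], Doc]] = [
    identify_language,
    tokenise,
    tag_pos,
    recognise_entities,
    link_entities,
]


def run_pipeline(text: str) -> Doc:
    doc: Doc = {"text": text}
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    result = run_pipeline("omg Barack Obama is in London today #wow http://t.co/x")
    print(result["mentions"], result["links"])
```

Because the toy tagger relies on capitalisation, an all-lowercase tweet would yield no mentions at all, and nothing downstream could recover them. That is only an illustration of how noise at one stage limits the stages after it, not a finding taken from the paper.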


Title: Analysis of named entity recognition and linking for tweets
Keywords: Information extraction; Named entity recognition; Entity disambiguation; Microblogs; Twitter
Type: Journal
Language: English
Department: Data Science
Eurecom ref: 4250
Copyright: © Elsevier. Personal use of this material is permitted. The definitive version of this paper was published in Information Processing and Management, Volume 51, N°2, March 2015, Elsevier and is available at: http://dx.doi.org/10.1016/j.ipm.2014.10.006
Bibtex:
@article{EURECOM+4250,
  doi = {http://dx.doi.org/10.1016/j.ipm.2014.10.006},
  year = {2015},
  month = {03},
  title = {{A}nalysis of named entity recognition and linking for tweets},
  author = {{D}erczynski, {L}eon and {M}aynard, {D}iana and {R}izzo, {G}iuseppe and van {E}rp, {M}arieke and {A}swani, {N}iraj and {T}roncy, {R}apha{\"e}l and {B}ontcheva, {K}alina},
  journal = {{I}nformation {P}rocessing and {M}anagement, {V}olume 51, {N}°2, {M}arch 2015, {E}lsevier},
  url = {http://www.eurecom.fr/publication/4250}
}