
Handling overlaps in spoken term detection

Wang, Dong; Evans, Nicholas; Troncy, Raphaël; King, Simon

ICASSP 2011, 36th International Conference on Acoustics, Speech and Signal Processing, May 22-27, 2011, Prague, Czech Republic

Spoken term detection (STD) systems usually produce many overlapping detections, which are often handled with pragmatic approaches, e.g. choosing the best detection to represent all the overlaps. In this paper we present a theoretical study based on the concept of an acceptance space. In particular, we present two confidence estimation approaches, based on Bayesian and evidence perspectives respectively. Analysis shows that each approach has its own advantages and shortcomings, and that their combination has the potential to provide improved confidence estimation. Experiments conducted on meeting data confirm our analysis and show considerable performance improvement with the combined approach, in particular for out-of-vocabulary spoken term detection with stochastic pronunciation modeling.


Title: Handling overlaps in spoken term detection
Keywords: Confidence measurement, stochastic pronunciation modeling, spoken term detection, speech recognition
Type: Conference
Language: English
City: Prague
Country: Czech Republic
Date:
Department: Digital Security
Eurecom ref: 3321
Copyright: © 2011 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Bibtex:
@inproceedings{EURECOM+3321,
  doi = {http://dx.doi.org/10.1109/ICASSP.2011.5947643},
  year = {2011},
  title = {{H}andling overlaps in spoken term detection},
  author = {{W}ang, {D}ong and {E}vans, {N}icholas and {T}roncy, {R}apha{\"e}l and {K}ing, {S}imon},
  booktitle = {{ICASSP} 2011, 36th {I}nternational {C}onference on {A}coustics, {S}peech and {S}ignal {P}rocessing, {M}ay 22-27, 2011, {P}rague, {C}zech {R}epublic},
  address = {{P}rague, {TCH}{\`{E}}{QUE}, {R}{\'{E}}{PUBLIQUE}},
  month = {05},
  url = {http://www.eurecom.fr/publication/3321}
}