Engineering school and research centre in digital sciences

Enhanced low-latency speaker spotting using selective cluster enrichment

Patino, José; Delgado, Hector; Evans, Nicholas

BIOSIG 2018, 17th International Conference of the Biometrics Special Interest Group, 26-29 September 2018, Darmstadt, Germany

Low-latency speaker spotting (LLSS) calls for the rapid detection of known speakers within multi-speaker audio streams. While previous work showed the potential to develop efficient LLSS solutions by combining speaker diarization and speaker detection within an online processing framework, it failed to move significantly beyond the traditional definition of diarization. This paper shows that the latter needs rethinking and that a diarization sub-system tailored to the end application, rather than to the minimisation of the diarization error rate, can improve LLSS performance. The proposed selective cluster enrichment algorithm is used to guide the diarization system to better model segments within a multi-speaker audio stream and hence detect more reliably a given target speaker. The LLSS solution reported in this paper shows that target speakers can be detected with a 16% equal error rate after having been active in multi-speaker audio streams for only 15 seconds.
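The abstract reports performance as an equal error rate (EER), the operating point at which the miss rate and the false-alarm rate coincide. The following is a minimal sketch of how an EER can be approximated from detection scores; it is not the authors' evaluation code. The function name `equal_error_rate` and the example scores are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        miss_rate = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        fa_rate = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss_rate - fa_rate)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (miss_rate + fa_rate) / 2
    return eer


# Made-up scores: one target trial (0.4) falls below one non-target trial (0.6).
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch returns their average at the threshold where they are closest; evaluation toolkits often interpolate instead.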


Title: Enhanced low-latency speaker spotting using selective cluster enrichment
Keywords: low-latency speaker spotting, speaker detection, speaker diarization
Type: Conference
Language: English
City: Darmstadt
Country: GERMANY
Date:
Department: Digital Security
Eurecom ref:5702
Copyright: © 2018 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Bibtex: @inproceedings{EURECOM+5702, doi = {http://dx.doi.org/10.23919/BIOSIG.2018.8553619}, year = {2018}, title = {{E}nhanced low-latency speaker spotting using selective cluster enrichment}, author = {{P}atino, {J}os{\'e} and {D}elgado, {H}ector and {E}vans, {N}icholas}, booktitle = {{BIOSIG} 2018, 17th {I}nternational {C}onference of the {B}iometrics {S}pecial {I}nterest {G}roup, 26-29 {S}eptember 2018, {D}armstadt, {G}ermany}, address = {{D}armstadt, {G}ermany}, month = {09}, url = {http://www.eurecom.fr/publication/5702} }