Graduate school and research center in digital sciences

Phone adaptive training for speaker diarization

Bozonnet, Simon; Vipperla, Ravichander; Evans, Nicholas

INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, September 9-13, Portland, Oregon, USA

The linguistic content of a speech signal is a source of unwanted variation which can degrade speaker diarization performance. This paper presents our latest work to reduce its impact. The new approach, referred to as Phone Adaptive Training (PAT), is analogous to speaker adaptive training used in automatic speech recognition. We report an oracle experiment which shows that PAT has the potential to deliver a 33% relative improvement in the diarization error rate of our baseline system. Practical experiments show significant improvements across two standard, independent evaluation datasets.

Title: Phone adaptive training for speaker diarization
Keywords: Speaker Diarization, Phone Adaptive Training, Speaker Discrimination
Type: Conference
Language: English
City: Portland
Country: United States
Date:
Department: Digital Security
Eurecom ref: 3732
Copyright: © ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, September 9-13, Portland, Oregon, USA, and is available at:
Bibtex:
@inproceedings{EURECOM+3732,
  year = {2012},
  title = {{P}hone adaptive training for speaker diarization},
  author = {{B}ozonnet, {S}imon and {V}ipperla, {R}avichander and {E}vans, {N}icholas},
  booktitle = {{INTERSPEECH} 2012, 13th {A}nnual {C}onference of the {I}nternational {S}peech {C}ommunication {A}ssociation, {S}eptember 9-13, {P}ortland, {O}regon, {USA}},
  address = {{P}ortland, {\'{E}}{TATS}-{UNIS}},
  month = {09},
  url = {http://www.eurecom.fr/publication/3732}
}