
System output combination for improved speaker diarization

Bozonnet, Simon; Evans, Nicholas W D; Anguera, X; Vinyals, O; Friedland, G; Fredouille, Corinne

INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, September 26-30, 2010, Makuhari, Japan

System combination or fusion is a popular, successful and sometimes straightforward means of improving performance in many fields of statistical pattern classification, including speech and speaker recognition. Whilst there is significant work in the literature which aims to improve speaker diarization performance by combining multiple feature streams, there is little work which aims to combine the outputs of multiple systems. This paper reports our first attempts to combine the outputs of two state-of-the-art speaker diarization systems, namely ICSI's bottom-up and LIA-EURECOM's top-down systems. We show that a cluster matching procedure reliably identifies corresponding speaker clusters in the two system outputs and that, when they are used in a new realignment and resegmentation stage, the combination leads to relative improvements of 13% and 7% DER on independent development and evaluation sets.
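The abstract does not say how the cluster matching is carried out, so the following is only an illustrative sketch of one simple way to find corresponding speaker clusters in two diarization outputs: pair the labels that share the most speaking time. The segment format `(start, end, label)`, the function names `overlap_table` and `match_clusters`, the greedy pairing rule, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def overlap_table(segs_a, segs_b):
    """Total time (seconds) during which a label from system A and a label
    from system B are active simultaneously."""
    table = defaultdict(float)
    for start_a, end_a, label_a in segs_a:
        for start_b, end_b, label_b in segs_b:
            shared = min(end_a, end_b) - max(start_a, start_b)
            if shared > 0:
                table[(label_a, label_b)] += shared
    return table


def match_clusters(segs_a, segs_b):
    """Greedy one-to-one matching (an assumed heuristic): repeatedly pair
    the two still-unmatched labels with the largest shared duration."""
    table = overlap_table(segs_a, segs_b)
    pairs = {}
    used_b = set()
    for (label_a, label_b), _ in sorted(table.items(), key=lambda kv: -kv[1]):
        if label_a not in pairs and label_b not in used_b:
            pairs[label_a] = label_b
            used_b.add(label_b)
    return pairs


# Hypothetical outputs of two systems on the same recording.
system_a = [(0.0, 10.0, "A1"), (10.0, 18.0, "A2"), (18.0, 25.0, "A1")]
system_b = [(0.0, 9.0, "B2"), (9.0, 19.0, "B1"), (19.0, 25.0, "B2")]

matches = match_clusters(system_a, system_b)
print(matches)
```

In this toy case the two systems use different label names and slightly different boundaries, yet the shared-time table is enough to pair the clusters. A matching of this kind would only be a starting point for the realignment and resegmentation stage that the abstract mentions.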


Keywords: speaker diarization, system combination, fusion
Type: Conference
Language: English
City: Makuhari
Country: Japan
Date:
Department: Multimedia Communications
Eurecom ref: 3155
Copyright: © ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, September 26-30, 2010, Makuhari, Japan and is available at :
Bibtex:
@inproceedings{EURECOM+3155,
  year = {2010},
  title = {{S}ystem output combination for improved speaker diarization},
  author = {{B}ozonnet, {S}imon and {E}vans, {N}icholas {W} {D} and {A}nguera, {X} and {V}inyals, {O} and {F}riedland, {G} and {F}redouille, {C}orinne},
  booktitle = {{INTERSPEECH} 2010, 11th {A}nnual {C}onference of the {I}nternational {S}peech {C}ommunication {A}ssociation, {S}eptember 26-30, 2010, {M}akuhari, {J}apan},
  address = {{M}akuhari, {JAPON}},
  month = {09},
  url = {http://www.eurecom.fr/publication/3155}
}