The Speed submission to DIHARD II: Contributions and lessons learned

Sahidullah, Md; Patino, Jose; Cornell, Samuele; Yin, Ruiqing; Sivasankaran, Sunit; Bredin, Hervé; Korshunov, Pavel; Brutti, Alessio; Serizel, Romain; Vincent, Emmanuel; Evans, Nicholas; Marcel, Sébastien; Squartini, Stefano; Barras, Claude

Idiap-RR-14-2019, Idiap Research Report, November 2019

This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team. Besides describing the system, which considerably outperformed the challenge baselines, we also focus on the lessons learned from numerous approaches that we tried for single and multi-channel systems. We present several components of our diarization system, including categorization of domains, speech enhancement, speech activity detection, speaker embeddings, clustering methods, resegmentation, and system fusion. We analyze and discuss the effect of each such component on the overall diarization performance within the realistic settings of the challenge.

Title: The Speed submission to DIHARD II: Contributions and lessons learned
Keywords: Diarization, DIHARD challenge, evaluation, single-channel and multi-channel speech
Department: Digital Security
Eurecom ref:6106
Copyright: Idiap
Bibtex: @techreport{EURECOM+6106, year = {2019}, title = {{T}he {S}peed submission to {DIHARD} {II}: {C}ontributions and lessons learned}, author = {{S}ahidullah, {M}d and {P}atino, {J}ose and {C}ornell, {S}amuele and {Y}in, {R}uiqing and {S}ivasankaran, {S}unit and {B}redin, {H}erv{\'e} and {K}orshunov, {P}avel and {B}rutti, {A}lessio and {S}erizel, {R}omain and {V}incent, {E}mmanuel and {E}vans, {N}icholas and {M}arcel, {S}{\'e}bastien and {S}quartini, {S}tefano and {B}arras, {C}laude}, number = {EURECOM+6106}, month = {11}, institution = {Eurecom}, url = {} }