Convolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization

Geiger, Juergen; Vipperla, Ravichander; Bozonnet, Simon; Evans, Nicholas; Schuller, Bjoern; Rigoll, Gerhard
INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Volume 3, September 9-13, Portland, Oregon, USA

The effective handling of overlapping speech is at the limits of the current state of the art in speaker diarization. This paper presents our latest work in overlap detection. We report the combination of features derived through convolutive non-negative sparse coding and new energy, spectral and voicing-related features within a conventional HMM system. Overlap detection results are fully integrated into our top-down diarization system through the application of overlap exclusion and overlap labeling. Experiments on a subset of the AMI corpus show that the new system delivers significant reductions in missed speech and speaker error. Through overlap exclusion and labeling, the overall diarization error rate is shown to improve by 6.4% relative.

Type:
Conference
City:
Portland
Date:
2012-09-09
Department:
Digital Security
Eurecom Ref:
3733
Copyright:
© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Volume 3, September 9-13, Portland, Oregon, USA, and is available at: http://dx.doi.org/10.21437/Interspeech.2012-575

PERMALINK: https://www.eurecom.fr/publication/3733