
Convolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization

Geiger, Juergen; Vipperla, Ravichander; Bozonnet, Simon; Evans, Nicholas; Schuller, Bjoern; Rigoll, Gerhard

INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Volume 3, September 9-13, Portland, Oregon, USA

The effective handling of overlapping speech is at the limits of the current state of the art in speaker diarization. This paper presents our latest work in overlap detection. We report the combination of features derived through convolutive non-negative sparse coding with new energy, spectral and voicing-related features within a conventional HMM system. Overlap detection results are fully integrated into our top-down diarization system through the application of overlap exclusion and overlap labeling. Experiments on a subset of the AMI corpus show that the new system delivers significant reductions in missed speech and speaker error. Through overlap exclusion and labeling, the overall diarization error rate is shown to improve by 6.4% relative.
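To make the reported metric concrete, the following is a minimal sketch of how a diarization error rate (DER) and a relative improvement are typically computed. It assumes the usual definition of DER as missed speech plus false alarm plus speaker error time, divided by the total scored speaker time. The function names and all durations are hypothetical and are not taken from the paper.

```python
def diarization_error_rate(missed, false_alarm, speaker_error, total_speech):
    """DER as the summed error time over the total scored speaker time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + speaker_error) / total_speech


def relative_improvement(baseline, improved):
    """Relative reduction of an error rate, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical durations in seconds; these are NOT results from the paper.
baseline_der = diarization_error_rate(
    missed=60.0, false_alarm=20.0, speaker_error=120.0, total_speech=1000.0
)
improved_der = diarization_error_rate(
    missed=50.0, false_alarm=22.0, speaker_error=115.0, total_speech=1000.0
)
gain = relative_improvement(baseline_der, improved_der)

print(f"baseline DER = {baseline_der:.3f}")
print(f"improved DER = {improved_der:.3f}")
print(f"relative improvement = {gain:.1%}")
```

In this illustration the missed speech and speaker error terms shrink while false alarm grows slightly, which is the kind of trade-off an overlap handling stage can introduce. A relative figure such as the 6.4% in the abstract is the fraction of the baseline DER that was removed, not an absolute difference in percentage points.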


Title: Convolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization
Keywords: speech overlap detection, convolutive non-negative sparse coding, speaker diarization
Type: Conference
Language: English
City: Portland
Country: United States
Date:
Department: Digital Security
Eurecom ref: 3733
Copyright: © ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2012, 13th Annual Conference of the International Speech Communication Association, Volume 3, September 9-13, Portland, Oregon, USA and is available at:
Bibtex:
@inproceedings{EURECOM+3733,
  year = {2012},
  title = {{C}onvolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization},
  author = {{G}eiger, {J}uergen and {V}ipperla, {R}avichander and {B}ozonnet, {S}imon and {E}vans, {N}icholas and {S}chuller, {B}joern and {R}igoll, {G}erhard},
  booktitle = {{INTERSPEECH} 2012, 13th {A}nnual {C}onference of the {I}nternational {S}peech {C}ommunication {A}ssociation, {V}olume3, {S}eptember 9-13, {P}ortland, {O}regon, {USA}},
  address = {{P}ortland, {UNITED} {STATES}},
  month = {09},
  url = {http://www.eurecom.fr/publication/3733}
}