Speech overlap detection using convolutive non-negative sparse coding: New improvements and Insights

Geiger, Jürgen; Vipperla, Ravichander; Evans, Nicholas; Schuller, Björn; Rigoll, Gerhard
EUSIPCO 2012, European Signal Processing Conference, August, 27-31, 2012, Bucharest, Romania

This paper presents recent advances in the application of convolutive non-negative sparse coding (CNSC) to the problem of overlap detection in the context of conference meetings and speaker diarization. CNSC is used to project a mixed speaker signal onto separate speaker bases and hence to detect intervals of competing speech. We present new energy ratio and total energy features, which give significant improvements over our previous work. The system is assessed using a subset of the AMI meeting corpus. We report results which are comparable to the state of the art and which support the potential of a new approach to overlap detection. An analysis of system performance highlights the importance of further work to address weaknesses in detecting particularly short segments of overlapping speech.
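
The projection-and-feature idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: plain (non-convolutive) sparse NMF with fixed bases stands in for CNSC, the function names `estimate_activations` and `overlap_features` are hypothetical, and the energy ratio is taken here as the energy of the second most active speaker divided by that of the most active one. The paper's exact feature definitions may differ.

```python
import numpy as np


def estimate_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-9):
    """Estimate non-negative activations H with the bases W held fixed.

    Uses multiplicative updates that decrease
    ||V - W H||_F^2 + sparsity * sum(H).
    This is a non-convolutive simplification, not CNSC itself.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H


def overlap_features(V, bases, sparsity=0.1, n_iter=200):
    """Return per-frame (energy_ratio, total_energy).

    V     : non-negative magnitude spectrogram, shape (bins, frames)
    bases : list of per-speaker basis matrices, each of shape (bins, k_i);
            at least two speakers are assumed
    """
    W = np.hstack(bases)
    H = estimate_activations(V, W, sparsity=sparsity, n_iter=n_iter)

    energies = []
    start = 0
    for B in bases:
        k = B.shape[1]
        # Reconstruct the part of the signal explained by this speaker.
        recon = B @ H[start:start + k]
        energies.append((recon ** 2).sum(axis=0))
        start += k

    # Sort speaker energies per frame, largest first.
    E = np.sort(np.vstack(energies), axis=0)[::-1]
    total_energy = E.sum(axis=0)
    # Assumed definition: second-largest over largest speaker energy.
    energy_ratio = E[1] / (E[0] + 1e-12)
    return energy_ratio, total_energy
```

Under these assumptions, a frame dominated by one speaker yields an energy ratio near zero, while a frame in which two speakers are active yields a ratio closer to one. Such features could then be passed to a classifier or thresholded to flag candidate overlap frames.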


Type: Conference
City: Bucharest
Date: 2012-08-27
Department: Digital Security
Eurecom Ref: 3729
Copyright: © EURASIP. Personal use of this material is permitted. The definitive version of this paper was published in EUSIPCO 2012, European Signal Processing Conference, August, 27-31, 2012, Bucharest, Romania and is available at :

PERMALINK : https://www.eurecom.fr/publication/3729