Graduate school and research center in digital science

Speech overlap detection using convolutive non-negative sparse coding

Vipperla, Ravichander; Wang, Dong; Bozonnet, Simon; Evans, Nicholas

Research Report RR-11-257

Overlapping speech is known to degrade speaker diarization performance, with impacts on speech activity detection, speaker clustering and segmentation (speaker error). While previous related work has made important advances, the problem remains largely unsolved. This paper reports early work investigating the application of non-negative matrix factorisation (NMF) to the overlap problem. NMF aims to decompose a composite signal into its underlying contributory parts and is thus naturally suited to detecting overlap and attributing it to the contributing speakers. With additional sparsity constraints, the algorithm is shown to be effective in identifying overlapping speech and gives a relative improvement of 11% in equal error rate over a baseline approach based on conventional Gaussian mixture models. Experiments with source attribution show a relative improvement on the order of 40%.
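
To make the idea of decomposing a composite signal into parts concrete, the following is a minimal sketch of NMF with an L1 sparsity penalty on the activations, using standard multiplicative updates for the Euclidean cost. It is an illustration only and not the method of the report: it is plain (non-convolutive) NMF, and the function name sparse_nmf, the parameter values and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, seed=0):
    """Approximate non-negative V (features x frames) as W @ H,
    with an L1 penalty of weight `sparsity` on the activations H.
    Hypothetical helper for illustration, not the report's algorithm."""
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, rank)) + 1e-3
    H = rng.random((rank, n_frames)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # Activation update: the sparsity term in the denominator
        # shrinks activations toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Basis update for the Euclidean cost.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Rescale to unit-norm bases, moving the scale into H so that
        # the product W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T
    return W, H

# Synthetic rank-2 data standing in for a magnitude spectrogram (assumption).
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 50))
W, H = sparse_nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, rel_err)
```

If the columns of W were grouped by speaker, frames in which more than one group has non-negligible activation in H would be natural candidates for overlap. That grouping and any decision threshold are assumptions of this sketch, not details taken from the report.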


Title: Speech overlap detection using convolutive non-negative sparse coding
Type: Report
Language: English
City:
Date:
Department: Digital Security
Eurecom ref: 3423
Copyright: © EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Research Report RR-11-257 and is available at:
Bibtex: @techreport{EURECOM+3423, year = {2011}, title = {{S}peech overlap detection using convolutive non-negative sparse coding}, author = {{V}ipperla, {R}avichander and {W}ang, {D}ong and {B}ozonnet, {S}imon and {E}vans, {N}icholas}, number = {EURECOM+3423}, month = {06}, institution = {Eurecom}, url = {http://www.eurecom.fr/publication/3423} }
See also: