Semi-supervised on-line speaker diarization for meeting data with incremental maximum a-posteriori adaptation

Soldi, Giovanni; Todisco, Massimiliano; Delgado, Hector; Beaugeant, Christophe; Evans, Nicholas
ODYSSEY 2016, The Speaker and Language Recognition Workshop, June 21-24, 2016, Bilbao, Spain

Almost all current diarization systems are off-line and ill-suited to the growing need for on-line or real-time diarization. Our previous work reported the first on-line diarization system for the most challenging speaker diarization domain: meeting data captured with a single distant microphone (SDM). Although results were not dissimilar to those reported for on-line diarization in less challenging domains, error rates were high and unlikely to support any practical applications. The first novel contribution in this paper is the investigation of a semi-supervised approach to on-line diarization whereby speaker models are seeded with a modest amount of manually labelled data. In practical applications involving meetings, such data can be obtained readily from brief round-table introductions. The second novel contribution is an incremental MAP adaptation procedure for efficient, on-line speaker modelling. When combined, these two developments provide an on-line diarization system which outperforms a baseline, off-line system by a significant margin. When configured appropriately, error rates may be low enough to support practical applications.


Type:
Conference
City:
Bilbao
Date:
2016-06-21
Department:
Digital Security
Eurecom Ref:
4856
Copyright:
© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in ODYSSEY 2016, The Speaker and Language Recognition Workshop, June 21-24, 2016, Bilbao, Spain and is available at: http://dx.doi.org/10.21437/Odyssey.2016-55

PERMALINK : https://www.eurecom.fr/publication/4856