This paper presents LIA-EURECOM's joint submission to the NIST Rich Transcription 2009 (RT'09) speaker diarization evaluation. We describe a number of modifications to our previous system, which involve beamforming for the multiple distant microphone (MDM) condition and significant enhancements to the speaker segmentation stage of the core speaker diarization system. These modifications lead to improvements in both speech activity detection (MDM only) and overall diarization performance. We present experimental results on a development set of 23 shows and on the RT'07 dataset, which was used for validation. Results on the latter show that our new system achieves a relative improvement in diarization error rate (DER) of 27% on the MDM condition. Similar experiments on the RT'09 dataset show a relative improvement in DER of 35%. Our results for the MDM condition compare reasonably well with those of others even though, other than for beamforming, we did not use any delay features. Results for the single distant microphone (SDM) condition compare especially well with others' work and highlight the merit of our top-down, evolutive hidden Markov model (E-HMM) approach to speaker diarization.
The LIA-EURECOM RT'09 Speaker Diarization System
RT 2009, NIST Rich Transcription Workshop, May 28-29, 2009, Melbourne, USA
© NIST. Personal use of this material is permitted. The definitive version of this paper was published in RT 2009, NIST Rich Transcription Workshop, May 28-29, 2009, Melbourne, USA and is available at:
PERMALINK: https://www.eurecom.fr/publication/2763