C. Fredouille, N. W. D. Evans
Lecture Notes in Computer Science, CLEAR 2007 and RT 2007, Multimodal Technologies for Perception of Humans, volume 4625/2008, pages 520-532, 2008
Abstract: This paper presents the LIA submission to the speaker diarization task of the 2007 NIST Rich Transcription (RT'07) evaluation campaign. We report a system optimised for conference meeting recordings and experiments on all three RT'07 subdomains and microphone conditions. Results show that, despite state-of-the-art performance for the single distant microphone (SDM) condition, the system in its current form does not effectively utilise the additional information available in the multiple distant microphone (MDM) condition. With post-evaluation tuning, we achieve a diarization error rate (DER) of 19% on the MDM task with conference meeting data. Some early experimental work highlights both the limitations and the potential of utilising between-channel delay features for diarization.