Nicholas Evans, Corinne Fredouille and Jean-François Bonastre
ICASSP 2009, International Conference on Acoustics, Speech and Signal Processing, April 19-24, 2009, Taipei, Taiwan
Abstract: When multiple microphones are available, estimates of inter-channel delay, which characterise a speaker’s location, can be used as features for speaker diarization. Background noise and reverberation can, however, lead to noisy features and poor performance. To ameliorate these problems, this paper presents a new approach to the discriminant analysis of delay features for speaker diarization. This novel and nonetheless unsupervised approach aims to increase speaker separability in delay-space. We assess the approach on subsets of four standard NIST RT datasets and demonstrate a relative improvement in diarization error rate of 25% on a separate evaluation set using delay features alone.