N. W. D. Evans, J. S. Mason
Proc. Eurospeech, volume 2, pages 893-896, 2001
Abstract: Automatic speech recognition performance tends to degrade in noisy conditions. Spectral subtraction is a simple, popular approach to noise compensation. In conventional spectral subtraction, noise statistics are updated during speech gaps and subtracted from the corrupted signal during speech intervals. Some means of explicit speech/non-speech detection is therefore essential. Recent proposals avoid the problem of speech/non-speech detection by continually updating noise estimates whether speech is present or not. In this paper, we evaluate two such approaches to noise estimation and compare their performance with standard noise estimation in hand-labelled speech gaps. Experimental results are reported with the conventional spectral subtraction framework on a 1500-speaker database. Results confirm that noise estimation approaches which do not rely on explicit speech/non-speech detection compare favourably with conventional noise estimation approaches.
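
The conventional scheme described in the abstract (update the noise estimate in speech gaps, subtract it from the corrupted signal) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `spectral_subtract`, the recursive-averaging factor `alpha`, the spectral floor `floor`, and the choice to subtract in every frame once an estimate exists are all assumptions made for the example. Frames are taken to be magnitude spectra, and the speech/non-speech flags stand in for hand labels.

```python
def spectral_subtract(frames, is_speech, alpha=0.9, floor=0.01):
    """Magnitude spectral subtraction with noise updated only in speech gaps.

    frames    : list of magnitude spectra (each a list of floats), one per frame
    is_speech : list of bools, True where the frame is labelled as speech
    alpha     : smoothing factor for the noise estimate (assumed value)
    floor     : fraction of the input kept as a lower bound (assumed value)
    """
    noise = None  # no noise estimate until the first non-speech frame
    output = []
    for spectrum, speech in zip(frames, is_speech):
        if not speech:
            if noise is None:
                # First speech gap: initialise the estimate from this frame.
                noise = list(spectrum)
            else:
                # Later gaps: recursive average of old estimate and new frame.
                noise = [alpha * n + (1 - alpha) * x
                         for n, x in zip(noise, spectrum)]
        if noise is None:
            # Nothing to subtract yet, so pass the frame through unchanged.
            output.append(list(spectrum))
            continue
        # Subtract the noise estimate, flooring each bin to avoid negatives.
        output.append([max(x - n, floor * x)
                       for x, n in zip(spectrum, noise)])
    return output


# Toy usage: one non-speech frame followed by one speech frame.
frames = [[1.0, 1.0], [3.0, 0.5]]
labels = [False, True]
cleaned = spectral_subtract(frames, labels)
```

The dependence on `is_speech` is the point the abstract raises: this estimator cannot run without explicit speech/non-speech labels, which is the requirement that continually updated noise estimates are meant to remove.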