N. W. D. Evans, J. S. Mason, W. M. Liu, B. G. B. Fauve
Proc. ICASSP, 2006
Abstract: As with many approaches to noise-robust automatic speech recognition (ASR), the benefits of spectral subtraction tend to diminish as noise levels of the order of 0 dB are approached. Whilst the majority of related work focuses on reducing magnitude errors, a number of new approaches addressing the additional, often overlooked sources of error have appeared in the literature in recent years. Relatively lacking in the literature, however, is an empirical assessment that compares the effects of each error when noisy speech is processed by spectral subtraction. Such studies are vital in order to appreciate the potential penalty in performance when sources of error are overlooked. The objective of this paper is to assess, through ASR, the performance penalty associated with each source of error when noisy speech is treated with spectral subtraction. Experimental evidence based on two standard European databases and ASR protocols illustrates that, perhaps contrary to popular belief, for noise levels of the order of 0 dB and below, these often overlooked sources of error can lead to non-negligible degradations in performance. Whilst the idea is not new, the original emphasis here is a thorough assessment that empirically highlights both the fundamental limitations and the potential benefit of including the full complement of errors in the spectral subtraction model.
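The abstract does not enumerate the sources of error, so the following is only an illustrative sketch of one common way to split them for a single frame. It relies on the exact identity |Y|^2 = |S|^2 + |N|^2 + 2 Re(S N*) for additive noise. The function name `error_terms`, the three-way split into noise-estimate, cross-term and phase errors, and the flat noise estimate in the usage note are assumptions made here for illustration, not the authors' method.

```python
import numpy as np


def error_terms(clean, noise, noise_psd_estimate):
    """Split the power-domain spectral subtraction error for one frame.

    Hypothetical helper for illustration; names are not from the paper.
    """
    S = np.fft.rfft(clean)
    N = np.fft.rfft(noise)
    Y = S + N  # the DFT is linear, so the noisy spectrum is the sum

    clean_power = np.abs(S) ** 2
    noise_power = np.abs(N) ** 2
    noisy_power = np.abs(Y) ** 2

    # Exact identity: |Y|^2 = |S|^2 + |N|^2 + 2 Re(S N*)
    cross_term = 2.0 * np.real(S * np.conj(N))

    # Basic power spectral subtraction: subtract the estimate, floor at zero
    subtracted = np.maximum(noisy_power - noise_psd_estimate, 0.0)

    # Error from using a noise estimate instead of the true |N|^2
    magnitude_error = noise_power - noise_psd_estimate

    # Phase of the noisy spectrum relative to the clean one; the noisy
    # phase is typically reused when a signal is resynthesised
    phase_error = np.angle(Y * np.conj(S))

    return {
        "clean_power": clean_power,
        "noise_power": noise_power,
        "noisy_power": noisy_power,
        "cross_term": cross_term,
        "subtracted": subtracted,
        "magnitude_error": magnitude_error,
        "phase_error": phase_error,
    }
```

As a usage sketch, calling `error_terms` with a 256-sample sinusoid, unit-variance white noise and a flat estimate of 256 per bin (the expected unnormalised noise power in that setting) shows that, before flooring, the subtracted spectrum equals the clean power plus the magnitude error plus the cross-term. A model that accounts only for magnitude errors therefore leaves the cross-term and phase contributions uncorrected.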