N. W. D. Evans, J. S. Mason, M. J. Roach
Proc. 4th IASTED International Conference on Signal and Image Processing, pages 157-161, 2002
Abstract: This paper describes the application of morphological filtering to speech spectrograms for noise-robust automatic speech recognition. Speech regions of the spectrogram are identified by the proximity of high-energy regions to neighbouring high-energy regions in the three-dimensional space. Erosion removes noise, and dilation then restores any speech regions that were erroneously removed. Combining the two operations yields a non-linear time-frequency filter. Automatic speech recognition experiments on the AURORA database show that the morphological filtering approach delivers an average relative improvement of 10%. When combined with quantile-based noise estimation and non-linear spectral subtraction, the average relative improvement is also 10%, but with a different performance profile in terms of SNR.
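
The erosion-then-dilation idea in the abstract can be illustrated with a minimal sketch of grey-scale morphological opening on a toy spectrogram. This is not the authors' implementation: the abstract does not give the structuring element, so the flat 3x3 time-frequency window, the truncated-window border handling, the toy energy values, and all function names below are assumptions chosen for illustration.

```python
# Illustrative sketch only: a flat 3x3 structuring element is an assumption,
# not a parameter taken from the paper.

def _neighbourhood_filter(spec, half_f, half_t, reduce_fn):
    """Apply reduce_fn (min or max) over a rectangular window around each cell.

    spec is a list of rows; rows index frequency, columns index time.
    Windows are truncated at the borders of the spectrogram.
    """
    n_f, n_t = len(spec), len(spec[0])
    out = []
    for f in range(n_f):
        row = []
        for t in range(n_t):
            window = [
                spec[ff][tt]
                for ff in range(max(0, f - half_f), min(n_f, f + half_f + 1))
                for tt in range(max(0, t - half_t), min(n_t, t + half_t + 1))
            ]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def erode(spec, half_f=1, half_t=1):
    """Grey-scale erosion: each cell becomes the minimum of its neighbourhood."""
    return _neighbourhood_filter(spec, half_f, half_t, min)


def dilate(spec, half_f=1, half_t=1):
    """Grey-scale dilation: each cell becomes the maximum of its neighbourhood."""
    return _neighbourhood_filter(spec, half_f, half_t, max)


def morphological_open(spec, half_f=1, half_t=1):
    """Opening: erosion followed by dilation with the same window."""
    return dilate(erode(spec, half_f, half_t), half_f, half_t)


# Toy spectrogram (made-up energies): a 3x3 high-energy block standing in for
# speech, plus one isolated high-energy cell standing in for noise.
spec = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 5, 5, 5, 0, 0, 0],
    [0, 5, 5, 5, 0, 9, 0],
    [0, 5, 5, 5, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

eroded = erode(spec)
opened = morphological_open(spec)

for row in opened:
    print(row)
```

In this toy case, erosion wipes out the isolated cell and shrinks the block to its centre, and dilation then grows the block back to its original extent. A region only survives if it has enough high-energy neighbours to contain the window, which is the proximity criterion the abstract describes.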