W. M. Liu, J. S. Mason, N. W. D. Evans, K. A. Jellyman
Proc. ICSLP, 2006
Abstract: This paper investigates the potential applicability of automatic speech recognition (ASR) and six well-reported objective quality measures to the task of ranking the intelligibility of speech degraded by different real-life background noises. In a recent investigation, ASR was reported to correlate highly with human subjective assessment when tested with various system degradations. This paper extends that investigation in two directions. First, the usefulness of the measures in the context of different real-life noises is considered. Second, the direct correspondence between statistics computed by an ASR system and human-perceived intelligibility is assessed. Subjective listening tests are carried out to provide ground truth. Results show that ASR and WSS (weighted spectral slope) are the only two of the seven measures considered that correlate well with human opinion. Particularly notable is the performance of ASR, with correlations ranging from 0.77 to 0.90.
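The abstract does not say which correlation statistic was used, so the following is only a minimal sketch of how an objective measure might be compared with listening-test scores. It assumes one score per noise condition from each source and computes both the Pearson coefficient and the Spearman rank coefficient. The function names and the example values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical illustration only: one value per noise condition.
# These numbers are invented and are NOT results from the paper.
asr_word_accuracy = [0.92, 0.81, 0.64, 0.47, 0.30]
listener_intelligibility = [0.95, 0.85, 0.70, 0.40, 0.35]

if __name__ == "__main__":
    print("Pearson :", pearson(asr_word_accuracy, listener_intelligibility))
    print("Spearman:", spearman(asr_word_accuracy, listener_intelligibility))
```

The rank-based coefficient is included because the stated task is ranking intelligibility: it depends only on the ordering of the conditions, whereas the Pearson coefficient also reflects how linear the relationship is.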