Towards a new image-based spectrogram segmentation speech coder optimised for intelligibility

Jellyman, Keith A; Evans, Nicholas; Liu, WM; Mason, J S D

MMM 2009, 15th international Multimedia Modeling Conference, January 7-9, 2009, Sophia-Antipolis, France / Also published in Springer "LNCS" 5371/2009

Speech intelligibility is the very essence of communications. When high noise degrades a speech signal to the threshold of intelligibility, for example in mobile and military applications, any further degradation introduced by a speech coder could prove critical. This paper investigates concepts towards a new speech coder that draws upon the field of image processing in a new multimedia approach. The coder is based on a spectrogram segmentation image processing procedure. The design criterion is minimal intelligibility loss in high noise, as opposed to the conventional quality criterion, subject to a reasonable bit rate. First-phase intelligibility listening test results, assessing the coder's potential alongside six standard coders, are reported. Experimental results show the somewhat surprising robustness of the LD-CELP coder, and the potential of the new coder, with particularly good results in car noise conditions below -4.0 dB.

Type: Conference
City: Sophia-Antipolis
Date: 2009-01-07
Department: Digital Security
Eurecom Ref: 2647

© Springer. Personal use of this material is permitted. The definitive version of this paper was published in MMM 2009, 15th international Multimedia Modeling Conference, January 7-9, 2009, Sophia-Antipolis, France / Also published in Springer "LNCS" 5371/2009 and is available at: http://dx.doi.org/10.1007/978-3-540-92892-8