Microphone-equipped devices can use acoustic context information to adapt their behaviour or configuration to a particular scenario. Recognising such scenarios from the acoustic context is the goal of acoustic scene classification (ASC). The choice of audio sensors over alternatives (e.g. motion or light sensors) is a natural one, since almost all mobile and smart devices are equipped with at least one microphone. Almost all previous solutions to ASC rely on feature extraction approaches designed specifically for speech and music genre recognition and are thus not necessarily optimal for ASC. Further limitations of existing solutions relate to the requirements for real-time, low-footprint implementations, which must be met if ASC algorithms are to be developed for low-power, always-listening devices. The work reported in this thesis aims to address these limitations and hence to reduce the gap between academic and industrial research in terms of methods, protocols and metrics. Accordingly, this thesis presents the ASC problem from a dual perspective: fundamental research, which contributes to standard protocols and methods, and applied research, which adapts current methods to 'real-world' applications.
The main contributions of the work include: (i) the design of ASC-tailored features which exploit spectro-temporal patterns from spectrograms using local binary pattern analysis; (ii) techniques for the automatic extraction of the most discriminative spectro-temporal patterns through the application of convolutional neural networks; (iii) the collection of a large database of realistic, low-quality audio recordings to support work in ASC; (iv) the implementation of an always-listening, low-complexity ASC system; and (v) the first investigation of ASC in an open-set scenario, together with a new classifier tailored to open-set classification and new protocols and metrics for the assessment of open-set ASC. The work presented in this thesis demonstrates that greater synergy between fundamental and applied research must become the standard pathway for future work if practical, usable ASC techniques are to be created.
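As an illustration of contribution (i), the following is a minimal sketch of how local binary pattern (LBP) analysis can turn a spectrogram into a fixed-length feature vector. It assumes the basic 8-neighbour LBP operator and a plain 256-bin histogram; the exact LBP variant, spectrogram parameters and pooling used in the thesis are not stated in this abstract, and the function names `lbp_codes` and `lbp_histogram` are hypothetical.

```python
import numpy as np


def lbp_codes(spec):
    """Return basic 8-neighbour LBP codes for the interior cells of a 2-D array.

    Assumption: `spec` is a (frequency x time) magnitude or log-magnitude
    spectrogram; any 2-D array with at least 3 rows and 3 columns works.
    """
    spec = np.asarray(spec, dtype=float)
    rows, cols = spec.shape
    center = spec[1:-1, 1:-1]
    # Neighbour offsets (row, col), clockwise starting from the top-left cell.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dr, dc) in enumerate(offsets):
        # Shifted view of the spectrogram aligned with the interior cells.
        neighbour = spec[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # Set this bit wherever the neighbour is at least as large as the centre.
        codes += (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(spec):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    codes = lbp_codes(spec)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    # Stand-in data only: random values in place of a real spectrogram.
    rng = np.random.default_rng(0)
    fake_spectrogram = rng.random((64, 100))
    features = lbp_histogram(fake_spectrogram)
    print(features.shape)
```

The resulting histogram could then be passed to any standard classifier. Treating the spectrogram as an image in this way is what lets a texture descriptor such as LBP capture local spectro-temporal patterns.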
© TELECOM ParisTech. Personal use of this material is permitted. The definitive version of this paper was published in Thesis and is available at :