Some contributions to joint optimal filtering and parameter estimation with application to monaural speech separation

Bensaid, Siouar

Thesis

Noise and interference suppression is a fundamental problem that the signal processing community strives to resolve. In audio processing in particular, having "clean" sounds is crucial in many applications such as speech recognition, speech decoding and music transcription. Unfortunately, in real life, a noise- and interference-free environment does not exist.

In the first part of this thesis, we propose two main mono-microphone speech separation algorithms. In the first algorithm, we exploit the joint autoregressive model, which captures the short and long (periodic) correlations of Gaussian speech signals, to formulate a state-space model with unknown parameters. The EM-Kalman algorithm is then used to jointly estimate the sources (contained in the state vector) and the parameters of the model. In the second algorithm, we use the same speech model, but this time in the frequency domain (quasi-periodic Gaussian sources with an AR spectral envelope). The observation data is sliced using a well-designed window. Parameters are estimated separately from the sources by optimizing the Gaussian ML criterion expressed using the sample and parameterized covariance matrices. Classical frequency-domain asymptotic methods replace linear convolution by circular convolution, leading to approximation errors. We show how the introduction of windows can lead to slightly more complex frequency-domain techniques, replacing diagonal covariance matrices by banded covariance matrices, but with controlled approximation error. The sources are then estimated using Wiener filtering.
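
As a rough illustration of the final Wiener filtering step, the sketch below separates two Gaussian sources from their sum when both covariance matrices are known. It is a minimal time-domain example under assumed conditions, not the windowed frequency-domain method of the thesis: the sources are plain AR(1) processes, and the names `ar1_covariance`, `wiener_separate` and `demo`, as well as the coefficients 0.95 and -0.90, are hypothetical choices made for this example.

```python
import numpy as np


def ar1_covariance(a, n, innovation_var=1.0):
    """Covariance of a stationary AR(1) process x[t] = a * x[t-1] + e[t]."""
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return innovation_var / (1.0 - a**2) * a**lags


def wiener_separate(y, r1, r2):
    """LMMSE estimates of s1 and s2 from y = s1 + s2 with known covariances."""
    w = np.linalg.solve(r1 + r2, y)  # (R1 + R2)^{-1} y
    return r1 @ w, r2 @ w


def demo(n=256, seed=0):
    """Draw two AR(1) sources (assumed toy model), mix them, separate them."""
    rng = np.random.default_rng(seed)
    r1 = ar1_covariance(0.95, n)   # slowly varying (low-pass) source
    r2 = ar1_covariance(-0.90, n)  # rapidly alternating (high-pass) source
    s1 = np.linalg.cholesky(r1) @ rng.standard_normal(n)
    s2 = np.linalg.cholesky(r2) @ rng.standard_normal(n)
    y = s1 + s2
    s1_hat, s2_hat = wiener_separate(y, r1, r2)
    return y, s1, s1_hat, s2_hat
```

Because the two estimates are `R1 (R1 + R2)^{-1} y` and `R2 (R1 + R2)^{-1} y`, they always sum back to the observation. In the thesis setting the covariances are not known in advance but are parameterized and estimated from the data first.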

In the second part of the thesis, we consider the problem of linear MMSE (LMMSE) estimation (such as Wiener and Kalman filtering) in the presence of a number of unknown parameters in the second-order statistics that also need to be estimated. This well-known joint filtering and parameter estimation problem has numerous applications. It is a hybrid estimation problem in which the signal estimated by linear filtering is random and the unknown parameters are deterministic. As the signal is random, it can also be eliminated (marginalized), allowing parameter estimation from the marginal distribution of the data. An intriguing question is then the relative performance of joint vs. marginalized parameter estimation. In this part, we consider jointly Gaussian signal and data, and we first provide contributions to Cramér-Rao bounds (CRBs). We characterize the difference between the hybrid information matrix (HIM) and the classical marginalized Fisher information matrix (FIM) on the one hand, and between the FIM (whose CRB is asymptotically attained by maximum likelihood (ML)) and the popular modified FIM (MFIM), the inverse of the modified CRB, which is a loose bound, on the other hand. We then investigate three iterative (alternating optimization) joint estimation approaches: alternating MAP/ML (AMAPML), i.e. maximum a posteriori (MAP) for the signal and ML for the parameters, which in spite of a better HIM suffers from inconsistency (bias) of the parameter estimates; EM, which converges to (marginalized) ML (but with the AMAPML signal estimate); and variational Bayes (VB), which yields an improved signal estimate with the parameter estimate asymptotically becoming ML.
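
The contrast between AMAPML and EM can be seen on a deliberately simple toy problem, sketched below. The model is an assumption made for this example only and does not come from the thesis: i.i.d. scalar observations y = x + v, with x ~ N(0, signal_var) where signal_var is the unknown deterministic parameter, and v ~ N(0, noise_var) with noise_var known. All function names are hypothetical.

```python
import numpy as np


def simulate(n, signal_var, noise_var, seed=0):
    """Draw n i.i.d. observations y = x + v (assumed toy model)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, np.sqrt(signal_var), n)
    v = rng.normal(0.0, np.sqrt(noise_var), n)
    return x + v


def marginal_ml(y, noise_var):
    """ML estimate of the signal variance from the marginal y ~ N(0, s + noise_var)."""
    return max(np.mean(y**2) - noise_var, 0.0)


def amapml(y, noise_var, n_iter=200):
    """Alternate a MAP signal estimate with ML of the variance given that estimate."""
    s = np.mean(y**2)  # initial guess
    for _ in range(n_iter):
        x_hat = s / (s + noise_var) * y  # MAP (= LMMSE) estimate of x
        s = np.mean(x_hat**2)            # treats x_hat as if it were the true x
    return s


def em(y, noise_var, n_iter=200):
    """EM: same signal estimate, but the M-step adds the posterior variance."""
    s = np.mean(y**2)  # initial guess
    for _ in range(n_iter):
        gain = s / (s + noise_var)
        x_hat = gain * y
        post_var = gain * noise_var      # posterior variance of x given y
        s = np.mean(x_hat**2) + post_var
    return s
```

In this toy model the only difference between the two updates is the posterior variance term. Dropping it makes the AMAPML iteration settle below the marginal ML value, while the EM fixed point is mean(y^2) - noise_var, which is the marginal ML estimate.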

Type: Thesis

Date: 2014-06-06

Department: Systèmes de Communication

Eurecom Ref: 4308