Multimedia Communications |
Resume |
Luca Brayda received the M.S. degree in computer science engineering from Politecnico di Torino, Italy, and the M.S. degree in computer vision from Université de Nice Sophia-Antipolis, France, in November 2003. From April to September 2003 he was an intern at Panasonic Speech Technology Laboratory in Santa Barbara, CA, USA, working on signal-based and model-based noise compensation methods for robust automatic speech recognition. He started a Ph.D. thesis at the Eurécom Institute on robust speech recognition with microphone arrays under Professor Christian Wellekens, and obtained his Ph.D. (UNSA) on April 24, 2007.
Title: Multiple Hypothesis Feedback for Robust Speech Recognition with a Microphone Array Input
Abstract: Recognizing speech in real environments becomes harder as the amount of noise increases and as the speaker moves farther from the microphone. Recent studies have shown that speech quality, measured as signal-to-noise ratio (SNR), can be increased using microphone arrays. By exploiting the spatial correlation among multi-channel signals, one can steer the array toward the speaker (beamforming).
This can be done simply by exploiting inter-channel destructive interference of noise with a delay-and-sum technique, where inter-sensor delays are estimated and applied to each channel signal. Alternatively, per-channel filters (filter-and-sum) can be implemented: these filters can be fixed or adapted on a per-channel or per-frame basis, depending on the chosen criterion. In this work we address the problem that increasing the SNR does not imply increasing recognition performance to the same extent. Seltzer (2004) proposes an adaptive filter-and-sum beamformer based on a Maximum Likelihood criterion (Limabeam) rather than on the SNR. In this method, filters are adapted in an unsupervised way using the clean speech models that best align with the noisy speech features. The recognizer then uses the sum of the filtered signals to generate a final transcription. In this thesis we show that considering the N-best hypotheses in parallel, instead of only the best one, prior to optimization can bring recognition performance close to that of a supervised algorithm: after the parallel optimizations the N-best list is automatically re-ranked, and recognition errors can be recovered. The N-best Limabeam framework was tested in the presence of significant additive noise. Furthermore, the potential of delay-and-sum beamforming, of Limabeam and of the proposed framework was studied in a very reverberant meeting room, where the collected database mimics different talker positions and head orientations: the purpose is to estimate recognition-oriented filters or to exploit additional information related to the environment, such as the room impulse responses.
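The delay-and-sum technique mentioned in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the function names (`estimate_delay`, `delay_and_sum`, `simulate`, `mse`) are hypothetical, delays are restricted to whole samples, and cross-correlation against the first channel is assumed as the delay estimator, since the abstract does not say how the delays are estimated. The simulated signals are synthetic white noise, not speech.

```python
import random


def estimate_delay(ref, sig, max_lag):
    """Integer lag (in samples) maximizing the cross-correlation of sig against ref."""
    best_lag, best_score = 0, float("-inf")
    n = len(ref)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_and_sum(channels, max_lag=8):
    """Align every channel to channel 0, then average the aligned channels."""
    ref = channels[0]
    n = len(ref)
    delays = [estimate_delay(ref, ch, max_lag) for ch in channels]
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # undo the estimated delay of this channel
            if 0 <= j < n:
                out[i] += ch[j]
    return [v / len(channels) for v in out], delays


def simulate(true_delays, n=400, noise_std=0.5, seed=0):
    """Synthetic array: one white source, delayed per channel, plus independent noise."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]
    channels = []
    for d in true_delays:
        ch = []
        for i in range(n):
            clean = source[i - d] if 0 <= i - d < n else 0.0
            ch.append(clean + rng.gauss(0.0, noise_std))
        channels.append(ch)
    return source, channels


def mse(a, b, margin):
    """Mean squared error between a and b, ignoring `margin` samples at each edge."""
    idx = range(margin, len(a) - margin)
    return sum((a[i] - b[i]) ** 2 for i in idx) / len(idx)


source, channels = simulate([0, 3, 5, 2])
output, delays = delay_and_sum(channels)
print("estimated delays:", delays)
print("single-channel error:", mse(channels[0], source, 8))
print("beamformed error:", mse(output, source, 8))
```

Because the noise in this toy setup is independent across channels, averaging the aligned channels leaves the source untouched while the noise partially cancels, so the beamformed error comes out lower than the single-channel error. A filter-and-sum beamformer such as Limabeam would replace the pure per-channel delay with a per-channel filter.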
Publications |
| Eurecom Reference | 2007 |
| 2215 |
Brayda, Luca Giulio
Multiple hypothesis feedback for robust speech recognition with a microphone array input
Thesis
|
| Eurecom Reference | 2006 |
| 2131 |
Brayda, Luca;Wellekens, Christian J;Matassoni, Marco;Omologo, Maurizio
Speech recognition in reverberant environments using remote microphones
ISM 2006, 8th IEEE International Symposium on Multimedia, December 11-13, 2006, San Diego, USA
|
| 2133 |
Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
N-Best parallel maximum likelihood beamformers for robust speech recognition
EUSIPCO 2006, European Signal Processing Conference, September 4-8, 2006, Firenze, Italy
|
| 2054 |
Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
Improving robustness of a likelihood-based beamformer in a real environment for automatic speech recognition
SPECOM'2006, 11th International Conference Speech and Computer, June 25-29, 2006, Saint-Petersburg, Russia, pp. 50-53
|
| 2132 |
Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
Reconnaissance robuste de parole en environnement réel à l'aide d'un réseau de microphones à formation de voie adaptative basée sur un critère des N-best vraisemblance maximale
JEP 2006, Journées d'Études sur la Parole, June 12-16, 2006, Dinard, France
|
| Eurecom Reference | 2005 |
| 1637 |
Brayda, Luca;Bertotti, Claudio;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
Modifications on NIST MarkIII array to improve coherence properties among input signals
AES 2005, 118th Audio Engineering Society Convention, May 28-31, 2005, Barcelona, Spain
|
| 1612 |
Brayda, Luca;Bertotti, Claudio;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
On calibration and coherence signal analysis of the CHIL microphone network at IRST
Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, March 17-18, 2005, Piscataway, USA
|
| 1611 |
Bertotti, Claudio;Brayda, Luca;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
The new MarkIII/IRST-Light microphone array
Research report RR-05-130
|
| Eurecom Reference | 2004 |
| 1610 |
Bertotti, Claudio;Brayda, Luca;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
The MarkIII microphone array: the modified version realized at ITC-irst
Research report RR-04-129
|
| 1452 |
Brayda, Luca Giulio;Rigazio, Luca;Boman, Robert;Junqua, Jean-Claude
Sensitivity analysis of noise robustness methods
ICASSP 2004, 29th IEEE International Conference on Acoustics, Speech, and Signal Processing, May 17-21, 2004, Montreal, Canada