Luca BRAYDA
PhD
Resume

In November 2003, Luca Brayda received the M.S. degree in computer science engineering from Politecnico di Torino, Italy, and the M.S. degree in computer vision from Université de Nice Sophia-Antipolis, France.

 

He completed an internship at Panasonic Speech Technology Laboratory in Santa Barbara, CA, USA, from April to September 2003, working on signal-based and model-based noise compensation methods for robust automatic speech recognition.

 

He then started a Ph.D. thesis at the Eurécom Institute on robust speech recognition with microphone arrays, working with Professor Christian Wellekens.

 

 

He obtained his PhD (UNSA) on 24 April 2007:

 

Title:

“Multiple Hypothesis Feedback for Robust Speech Recognition with a Microphone Array input”


Abstract:
Recognizing speech in real environments becomes more difficult as the amount of noise increases and as the speaker moves farther from the microphone. Recent studies have shown that speech quality, measured as signal-to-noise ratio (SNR), can be increased using microphone arrays. By exploiting the spatial correlation among multi-channel signals, one can steer the array toward the speaker (beamforming).

This can be done simply by exploiting the inter-channel destructive interference of noise with a delay-and-sum technique, in which inter-sensor delays are estimated and applied to each channel signal. Alternatively, per-channel filters (filter-and-sum) can be implemented: these filters can be fixed or adapted on a per-channel or per-frame basis, depending on the chosen criterion. In this work we address the problem that increasing the SNR does not imply increasing recognition performance to the same extent. Seltzer (2004) proposes applying an adaptive filter-and-sum beamformer based on a Maximum Likelihood criterion (Limabeam) rather than on the SNR. In this method, filters are adapted in an unsupervised way using the clean speech models that best align with the noisy speech features. The recognizer then uses the sum of the filtered signals to generate a final transcription. In this thesis we show that considering the N-best hypotheses in parallel, instead of only the best one, prior to optimization can bring recognition performance close to that of a supervised algorithm: after the parallel optimizations, the N-best list is automatically re-ranked and recognition errors can be recovered. The N-best Limabeam framework was tested in the presence of significant additive noise. Furthermore, the potential of delay-and-sum beamforming, of Limabeam and of the proposed framework was studied in a very reverberant meeting room, where the collected database mimics different talker positions and head orientations: the purpose is to estimate recognition-oriented filters or to exploit additional information related to the environment, such as the room impulse responses.
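The delay-and-sum technique mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the inter-sensor delays are already known, non-negative, and expressed as whole numbers of samples, and the names `delay_and_sum` and `simulate_channel` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Advance each channel by its (assumed known) integer delay, then average.

    channels: list of equal-purpose sample lists, one per microphone.
    delays:   non-negative delay in samples of the source on each channel.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    # Keep only the span where every time-aligned channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        raise ValueError("delays exceed channel length")
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


def simulate_channel(source, delay, noise):
    """Hypothetical test helper: delay the source and add per-sample noise."""
    delayed = [0.0] * delay + list(source)
    return [s + w for s, w in zip(delayed, noise)]


def mean_square_error(a, b):
    """Average squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Small demonstration with synthetic data (all values are illustrative).
rng = random.Random(0)
source = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]
delays = [0, 3, 7, 12]
channels = [
    simulate_channel(
        source, d, [rng.gauss(0.0, 0.5) for _ in range(len(source) + d)]
    )
    for d in delays
]
enhanced = delay_and_sum(channels, delays)
single_mic_error = mean_square_error(channels[0], source)
beamformed_error = mean_square_error(enhanced, source)
```

Once aligned, the speech component adds coherently across channels while independent noise partially cancels, so `beamformed_error` should come out well below `single_mic_error` in this synthetic setup. Real arrays need delay estimation and fractional delays, which this sketch leaves out.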

 

 

 

Publications
Eurecom Reference 2007
2215 Brayda, Luca Giulio
Multiple hypothesis feedback for robust speech recognition with a microphone array input
Thesis
Eurecom Reference 2006
2131 Brayda, Luca;Wellekens, Christian J;Matassoni, Marco;Omologo, Maurizio
Speech recognition in reverberant environments using remote microphones
ISM 2006, 8th IEEE International Symposium on Multimedia, December 11-13, 2006, San Diego, USA
2133 Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
N-Best parallel maximum likelihood beamformers for robust speech recognition
EUSIPCO 2006, European Signal Processing Conference, September 4-8, 2006, Firenze, Italy
2054 Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
Improving robustness of a likelihood-based beamformer in a real environment for automatic speech recognition
SPECOM'2006, 11th International Conference Speech and Computer, June 25-29, 2006, Saint-Petersburg, Russia, pp. 50-53
2132 Brayda, Luca;Wellekens, Christian J;Omologo, Maurizio
Reconnaissance robuste de parole en environnement réel à l'aide d'un réseau de microphones à formation de voie adaptative basée sur un critère des N-best vraisemblance maximale
JEP 2006, Journées d'Études sur la Parole, June 12-16, 2006, Dinard, France
Eurecom Reference 2005
1637 Brayda, Luca;Bertotti, Claudio;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
Modifications on NIST MarkIII array to improve coherence properties among input signals
AES 2005, 118th Audio Engineering Society Convention, May 28-31, 2005, Barcelona, Spain
1612 Brayda, Luca;Bertotti, Claudio;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
On calibration and coherence signal analysis of the CHIL microphone network at IRST
Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, March 17-18, 2005, Piscataway, USA
1611 Bertotti, Claudio;Brayda, Luca;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
The new MarkIII/IRST-Light microphone array
Research report RR-05-130
Eurecom Reference 2004
1610 Bertotti, Claudio;Brayda, Luca;Cristoforetti, Luca;Omologo, Maurizio;Svaizer, Piergiorgio
The MarkIII microphone array: the modified version realized at ITC-irst
Research report RR-04-129
1452 Brayda, Luca Giulio;Rigazio, Luca;Boman, Robert;Junqua, Jean-Claude
Sensitivity analysis of noise robustness methods
ICASSP 2004, 29th IEEE International Conference on Acoustics, Speech, and Signal Processing, May 17-21, 2004, Montreal, Canada