Perceptually motivated quasi-periodic signal selection for polyphonic music transcription

Triki, Mahdi;Slock, Dirk T M
ICASSP 2009, International conference on Acoustics, Speech and Signal Processing. April 19-24, 2009, Taipei, Taiwan

A multiple fundamental frequency estimator is a key building block in music transcription and indexing operations. However, systems trying to perform this task tend to be very complex [1]. Indeed, music transcription requires an analysis accounting for both physical and psycho-acoustical matters. In this work, we propose a physically-motivated audio signal analysis followed by an auditory-based selection. The audio signal model allows for a better time/frequency resolution tradeoff, while the auditory distance discards the redundant/non-relevant information. No prior information on the musical instrument, musical genre, and/or maximum polyphony are needed. Simulations show that the proposed technique achieves good transcription results for a variety of string and wind instruments. The proposed scheme is also shown to be robust in the presence of noise, percussive sounds and in unbalanced Signal-to-Interference Ratio (SIR) situations. 

Systèmes de Communication
Eurecom Ref:
© 2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.