Online pattern learning for non-negative convolutive sparse coding accepted for publication

Wang, Dong; Vipperla, Ravichander; Evans, Nicholas
INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, August 28-31, Florence, Italy

The unsupervised learning of spectro-temporal speech patterns is relevant in a broad range of tasks. Convolutive non-negative matrix factorization (CNMF) and its sparse version, convolutive non-negative sparse coding (CNSC), are powerful, related tools. A particular difficulty of CNMF/CNSC, however, is their high demand on computing power and memory, which can prohibit their application to large-scale tasks. In this paper, we propose an online algorithm for CNMF and CNSC, which processes input data piece by piece and updates the learned patterns after processing each piece, using accumulated sufficient statistics. The online CNSC algorithm markedly increases the convergence speed of CNMF/CNSC pattern learning, thereby enabling its application to large-scale tasks.
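To illustrate the piece-by-piece idea described in the abstract, the following is a minimal sketch of online pattern learning with accumulated sufficient statistics. It is not the authors' algorithm: it assumes plain (non-convolutive) NMF with a Euclidean cost and no sparsity penalty, and the names `encode_chunk`, `online_nmf`, `A`, and `B` are hypothetical, chosen only for this example.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero in the multiplicative updates


def encode_chunk(V, W, n_iter=50, rng=None):
    """Estimate non-negative activations H for one chunk, with patterns W fixed."""
    rng = np.random.default_rng() if rng is None else rng
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Standard multiplicative update for the Euclidean NMF cost.
        H *= WtV / (WtW @ H + EPS)
    return H


def online_nmf(chunks, n_features, n_patterns, n_iter_h=50, n_iter_w=10, seed=0):
    """Learn patterns W from a stream of non-negative chunks (assumed setup).

    Only the statistics A = sum(V_t H_t^T) and B = sum(H_t H_t^T) are kept,
    so memory use does not grow with the number of chunks processed.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((n_features, n_patterns)) + EPS
    A = np.zeros((n_features, n_patterns))
    B = np.zeros((n_patterns, n_patterns))
    for V in chunks:
        H = encode_chunk(V, W, n_iter_h, rng)
        # Accumulate sufficient statistics; the chunk itself can be discarded.
        A += V @ H.T
        B += H @ H.T
        for _ in range(n_iter_w):
            # Minimizes sum_t ||V_t - W H_t||^2 using A and B only.
            W *= A / (W @ B + EPS)
    return W, A, B


# Usage on synthetic data: 10 chunks of 40 frames, 20 frequency bins, 3 patterns.
data_rng = np.random.default_rng(1)
W_true = data_rng.random((20, 3))
chunks = [W_true @ data_rng.random((3, 40)) for _ in range(10)]
W, A, B = online_nmf(chunks, n_features=20, n_patterns=3)
print(W.shape)  # (20, 3)
```

In this sketch the patterns are refreshed after every chunk from the accumulated statistics alone, which is what keeps the memory footprint independent of the total amount of data. A convolutive version would, under the same assumptions, need one such pair of statistics per time shift of the patterns.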


Type:
Conference
City:
Florence
Date:
2011-08-28
Department:
Digital Security
Eurecom Ref:
3409
Copyright:
© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, August 28-31, Florence, Italy and is available at:

PERMALINK : https://www.eurecom.fr/publication/3409