Online pattern learning for non-negative convolutive sparse coding accepted for publication

Wang, Dong; Vipperla, Ravichander; Evans, Nicholas

INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, August 28-31, Florence, Italy

The unsupervised learning of spectro-temporal speech patterns is relevant in a broad range of tasks. Convolutive non-negative matrix factorization (CNMF) and its sparse version, convolutive non-negative sparse coding (CNSC), are powerful, related tools. A particular difficulty of CNMF/CNSC, however, is the high demand on computing power and memory, which can prohibit their application to large-scale tasks. In this paper, we propose an online algorithm for CNMF and CNSC, which processes input data piece by piece and updates the learned patterns after processing each piece, using accumulated sufficient statistics. The online CNSC algorithm markedly increases the convergence speed of CNMF/CNSC pattern learning, thereby enabling its application to large-scale tasks.
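The piece-by-piece update with accumulated sufficient statistics can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the non-convolutive special case (patterns of length one, i.e. plain NMF), a Euclidean cost with an optional L1 penalty on the activations, and standard multiplicative updates. The function name `online_nmf` and the parameters `n_patterns`, `n_inner`, and `sparsity` are hypothetical names chosen for this illustration.

```python
import numpy as np

def online_nmf(chunks, n_patterns, n_inner=50, sparsity=0.0, eps=1e-9, seed=0):
    """Learn non-negative patterns W from a stream of data chunks.

    Assumption: non-convolutive model X ~ W @ H with a Euclidean cost.
    Only the statistics A and B are kept between chunks, so memory use
    does not grow with the amount of data seen.
    """
    rng = np.random.default_rng(seed)
    W = None
    for X in chunks:
        X = np.asarray(X, dtype=float)
        if W is None:
            n_features = X.shape[0]
            W = rng.random((n_features, n_patterns)) + eps
            A = np.zeros((n_features, n_patterns))  # accumulates X @ H.T
            B = np.zeros((n_patterns, n_patterns))  # accumulates H @ H.T
        # Estimate the activations of this chunk with the patterns held fixed.
        H = rng.random((n_patterns, X.shape[1])) + eps
        for _ in range(n_inner):
            H *= (W.T @ X) / (W.T @ W @ H + sparsity + eps)
        # Fold the chunk into the sufficient statistics, then discard it.
        A += X @ H.T
        B += H @ H.T
        # Update the patterns from the accumulated statistics only.
        for _ in range(n_inner):
            W *= A / (W @ B + eps)
    return W

# Synthetic non-negative data, split into six pieces along the time axis.
rng = np.random.default_rng(1)
data = rng.random((20, 4)) @ rng.random((4, 300))
chunks = np.array_split(data, 6, axis=1)
W = online_nmf(chunks, n_patterns=4)
```

Because the pattern update depends only on `A` and `B`, each chunk can be dropped once it has been processed. Extending the sketch to the convolutive case would require one such pair of statistics per time shift of the activations.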

Type: Conference

City: Florence

Date: 2011-08-28

Department: Digital Security

Eurecom Ref: 3409

© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, August 28-31, Florence, Italy and is available at :