Features for multimodal emotion recognition : An extensive study

Paleari, Marco; Chellali, Ryad; Huet, Benoit
CIS 2010, IEEE International Conference on Cybernetics and Intelligent Systems, June 28-30, 2010, Singapore

The ability to recognize emotions in natural human communication is known to be very important for humans. In recent years, a considerable number of researchers have investigated techniques allowing computers to replicate this capability by analyzing both prosodic (voice) and facial expressions. The applications of the resulting systems are manifold, ranging from gaming to indexing and retrieval, through chat and health care. To the best of our knowledge, no study has ever reported results comparing the effectiveness of several features for automatic emotion recognition. In this work, we present an extensive study of feature selection for automatic, audio-visual, real-time, and person-independent emotion recognition. More than 300,000 different neural networks have been trained in order to compare the performance of 64 features and 11 different sets of features under 450 different analysis settings. Results show that: 1) to build an optimal emotion recognition system, different emotions should be classified via different features, and 2) different features, in general, require different processing.
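The study's core protocol, training one classifier per candidate feature and scoring each emotion separately to find the best feature per emotion, can be sketched as follows. This is an illustrative toy, not the paper's code: the feature and emotion names, the synthetic data, and the use of a nearest-centroid classifier in place of a neural network are all assumptions made for a self-contained example.

```python
# Hypothetical sketch (not the paper's pipeline): train one tiny classifier
# per candidate feature, score each emotion class separately, and report the
# best-performing feature per emotion. Names and data are illustrative.
import random

random.seed(0)

EMOTIONS = ["anger", "joy", "sadness"]             # illustrative subset
FEATURES = ["pitch_mean", "energy", "mouth_open"]  # illustrative features

def sample(emotion, feature):
    """Synthetic 1-D feature value; each feature separates one emotion well."""
    centers = {
        ("anger", "pitch_mean"): 2.0,
        ("joy", "energy"): 2.0,
        ("sadness", "mouth_open"): 2.0,
    }
    return centers.get((emotion, feature), 0.0) + random.gauss(0, 0.5)

def train_centroids(feature, n=100):
    """One-feature nearest-centroid classifier (stand-in for a neural net)."""
    return {e: sum(sample(e, feature) for _ in range(n)) / n for e in EMOTIONS}

def per_emotion_accuracy(feature, n=200):
    """Score the feature's classifier separately on each emotion class."""
    cents = train_centroids(feature)
    acc = {}
    for e in EMOTIONS:
        hits = 0
        for _ in range(n):
            x = sample(e, feature)
            pred = min(cents, key=lambda c: abs(x - cents[c]))
            hits += pred == e
        acc[e] = hits / n
    return acc

# For each emotion, keep the feature whose classifier recognizes it best.
best = {}
for f in FEATURES:
    for e, a in per_emotion_accuracy(f).items():
        if a > best.get(e, (None, 0.0))[1]:
            best[e] = (f, a)

for e, (f, a) in sorted(best.items()):
    print(f"{e}: best feature = {f} (per-class accuracy {a:.2f})")
```

On this toy data each emotion ends up paired with a different best feature, which is exactly the kind of outcome behind the paper's first conclusion: an optimal recognizer should route different emotions through different features.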


DOI:
Type:
Conference
Date:
2010-06-28
Department:
Data Science
Eurecom Ref:
3106
Copyright:
© 2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PERMALINK: https://www.eurecom.fr/publication/3106