Toward multimodal fusion of affective cues

Paleari, Marco; Lisetti, Christine Laetitia

HCM 2006, 1st International Workshop on Human Centered Multimedia at ACM Multimedia 2006, October 23-27, 2006, Santa Barbara, USA

It has been suggested that, in face-to-face communication, as much as 70% of what people convey is carried by paralanguage that combines multiple modalities (e.g. voice tone and volume, body language). In an attempt to make human-computer interaction more natural and closer to human-human communication, research on the sensory acquisition and interpretation of single modalities of human expression has made steady progress over the last decade. This progress makes artificial sensor fusion of multiple modalities an increasingly important research domain: on the one hand, fusion can improve recognition accuracy when messages are congruent across modalities; on the other hand, it may make it possible to detect messages that are incongruent across modalities (incongruency being itself a message about the nature of the information being conveyed). Accurate interpretation of emotional signals, which are quintessentially multimodal, would hence particularly benefit from multimodal sensor fusion and interpretation algorithms. In this paper we review the state of the art in multimodal fusion and describe one way to implement a generic framework for multimodal emotion recognition. The system is developed within the MAUI framework [31] and Scherer's Component Process Theory (CPT) [49, 50, 51, 24, 52], with the goal of being modular and adaptive. We want the framework to accept different single-modality and multi-modality recognition systems and to automatically adapt the fusion algorithm to find optimal solutions. The system also aims to adapt to channel (and system) reliability.

Type: Conference

City: Santa Barbara

Date: 2006-10-23

Department: Data Science

Eurecom Ref: 2031

© ACM, 2006. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in HCM 2006, 1st International Workshop on Human Centered Multimedia at ACM Multimedia 2006, October 23-27, 2006, Santa Barbara, USA, http://dx.doi.org/10.1145/1178745.1178762