Toward multimodal fusion of affective cues