Page 23 - EURECOM

VideoSense, multimodal processing techniques to detect enriched concepts in video

VideoSense aims to use innovative multimodal processing techniques to detect and recognize enriched concepts (including static, dynamic and emotional aspects) in video sequences. Targeted applications include advertisement selection and video recommendation based on the content of the videos that users are watching. The project builds on the expertise of partners specialized in audio, image, video and text processing to enhance existing descriptors. It uses pivot languages to process multilingual closed captions, active learning techniques to limit dependence on annotated training corpora, efficient detectors for static, dynamic and emotional concepts, and fusion mechanisms to reinforce the classifiers’ results.

In this project, EURECOM is bringing its expertise in video sequence analysis and indexing by developing aspects related to

Kinect revolution: Beyond gaming!

Carmelo Velardo

Nationality: Italian

Contact: carmelo.velardo@eurecom.fr

The RGBD (Red, Green, Blue, and Depth) Kinect camera, originally conceived to allow Natural User Interaction with game consoles and PCs, became the biggest commercial success of the year 2011, with over 10 million units sold in less than five months. Thanks to the individual efforts of PrimeSense, Microsoft, and a team of hackers who founded the OpenKinect project, many hobbyists and researchers have started using the new sensor to increase the potential of many different kinds of applications.

Here at EURECOM, we are exploring the use of Kinect for Biometric and Surveillance applications in the context of projects like VideoID and ActiBio.

The possibility of sensing the 3D environment that surrounds the camera, together with the Kinect’s ability to track a person’s body parts, enables an automatic system that we developed to extract anthropometric measures. These measures are then used to derive Soft Biometric information from a distance.

Based on this information, we can estimate the weight and gender of a subject in front of the camera. To do so, we use a statistical model built from information extracted from the NHANES database, a large American medical database containing the records of more than 27,000 people.
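
As a rough illustration of how a weight estimate can be derived from anthropometric measures, the sketch below fits an ordinary least-squares linear model. The text above only says that a statistical model is used, so the linear form, the choice of height and waist circumference as inputs, the function names, and the toy numbers are all assumptions made for this example. The numbers are synthetic and are not taken from NHANES.

```python
def fit_linear_model(rows, targets):
    """Fit weight = c0 + c1*x1 + ... by ordinary least squares.

    Solves the normal equations (X^T X) c = X^T y with Gaussian
    elimination, so only the standard library is needed.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept term
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef


def predict(coef, row):
    """Apply the fitted model to one set of measurements."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Synthetic (height_cm, waist_cm) measurements and weights in kg.
# Illustrative values only: generated from an arbitrary linear rule,
# not real records and not NHANES data.
toy_rows = [(160, 70), (170, 80), (180, 85), (175, 95), (165, 90)]
toy_weights = [49.0, 64.0, 74.5, 80.5, 70.0]

coef = fit_linear_model(toy_rows, toy_weights)
estimate = predict(coef, (172, 82))  # estimated weight for a new subject
```

In a real system the inputs would be the measures extracted from the Kinect body tracking, and the model would be fitted on a large reference population rather than on five toy rows.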

The approaches enabled by the Kinect’s capabilities, varied in both number and kind, span many application domains. In collaboration with the Centre for Space Human Robotics of the Italian Institute of Technology in Torino, we tried to solve a practical problem faced by cosmonauts once in space. The lack of gravity activates body mechanisms that cause severe losses of bone and lean body mass. This forces cosmonauts to follow a regimen of exercise and diet and to monitor their body mass, which is difficult in weightless conditions. This is why a system that could visually estimate their mass is of interest. The preliminary experiments we conducted to explore whether our research outcomes apply to this problem have shown that this is possible. The performance is already close to that of the systems currently on board the International Space Station.

Other possible applications include, but are not limited to, recognizing someone based on his/her physical measurements, or using the Kinect as a medical support for telemedicine thanks to the capabilities enabled by our automatic weight estimation.

content representation through spatial-temporal descriptors, and the adaptation of commonly used classifiers to these new descriptors.

Project progress is assessed with a video created by our industrial partner from an existing Web service. This enables us to evaluate the impact of our algorithms on a real application. The project also uses the TRECVID campaign as a state-of-the-art benchmark. The algorithms developed in the project are adapted to use computing resources more efficiently, and implemented in the production environment of the industrial partner’s video server. The partner will therefore benefit from these technologies in a real Web service and improve its international competitive edge.

Thesis advisor: Jean-Luc Dugelay

University of origin: Politecnico di Torino

Contact: Bernard.Merialdo@eurecom.fr

multimedia communications

young researcher

Page 23 - EURECOM - RA2011GB