IRIM at TRECVID 2011: Semantic indexing and instance search

Delezoide, Bertrand; Precioso, Frédéric; Gosselin, Philippe; Redi, Miriam; Mérialdo, Bernard; Granjon, Lionel; Pellerin, Denis; Rombaut, Michele; Jégou, Hervé; Vieux, Rémi; Mansencal, Boris; Benois-Pineau, Jenny; Ayache, Stéphane; Safadi, Bahjat; Thollard, Franck; Quénot, Georges; Bredin, Hervé; Cord, Matthieu; Benoît, Alexandre; Lambert, Patrick; Strat, Tiberius; Razik, Joseph; Paris, Sébastien; Glotin, Hervé
TRECVID 2011, 15th International Workshop on Video Retrieval Evaluation, 2011, National Institute of Standards and Technology, Gaithersburg, USA


The IRIM group is a consortium of French teams working on multimedia indexing and retrieval. This paper describes its participation in the TRECVID 2011 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline that computes, for each video shot, a score for the likelihood that the shot contains a target concept. These scores are then used to produce a ranked list of the images or shots most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.1387, which ranked us 5th out of 19 participants. For the instance search task, we used both object-based and frame-based queries. We formulated the query in the standard way, as a comparison of visual signatures: either of the query object with parts of the database (DB) frames, or of the query frame with the DB frames. To produce the visual signatures we also used two approaches: the first is the baseline Bag-Of-Visual-Words (BOVW) model based on the SURF interest point descriptor; the second is a Bag-Of-Regions (BOR) model that extends the traditional notion of a BOVW vocabulary from keypoint-based descriptors to region-based descriptors.
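
As an illustration of the frame-based query described in the abstract, the following is a minimal sketch of comparing Bag-Of-Visual-Words signatures. It is not the IRIM implementation: the toy 2-D descriptors stand in for SURF descriptors, the three-word vocabulary stands in for a learned codebook, cosine similarity is an assumed comparison measure (the abstract does not name one), and all function names are hypothetical.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor
    (squared Euclidean distance)."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[k])),
    )


def bovw_signature(descriptors, vocabulary):
    """Build an L1-normalised histogram of visual-word occurrences."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts.get(k, 0) / total for k in range(len(vocabulary))]


def cosine_similarity(a, b):
    """Cosine similarity of two signatures; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_frames(query_descriptors, frames, vocabulary):
    """Rank DB frames by similarity of their signature to the query signature."""
    query_sig = bovw_signature(query_descriptors, vocabulary)
    scored = [
        (name, cosine_similarity(query_sig, bovw_signature(descs, vocabulary)))
        for name, descs in frames.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed for illustration only): a 3-word vocabulary in 2-D.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
query = [[0.1, 0.0], [0.9, 1.0]]
db_frames = {
    "frame_a": [[0.0, 0.1], [1.0, 0.9]],
    "frame_b": [[0.0, 0.9], [0.1, 1.0]],
}

ranking = rank_frames(query, db_frames, vocabulary)
```

In this toy setting `frame_a` shares both visual words with the query and is ranked first, while `frame_b` shares none. An object-based query would follow the same pattern, with signatures computed over parts of each DB frame rather than over whole frames.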


Type: Conference
City: Gaithersburg
Date: 2011-11-07
Department: Data Science
Eurecom Ref: 3680
Copyright: © NIST. Personal use of this material is permitted. The definitive version of this paper was published in TRECVID 2011, 15th International Workshop on Video Retrieval Evaluation, 2011, National Institute of Standards and Technology, Gaithersburg, USA and is available at:

PERMALINK : https://www.eurecom.fr/publication/3680