Graduate School and Research Center in Digital Sciences

IRIM at TRECVID 2012: Semantic indexing and instance search

Ballas, Nicolas; Labbe, Benjamin; Shabou, Aymen; Le Borgne, Herve; Gosselin, Philippe; Redi, Miriam; Merialdo, Bernard; Jegou, Hervé; Delhumeau, Jonathan; Vieux, Rémi; Mansencal, Boris; Benois-Pineau, Jenny; Ayache, Stéphane; Hamadi, Abdelkader; Safadi, Bahjat; Thollard, Franck; Derbas, Nadia; Quenot, Georges; Bredin, Hervé; Cord, Matthieu; Gao, Boyang; Zhu, Chao; Tang, Yuxing; Dellandréa, Emmanuel; Bichot, Charles-Edmond; Chen, Liming; Benoît, Alexandre; Lambert, Patrick; Strat, Tiberius; Razik, Joseph; Paris, Sébastien; Glotin, Hervé; Trung, Tran Ngo; Petrovska, Dijana; Chollet, Gérard; Stoian, Andrei; Crucianu, Michel

TRECVID 2012, TREC Video Retrieval Evaluation workshop, November 2012, Gaithersburg, MD, United States

The IRIM group is a consortium of French teams working on multimedia indexing and retrieval. This paper describes its participation in the TRECVID 2012 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline to compute scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots that are the most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2378, which ranked us 4th out of 16 participants. For the instance search task, our approach uses two steps. First, the individual methods of the participants are used to compute the similarity between an example image of an instance and the keyframes of a video clip. Then a two-step fusion method is used to combine these individual results and obtain a score for the likelihood that an instance appears in a video clip. These scores are used to produce a ranked list of the clips most likely to contain the queried instance. The best IRIM run has a MAP of 0.1192, which ranked us 29th out of 79 fully automatic runs.
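The fusion-and-ranking idea in the abstract, combining per-descriptor classifier scores into one score per shot and sorting shots by that score, can be illustrated with a minimal sketch. This is not the authors' code: the min-max normalization, the equal weights, and all function names here are illustrative assumptions, standing in for the paper's actual fusion strategies.

```python
# Illustrative sketch (not the IRIM implementation): late fusion of
# per-descriptor concept scores, then ranking shots by fused score.

def normalize(scores):
    """Min-max normalize a list of scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def late_fusion(score_lists, weights):
    """Weighted average of normalized per-descriptor score lists."""
    normed = [normalize(s) for s in score_lists]
    n = len(normed[0])
    return [sum(w * s[i] for w, s in zip(weights, normed)) / sum(weights)
            for i in range(n)]

def rank_shots(shot_ids, fused):
    """Return shot ids sorted by fused score, best first."""
    return [sid for sid, _ in sorted(zip(shot_ids, fused),
                                     key=lambda p: p[1], reverse=True)]

# Hypothetical scores from two descriptors for three shots
color = [0.2, 0.9, 0.5]
sift = [0.1, 0.3, 0.8]
fused = late_fusion([color, sift], weights=[1.0, 1.0])
print(rank_shots(["shot1", "shot2", "shot3"], fused))
# -> ['shot3', 'shot2', 'shot1']
```

A real system would plug in many more descriptors and learn the fusion weights; the ranked list is what evaluation metrics such as Mean (Inferred) Average Precision are computed on.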


Title: IRIM at TRECVID 2012: Semantic indexing and instance search
Type: Conference
Language: English
City: Gaithersburg
Country: UNITED STATES
Date: November 2012
Department: Data Science
Eurecom ref: 4536
Copyright: © NIST. Personal use of this material is permitted. The definitive version of this paper was published in TRECVID 2012, TREC Video Retrieval Evaluation workshop, November 2012, Gaithersburg, MD, United States and is available at:
Bibtex: @inproceedings{EURECOM+4536, year = {2012}, title = {{IRIM} at {TRECVID} 2012: {S}emantic indexing and instance search}, author = {{B}allas, {N}icolas and {L}abbe, {B}enjamin and {S}habou, {A}ymen and {L}e {B}orgne, {H}erve and {G}osselin, {P}hilippe and {R}edi, {M}iriam and {M}erialdo, {B}ernard and {J}egou, {H}erv{\'e} and {D}elhumeau, {J}onathan and {V}ieux, {R}{\'e}mi and {M}ansencal, {B}oris and {B}enois-{P}ineau, {J}enny and {A}yache, {S}t{\'e}phane and {H}amadi, {A}bdelkader and {S}afadi, {B}ahjat and {T}hollard, {F}ranck and {D}erbas, {N}adia and {Q}uenot, {G}eorges and {B}redin, {H}erv{\'e} and {C}ord, {M}atthieu and {G}ao, {B}oyang and {Z}hu, {C}hao and {T}ang, {Y}uxing and {D}elland{r}{\'e}a, {E}mmanuel and {B}ichot, {C}harles-{E}dmond and {C}hen, {L}iming and {B}eno{\^i}t, {A}lexandre and {L}ambert, {P}atrick and {S}trat, {T}iberius and {R}azik, {J}oseph and {P}aris, {S}{\'e}bastien and {G}lotin, {H}erv{\'e} and {T}rung, {T}ran {N}go and {P}etrovska, {D}ijana and {C}hollet, {G}{\'e}rard and {S}toian, {A}ndrei and {C}rucianu, {M}ichel }, booktitle = {{TRECVID} 2012, {TREC} {V}ideo {R}etrieval {E}valuation workshop, {N}ovember 2012, {G}aithersburg, {MD}, {U}nited {S}tates}, address = {{G}aithersburg, {UNITED} {STATES}}, month = {11}, url = {http://www.eurecom.fr/publication/4536} }