This year EURECOM participated in the TRECVID light Semantic Indexing (SIN) task, submitting four different runs for 50 concepts. Our submission builds on the runs submitted to the 2010 SIN task by adding more effective visual features to the third system built last year, the details of which can be found in . Two of our systems target detection based on specific objects.
Our basic run adds densely extracted SIFT features to the feature pool of last year's basic run. Dense SIFT proves effective for concepts such as "Nighttime", for which a conventional LoG- or Hessian-based detector yields very few keypoints. Then, following the third run from last year, we add information from the textual metadata provided with the 2011 video database to the visual features. We improve retrieval by adding two more global descriptors to the visual features: one capturing temporal statistics along a sequence of shots, the other capturing the salient details, or gist, of an image. Finally, we enhance the visual recognition of some semantic concepts through the detection of local objects, such as computer screens or scene text, and through human detection, such as Male or Female persons.
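
To illustrate why dense sampling helps on low-texture frames such as night scenes, the following minimal sketch generates the descriptor locations of a regular grid. The function name, the step, and the patch size are hypothetical and chosen only for illustration; the sampling parameters of the submitted runs are not stated here. The point is that the number of sampled locations depends only on the image size, not on image content, whereas a detector-based approach may return very few keypoints on dark or uniform frames.

```python
def dense_grid_keypoints(width, height, step=8, patch_size=16):
    """Return (x, y) patch centres on a regular grid.

    Hypothetical helper: `step` and `patch_size` are illustrative
    values, not the settings of the system described above.
    Every patch is kept fully inside the image.
    """
    half = patch_size // 2
    # Centres run from `half` to `dimension - half`, inclusive.
    xs = range(half, width - half + 1, step)
    ys = range(half, height - half + 1, step)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    # The count is the same for a bright or a dark frame of this size.
    points = dense_grid_keypoints(320, 240)
    print(len(points), "descriptor locations")
```

A local descriptor such as SIFT would then be computed at each returned location instead of at detector-selected keypoints.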