Entropy based supervised merging for visual categorization

Niaz, Usman; Mérialdo, Bernard
ACIVS 2012, Advanced Concepts for Intelligent Vision Systems, 4 September 2012, Brno University of Technology, Brno, Czech Republic / Also published in LNCS, Volume 7517/2012, Springer

Bag of Visual Words (BoW) is widely regarded as the standard representation of the visual information present in images and is broadly used for retrieval and concept detection in videos. The generation of the visual vocabulary in the BoW framework generally includes a quantization step that clusters the image features into a limited number of visual words. This quantization, achieved through unsupervised clustering, takes no advantage of the relationship between features coming from images belonging to similar concept(s), thus enlarging the semantic gap. We present a new dictionary construction technique that improves the BoW representation by increasing its discriminative power. Our solution is based on a two-step quantization: we start with k-means clustering, followed by a bottom-up supervised clustering that uses the features' label information. Results on the TRECVID 2007 data [8] show improvements with the proposed construction of the BoW. We also give upper bounds on the improvement over the baseline for the retrieval rate of each concept using the best supervised merging criteria.
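
The second step of the two-step quantization can be illustrated with a minimal sketch. The abstract does not state the exact merging criterion, so the code below assumes one plausible choice: at each step, merge the pair of visual words whose union causes the smallest increase in size-weighted label entropy. The function names (`label_entropy`, `supervised_merge`) and all parameters are hypothetical, and the initial word assignments are assumed to come from a k-means run that is not shown here.

```python
import numpy as np

def label_entropy(counts):
    """Shannon entropy (in bits) of the label distribution of one cluster."""
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total
    return float(-(p * np.log2(p)).sum())

def supervised_merge(assignments, labels, n_words, n_labels, target_words):
    """Greedy bottom-up merging of visual words using feature labels.

    assignments: initial visual word index per feature (assumed from k-means)
    labels: concept label index per feature
    Returns an array mapping each initial word index to its merged word index.
    Assumed criterion: smallest increase in size-weighted label entropy.
    """
    # Count how often each label occurs in each initial visual word.
    counts = np.zeros((n_words, n_labels))
    np.add.at(counts, (assignments, labels), 1)

    groups = [[w] for w in range(n_words)]
    rows = [counts[w].copy() for w in range(n_words)]

    while len(groups) > target_words:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                merged = rows[i] + rows[j]
                # Entropy increase caused by merging clusters i and j.
                cost = (merged.sum() * label_entropy(merged)
                        - rows[i].sum() * label_entropy(rows[i])
                        - rows[j].sum() * label_entropy(rows[j]))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        groups[i] += groups[j]
        rows[i] = rows[i] + rows[j]
        del groups[j], rows[j]

    mapping = np.empty(n_words, dtype=int)
    for new_idx, members in enumerate(groups):
        mapping[members] = new_idx
    return mapping
```

Given the returned `mapping`, the merged word of each feature is `mapping[assignments]`, from which the BoW histograms can be rebuilt. The exhaustive pair search is quadratic in the number of words per merge step, which is acceptable for a sketch but would likely need a cheaper candidate selection for large vocabularies.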


DOI: 10.1007/978-3-642-33140-4_37
Type: Conference
City: Brno
Date: 2012-09-04
Department: Data Science
Eurecom Ref: 3758
Copyright:
© Springer. Personal use of this material is permitted. The definitive version of this paper was published in ACIVS 2012, Advanced Concepts for Intelligent Vision Systems, 4 September 2012, Brno University of Technology, Brno, Czech Republic / Also published in LNCS, Volume 7517/2012, Springer, and is available at: http://dx.doi.org/10.1007/978-3-642-33140-4_37

PERMALINK: https://www.eurecom.fr/publication/3758