
Exploring two spaces with one feature: Kernelized multidimensional modeling of visual alphabets

Redi, Miriam; Mérialdo, Bernard

ICMR 2012, ACM International Conference on Multimedia Retrieval, June 5-8, Hong Kong, China

Marginal Alphabets (MEDA) were proposed as an alternative to Bag of Words (BoW) for image representation. They aggregate sets of locally extracted descriptors (LEDs) using visual alphabets based on the marginal approximation of the LED components. Compared to the exponential complexity of BoW codebooks, the MEDA model is very efficient because each dimension of the LED is quantized independently. However, MEDA does not consider the relations between the LED components, losing precious information for image representation. In this paper, we design Multi-MEDA, a shift-invariant kernel for MEDA signatures that reintroduces, at the kernel level, the connections between LED components that were broken by the independent quantization. With our approach, we can derive in polynomial time a multivariate model from the marginal approximations stored in the MEDA vector, without explicitly computing any multidimensional codebook. Results show that the MEDA signature increases its discriminative power when analyzed through the Multi-MEDA kernel evaluation. Moreover, we show that the model generated by the Multi-MEDA-based learning brings complementary information compared to traditional kernels over MEDA and BoW signatures: our experiments on the TRECVID database show that the combination of these approaches brings a substantial improvement compared to BoW-only classification.
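To make the idea of independent per-dimension quantization concrete, here is a minimal sketch of a marginal signature in plain Python. It is an illustration only, not the authors' implementation: the function name `marginal_signature`, the uniform bins over a fixed range, and the toy data are all assumptions, and the paper's actual alphabet construction may differ. The Multi-MEDA kernel itself is not sketched, since the abstract does not give its form.

```python
from typing import List, Sequence


def marginal_signature(descriptors: Sequence[Sequence[float]],
                       n_bins: int, lo: float, hi: float) -> List[float]:
    """Concatenate one normalized histogram per descriptor dimension.

    Assumption: uniform bins over [lo, hi]; this is a stand-in for the
    paper's alphabets, whose construction is not specified here.
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    n_dims = len(descriptors[0])
    width = (hi - lo) / n_bins
    counts = [[0] * n_bins for _ in range(n_dims)]
    for desc in descriptors:
        for d, value in enumerate(desc):
            # Each dimension is binned on its own, ignoring the others.
            b = int((value - lo) / width)
            # Clamp so out-of-range values land in the edge bins.
            b = min(max(b, 0), n_bins - 1)
            counts[d][b] += 1
    total = len(descriptors)
    return [c / total for dim in counts for c in dim]


# Three toy 2-D descriptors, 4 bins per dimension over [0, 1].
leds = [(0.1, 0.9), (0.2, 0.8), (0.6, 0.1)]
signature = marginal_signature(leds, n_bins=4, lo=0.0, hi=1.0)
print(len(signature))  # 2 dims * 4 bins = 8 entries
```

With D dimensions and n bins each, this signature has D * n entries, whereas a joint grid over all dimensions would need n ** D cells. The cost of that saving is that co-occurrences between dimensions are no longer recorded, which is the information the abstract says the kernel aims to recover.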


Title: Exploring two spaces with one feature: Kernelized multidimensional modeling of visual alphabets
Keywords: Scene Recognition, Feature Extraction, CBIR
Type: Conference
Language: English
City: Hong Kong
Country: CHINA
Date:
Department: Data Science
Eurecom ref: 3681
Copyright: © ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ICMR 2012, ACM International Conference on Multimedia Retrieval, June 5-8, Hong Kong, China http://dx.doi.org/10.1145/2324796.2324821
Bibtex:
@inproceedings{EURECOM+3681,
  doi = {http://dx.doi.org/10.1145/2324796.2324821},
  year = {2012},
  title = {{E}xploring two spaces with one feature: {K}ernelized multidimensional modeling of visual alphabets},
  author = {{R}edi, {M}iriam and {M}{\'e}rialdo, {B}ernard},
  booktitle = {{ICMR} 2012, {ACM} {I}nternational {C}onference on {M}ultimedia {R}etrieval, {J}une 5-8, {H}ong {K}ong, {C}hina},
  address = {{H}ong {K}ong, {CHINA}},
  month = {06},
  url = {http://www.eurecom.fr/publication/3681}
}