Exploring two spaces with one feature: Kernelized multidimensional modeling of visual alphabets

Redi, Miriam; Mérialdo, Bernard

ICMR 2012, ACM International Conference on Multimedia Retrieval, June 5-8, Hong Kong, China

Marginal Alphabets (MEDA) were proposed as an alternative to Bag of Words (BoW) for image representation. They aggregate sets of locally extracted descriptors (LEDs) by using visual alphabets based on the marginal approximation of the LED components. Compared to the exponential complexity of the BoW codebooks, the MEDA model is very efficient because each dimension of the LED is quantized independently. However, MEDA does not consider the relations between the LED components, and so loses precious information for image representation. In this paper, we design Multi-MEDA, a shift-invariant kernel for MEDA signatures that reintroduces, at the kernel level, the connections between LED components that were broken by the independent quantization. With our approach, we can derive in polynomial time a multivariate model from the marginal approximations stored in the MEDA vector, without explicitly computing any multidimensional codebook. Results show that the MEDA signature increases its discriminative power when analyzed through the Multi-MEDA kernel evaluation. Moreover, we show that the model generated by the Multi-MEDA-based learning brings complementary information compared to traditional kernels over MEDA and BoW signatures: our experiments on the TRECVID database show that the combination of these approaches brings a substantial improvement compared to BoW-only classification.
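The independent, per-dimension quantization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `marginal_signature`, the uniform bin edges over [0, 1], the normalization, and the toy random descriptors are all assumptions made for illustration. The comparison with a full joint grid is likewise only a rough way to picture why a representation built from marginals grows linearly with the descriptor dimension while a joint one can grow exponentially.

```python
import random


def marginal_signature(descriptors, n_bins, lo=0.0, hi=1.0):
    """Quantize each descriptor dimension independently and concatenate
    the per-dimension histograms, normalized by the descriptor count.

    Hypothetical helper: uniform bins over [lo, hi] are an assumption.
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    n_dims = len(descriptors[0])
    width = (hi - lo) / n_bins
    counts = [[0] * n_bins for _ in range(n_dims)]
    for descriptor in descriptors:
        for j, value in enumerate(descriptor):
            # Clamp the bin index so values at or beyond the range
            # boundaries fall into the first or last bin.
            b = int((value - lo) / width)
            b = max(0, min(b, n_bins - 1))
            counts[j][b] += 1
    total = len(descriptors)
    return [c / total for dim_counts in counts for c in dim_counts]


# Toy data: 200 hypothetical 8-dimensional descriptors with values in [0, 1).
rng = random.Random(0)
leds = [[rng.random() for _ in range(8)] for _ in range(200)]
signature = marginal_signature(leds, n_bins=4)

marginal_size = len(signature)  # n_dims * n_bins entries
joint_grid_size = 4 ** 8        # cells in a full grid over all 8 dimensions
print(marginal_size, joint_grid_size)
```

With 8 dimensions and 4 bins per dimension, the concatenated marginal histograms have 32 entries, whereas a grid that quantizes all dimensions jointly would have 65536 cells. The sketch also shows what is lost: the signature records how often each value range occurs in each dimension, but not which values co-occur within the same descriptor.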

Type: Conference
City: Hong Kong
Date: 2012-06-05
Department: Data Science
Eurecom Ref: 3681

© ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ICMR 2012, ACM International Conference on Multimedia Retrieval, June 5-8, Hong Kong, China http://dx.doi.org/10.1145/2324796.2324821