Latent semantic indexing for semantic content detection of video shots

Souvannavong, Fabrice; Mérialdo, Bernard; Huet, Benoit
ICME 2004, IEEE International Conference on Multimedia and Expo, June 27-30, 2004, Taipei, Taiwan

Low-level features alone are no longer sufficient to build efficient content-based retrieval systems. Users are no longer interested in retrieving merely visually similar content; they expect retrieval systems to find documents with similar semantic content. Bridging the gap between low-level features and semantic content is a challenging task that future retrieval systems must address. Latent Semantic Indexing (LSI) was successfully introduced to index text documents efficiently. In this paper we propose to adapt this technique to represent the visual content of video shots for semantic content detection. Although we restrict our approach to visual features, it can be extended with minor changes to audio and motion features to build a multi-modal system. The semantic content is then detected with two classifiers: a k-nearest-neighbour classifier and a neural network. Finally, in the experimental section we report the performance of each classifier and the performance gain obtained with LSI features compared to traditional features.
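The abstract only outlines the pipeline, but its core idea can be sketched: build a feature-occurrence matrix whose columns are quantized visual-feature histograms of video shots, compute a truncated SVD to obtain an LSI latent space, fold new shots into that space, and classify them with a k-nearest-neighbour vote. The sketch below is an illustration under assumptions, not the authors' implementation: the toy Poisson histograms, the latent dimensionality k, the cosine similarity measure and the neighbour count are all placeholders chosen for the example.

```python
import numpy as np


def lsi_project(X, k):
    """Rank-k LSI projection of a feature-by-shot matrix X.

    X has shape (n_features, n_shots): each column is the quantized
    visual-feature occurrence vector of one video shot.
    """
    # Truncated SVD: X ~= U_k @ diag(S_k) @ Vt_k
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    U_k, S_k = U[:, :k], S[:k]
    # Latent coordinates of the training shots: S_k^-1 @ U_k^T @ X
    shots_latent = (U_k.T @ X) / S_k[:, None]
    return U_k, S_k, shots_latent


def fold_in(U_k, S_k, q):
    """Project a new shot's feature vector q into the latent space."""
    return (U_k.T @ q) / S_k


def knn_predict(train_latent, train_labels, q_latent, n_neighbors=5):
    """k-NN vote using cosine similarity in the latent space."""
    train = train_latent / np.linalg.norm(train_latent, axis=0, keepdims=True)
    q = q_latent / np.linalg.norm(q_latent)
    sims = train.T @ q
    nearest = np.argsort(sims)[-n_neighbors:]          # indices of most similar shots
    return np.bincount(train_labels[nearest]).argmax()  # majority vote


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 500-bin feature histograms for 40 shots, two semantic concepts
    X = rng.poisson(1.0, size=(500, 40)).astype(float)
    y = np.array([0] * 20 + [1] * 20)
    U_k, S_k, latent = lsi_project(X, k=10)
    query = rng.poisson(1.0, size=500).astype(float)
    print("predicted concept:", knn_predict(latent, y, fold_in(U_k, S_k, query)))
```

A neural-network classifier, as mentioned in the abstract, would simply replace the k-NN vote with a network trained on the same latent shot vectors.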


Type:
Conference
City:
Taipei
Date:
2004-06-27
Department:
Data Science
Eurecom Ref:
1404
Copyright:
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PERMALINK : https://www.eurecom.fr/publication/1404