Video content modeling with latent semantic analysis

Souvannavong, Fabrice; Mérialdo, Bernard; Huet, Benoit

CBMI 2003, 3rd International Workshop on Content-Based Multimedia Indexing, September 22-24, 2003, Rennes, France

In this paper we present a novel approach to fully automatic video content modelling. We introduce the concept of a visual dictionary to describe the visual elements, called words, that appear throughout video sequences. Their co-occurrences in contexts, i.e. the main video entities to be indexed (such as a frame, shot or scene), compose signatures usable for indexing and comparison. Latent Semantic Analysis (LSA) is naturally introduced to improve robustness to noise and to discover the latent semantics. This new representation, along with its associated similarity measure, has many applications, including indexing, retrieval, summarization and enhanced navigation, on single as well as multiple video sequences. Once the framework is presented, we investigate three methods to efficiently exploit the information provided by multiple features in order to improve the video analysis. Promising results were obtained on object and frame retrieval tasks within a single video document.
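As a rough illustration of the idea in the abstract, the sketch below applies LSA to a small matrix of visual-word counts per context and compares the resulting signatures. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `lsa_signatures` and `cosine_similarity`, the toy counts, the number of latent dimensions `k`, and the use of cosine similarity as the comparison measure are all assumptions made for this example.

```python
import numpy as np


def lsa_signatures(counts, k):
    """Project contexts onto the top-k latent dimensions (hypothetical helper).

    counts: array of shape (n_words, n_contexts); counts[i, j] is the number
    of times visual word i occurs in context j (e.g. a frame or a shot).
    Returns an array of shape (n_contexts, k), one signature per context.
    """
    # Thin SVD: counts = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    # Keep the k strongest dimensions and scale each by its singular value.
    return vt[:k].T * s[:k]


def cosine_similarity(a, b):
    """Cosine of the angle between two signatures (assumed similarity measure)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Toy data (assumed): 4 visual words x 4 contexts.
# Contexts 0 and 1 use words 0-1; contexts 2 and 3 use words 2-3.
counts = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])

signatures = lsa_signatures(counts, k=2)
same_group = cosine_similarity(signatures[0], signatures[1])
other_group = cosine_similarity(signatures[0], signatures[2])
```

In this toy case the two contexts that share visual words end up with nearly identical signatures, while contexts with disjoint words are close to orthogonal. Truncating to `k` dimensions is the step that discards the weaker, noisier directions of the count matrix.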

Type: Conference
City: Rennes
Date: 2003-09-22
Department: Data Science
Eurecom Ref: 1313

© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.