Media Search and Retrieval

Issuing a textual query to search within a media collection is a task familiar to Internet users nowadays. The systems performing this search usually rely on the available metadata (e.g. the title or tags provided by the content creator or by users) or, more recently but still seldom, on the actual content of the media item (e.g. the transcript of a video, visual concepts, etc.). The link between the textual description of the query provided by users, or of the required visual content, and the multimodal features that can be automatically extracted for all the media items in the collection has not yet been thoroughly investigated.

As part of our efforts to address users' needs in such interactive systems, we pay particular attention to the intention gap, which arises from the difficulty the retrieval system faces in accurately interpreting the user's query. Indeed, the query formulated by the user is in natural language, while the search is performed over multiple system-specific dimensions. To this end, we are investigating novel automatic approaches to map the information provided by the user in the form of a textual description (the query) to the metadata available for the corpus (e.g. aligning visual cues present in the query with visual concept detectors), as sketched below.
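As a minimal illustration of such a mapping, the Python sketch below ranks the visual concept detectors of a corpus against the terms of a textual query by cosine similarity in a shared embedding space. The vocabulary, the embedding vectors, the detector labels and the 0.8 threshold are all hypothetical stand-ins introduced for this example; in a real system the embeddings would come from a pretrained text-embedding model and the labels from the actual detector bank.

import numpy as np

# Hypothetical term embeddings; a real system would obtain these
# from a pretrained text-embedding model (e.g. word2vec).
VOCAB = {
    "beach":  np.array([0.9, 0.1, 0.0]),
    "sea":    np.array([0.8, 0.3, 0.1]),
    "crowd":  np.array([0.1, 0.9, 0.2]),
    "person": np.array([0.2, 0.8, 0.3]),
    "indoor": np.array([0.0, 0.2, 0.9]),
}

# Labels of the visual concept detectors assumed to be available
# for the media collection.
DETECTORS = ["sea", "person", "indoor"]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def map_query_to_detectors(query, threshold=0.8):
    # For each known query term, keep the detectors whose label
    # embedding lies close enough to the term embedding.
    mapping = {}
    for term in query.lower().split():
        if term not in VOCAB:
            continue  # out-of-vocabulary words are ignored
        scores = {d: cosine(VOCAB[term], VOCAB[d]) for d in DETECTORS}
        mapping[term] = [d for d, s in scores.items() if s >= threshold]
    return mapping

print(map_query_to_detectors("crowd on a beach"))
# -> {'crowd': ['person'], 'beach': ['sea']}

Thresholding on similarity, rather than always keeping the top-ranked detector, lets the system abstain when no available detector matches a query term, which keeps the intention gap explicit instead of silently mis-mapping the query.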
