Tell me why? Ain't nothin' but a mistake? Describing media item differences with Media Fragments URI and speech synthesis

Steiner, Thomas; Troncy, Raphaël
ICME 2013, IEEE International Workshop on Media fragment creation and reMIXing (MMIX 2013), 15-19 July 2013, San Jose, CA, USA

We have developed a tile-wise histogram-based media item deduplication algorithm with additional high-level semantic matching criteria that is tailored to photos and videos gathered from multiple social networks. In this paper, we investigate whether the Media Fragments URI addressing scheme together with a natural language generation framework realized through a text-to-speech system provides a feasible and practicable way to visually and audially describe the differences between media items of type photo and/or video, so that human-friendly debugging of the deduplication algorithm is made possible. A short screencast illustrating the approach is available online at
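As a rough illustration of how individual tiles of a photo or video frame could be addressed with Media Fragments URI, the following minimal sketch builds one spatial fragment (`#xywh=x,y,w,h`, in pixels) per tile and a temporal fragment (`#t=start,end`, in seconds). The function names, the grid size, and the example URLs are assumptions made for illustration; this is not the authors' implementation, and the abstract does not state how the tile grid is chosen.

```python
from typing import Iterator


def tile_fragment_uris(media_url: str, width: int, height: int,
                       rows: int, cols: int) -> Iterator[str]:
    """Yield one spatial Media Fragments URI per tile of a rows x cols grid.

    Hypothetical helper: tiles are laid out row by row, and any remainder
    pixels on the right and bottom edges are ignored for simplicity.
    """
    tile_w = width // cols
    tile_h = height // rows
    for row in range(rows):
        for col in range(cols):
            x = col * tile_w
            y = row * tile_h
            # Spatial dimension of Media Fragments URI: #xywh=x,y,w,h (pixels)
            yield f"{media_url}#xywh={x},{y},{tile_w},{tile_h}"


def temporal_fragment_uri(media_url: str, start: float, end: float) -> str:
    """Build a temporal Media Fragments URI: #t=start,end (seconds)."""
    return f"{media_url}#t={start:g},{end:g}"


if __name__ == "__main__":
    # Assumed example: a 200x100 photo split into a 2x2 grid.
    for uri in tile_fragment_uris("http://example.org/photo.jpg", 200, 100, 2, 2):
        print(uri)
    print(temporal_fragment_uri("http://example.org/video.webm", 10, 20))
```

A tile whose histogram differs between two media items could then be referenced by its fragment URI, which is one plausible way to point at the region that a spoken description refers to.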

© 2013 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.