Seamless navigation in audio files

Wellekens, Christian J
ODYSSEY 2001: a speaker odyssey, June 18-22, 2001, Crete, Greece

New audio services require editing tools for audio files. Indexing offers fast access to specific information such as speaker identity, the location of a speaker's interventions in the file, or topic identification. Good editing tools for text files have been available for many years, so one route to seamless navigation in an audio file is to recognize the content of the file to be edited (speech to text). In general, however, this requires large-vocabulary, speaker-independent recognizers, which give acceptable results only for cooperative speakers who restrict their speech to a domain for which a language model can be learned. Even in that case, detecting musical segments, spotting the interventions of a given speaker, and segmenting the file by speaker remain interesting challenges. Mastering the complete set of indexing techniques will open the market for appealing consumer applications that produce audio (and also video) programs on demand. Access to multimedia databases and multimedia archives will clearly become easier.

Invited paper in a conference
Digital security
Eurecom Ref:
© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in ODYSSEY 2001: a speaker odyssey, June 18-22, 2001, Crete, Greece and is available at :