Speaker-based segmentation for audio data indexing

Delacourt, Perrine; Kryze, David; Wellekens, Christian J
ISCA Workshop on accessing information in audio data, April 19-20, 1999, Cambridge, UK

In this paper, we address the problem of speaker-based segmentation, which is the first necessary step for several indexing tasks. It consists in recognizing, from their voices, the sequence of people engaged in a conversation. In our context, we make no assumptions about prior knowledge of the speaker characteristics (no speaker model, no speech model, no training phase). However, we assume that people do not speak simultaneously. Our segmentation technique takes advantage of two different types of segmentation algorithms. It is organized in two passes: first, the most likely speaker change points are detected, and then they are validated or discarded. Our algorithm detects speaker change points efficiently even when they are close to one another, and is thus suited to segmenting conversations containing segments of any length.
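
The two-pass organization described in the abstract can be illustrated with a minimal sketch. The abstract does not name the distance measure used in the first pass or the criterion used in the second, so the choices below are assumptions made only for illustration: a Gaussian likelihood-ratio distance between adjacent sliding windows for detection, and a BIC-style penalized test for validation, both applied to one-dimensional features. All function names and parameters (`detect_candidates`, `validate_candidates`, `win`, `penalty_weight`) are hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_cost(xs):
    """N * log(variance) of a 1-D Gaussian fit (log-likelihood up to constants)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return n * math.log(max(var, 1e-12))  # floor avoids log(0) on constant data


def glr_distance(left, right):
    """Assumed distance: gain from modelling two windows separately vs. jointly."""
    return 0.5 * (gaussian_cost(left + right)
                  - gaussian_cost(left) - gaussian_cost(right))


def detect_candidates(features, win=50):
    """Pass 1: keep local maxima of the distance between adjacent windows."""
    dist = {t: glr_distance(features[t - win:t], features[t:t + win])
            for t in range(win, len(features) - win + 1)}
    candidates = []
    for t in dist:
        neighbours = [dist[u] for u in range(t - win // 2, t + win // 2 + 1)
                      if u in dist]
        if dist[t] == max(neighbours):
            candidates.append(t)
    return candidates


def validate_candidates(features, candidates, penalty_weight=1.0):
    """Pass 2: keep a candidate only if a BIC-style test favours two models."""
    kept = []
    bounds = candidates + [len(features)]
    start = 0
    for i, t in enumerate(candidates):
        left, right = features[start:t], features[t:bounds[i + 1]]
        # Two extra free parameters (mean, variance) when splitting in 1-D.
        penalty = penalty_weight * 0.5 * 2 * math.log(len(left) + len(right))
        if glr_distance(left, right) - penalty > 0:
            kept.append(t)
            start = t  # a rejected candidate merges its segment into the left one
    return kept


# Synthetic "conversation": two speakers simulated by a shift in feature mean.
rng = random.Random(0)
features = ([rng.gauss(0.0, 1.0) for _ in range(300)]
            + [rng.gauss(5.0, 1.0) for _ in range(300)])
candidates = detect_candidates(features)
change_points = validate_candidates(features, candidates)
print(candidates, change_points)
```

In this sketch the first pass deliberately over-generates candidates, and the second pass merges the segments around any rejected candidate, so the amount of data available for each validation test grows as false candidates are discarded.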

Type: Conference
City: Cambridge
Date: 1999-04-19
Department: Digital Security
Eurecom Ref: 176
Copyright: © ISCA. Personal use of this material is permitted. The definitive version of this paper was published in ISCA Workshop on accessing information in audio data, April 19-20, 1999, Cambridge, UK and is available at :

PERMALINK : https://www.eurecom.fr/publication/176