A corpus-based approach to video indexing for TV news