Baseline systems for the first spoofing-aware speaker verification challenge: Score and embedding fusion

Shim, Hye-jin; Tak, Hemlata; Liu, Xuechen; Heo, Hee-Soo; Jung, Jee-weon; Chung, Joon Son; Chung, Soo-Whan; Yu, Ha-Jin; Lee, Bong-Jin; Todisco, Massimiliano; Delgado, Héctor; Lee, Kong Aik; Sahidullah, Md; Kinnunen, Tomi; Evans, Nicholas
ODYSSEY 2022, The Speaker and Language Recognition Workshop, June 28th-July 1st, 2022, Beijing, China

Deep learning has brought impressive progress in the study of both automatic speaker verification (ASV) and spoofing countermeasures (CM). Although solutions are mutually dependent, they have typically evolved as standalone sub-systems whereby CM solutions are usually designed for a fixed ASV system. The work reported in this paper aims to gauge the improvements in reliability that can be gained from their closer integration. Results derived using the popular ASVspoof2019 dataset indicate that the equal error rate (EER) of a state-of-the-art ASV system degrades from 1.63% to 23.83% when the evaluation protocol is extended with spoofed trials. However, even the straightforward integration of ASV and CM systems in the form of score-sum and deep neural network-based fusion strategies reduces the EER to 1.71% and 6.37%, respectively. The new Spoofing-Aware Speaker Verification (SASV) challenge has been formed to encourage greater attention to the integration of ASV and CM systems as well as to provide a means to benchmark different solutions.
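
As a rough illustration of the score-sum idea and of how an EER can be measured once spoofed trials are added to the non-target set, the following is a minimal sketch only. The function names, the toy scores, and the threshold-sweep EER approximation are all assumptions made for this example; they are not the challenge baseline code, and any score normalisation used in the paper is not reproduced here.

```python
def score_sum_fusion(asv_scores, cm_scores):
    """Fuse per-trial ASV and CM scores by simple addition (hypothetical helper)."""
    if len(asv_scores) != len(cm_scores):
        raise ValueError("ASV and CM score lists must have the same length")
    return [a + c for a, c in zip(asv_scores, cm_scores)]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the mean of the false-acceptance and false-rejection rates at the
    threshold where the two rates are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # A trial is accepted when its score is >= the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for illustration (higher = more likely to be accepted).
# Target trials: correct speaker, bona fide speech.
asv_target, cm_target = [0.9, 0.8, 0.85], [0.9, 0.95, 0.8]
# Zero-effort impostors: wrong speaker, bona fide speech.
asv_impostor, cm_impostor = [0.1, 0.2], [0.9, 0.85]
# Spoofed trials: sound like the target speaker, but are not bona fide.
asv_spoof, cm_spoof = [0.85, 0.9], [0.05, 0.1]

# Non-target set extended with spoofed trials, as in a spoofing-aware protocol.
asv_nontarget = asv_impostor + asv_spoof
cm_nontarget = cm_impostor + cm_spoof

# The ASV score alone cannot separate spoofed trials from target trials.
asv_only_eer = equal_error_rate(asv_target, asv_nontarget)

# Adding the CM score pushes spoofed trials down relative to target trials.
fused_eer = equal_error_rate(
    score_sum_fusion(asv_target, cm_target),
    score_sum_fusion(asv_nontarget, cm_nontarget),
)
```

On these invented scores the ASV-only system confuses spoofed trials with target trials, while the summed score separates them. Real systems would typically need the two scores on comparable scales before summation, which this sketch does not address.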

 

Type: Conference
City: Beijing
Date: 2022-06-28
Department: Digital Security
Eurecom Ref: 6880
Copyright: © ISCA. Personal use of this material is permitted. The definitive version of this paper was published in ODYSSEY 2022, The Speaker and Language Recognition Workshop, June 28th-July 1st, 2022, Beijing, China and is available at: http://dx.doi.org/10.21437/Odyssey.2022-46

PERMALINK: https://www.eurecom.fr/publication/6880