Wanying Ge, Massimiliano Todisco, Nicholas Evans
ODYSSEY 2022, The Speaker and Language Recognition Workshop, 28 June-1 July 2022, Beijing, China
Abstract: Despite several years of research in deepfake and spoofing detection for automatic speaker verification, little is known about the artefacts that classifiers use to distinguish between bona fide and spoofed utterances. An understanding of these artefacts is crucial to the design of trustworthy, explainable solutions. In this paper we report an extension of our previous work on understanding classifier behaviour: the use of SHapley Additive exPlanations (SHAP) for attack analysis. Our goal is to identify the artefacts that characterise utterances generated by different attack algorithms. Using a pair of classifiers, one operating upon raw waveforms and the other upon magnitude spectrograms, we show that visualisations of SHAP results can be used to identify attack-specific artefacts, as well as the differences and consistencies between synthetic speech and converted voice spoofing attacks.