Vocoder drift in x-vector–based speaker anonymization

Panariello, Michele; Todisco, Massimiliano; Evans, Nicholas
INTERSPEECH 2023, 24th Conference of the International Speech Communication Association, 20-24 August 2023, Dublin, Ireland

State-of-the-art approaches to speaker anonymization typically employ some form of perturbation function to conceal speaker information contained within an x-vector embedding, then resynthesize utterances in the voice of a new pseudo-speaker using a vocoder. Strategies to improve the x-vector anonymization function have attracted considerable research effort, whereas vocoder impacts are generally neglected. In this paper, we show that the impact of the vocoder is substantial and sometimes dominant. The vocoder drift, namely the difference between the x-vector vocoder input and that which can be extracted subsequently from the output, is learnable and can hence be reversed by an attacker; anonymization can be undone and the level of privacy protection provided by such approaches might be weaker than previously thought. The findings call into question the focus upon x-vector anonymization, prompting the need for greater attention to vocoder impacts and stronger attack models alike.
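To make the notion of vocoder drift concrete, the following is a minimal sketch on synthetic data, not the authors' experimental setup. It assumes, purely for illustration, that the drift behaves like an affine distortion plus small noise; the arrays `x_in` and `x_out`, the toy dimensions, and the least-squares attacker are all hypothetical stand-ins. It shows how a drift that is systematic can be learned from paired examples and then approximately inverted.

```python
import numpy as np


def cosine_distance(a, b):
    """Row-wise cosine distance between two (n, d) arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return 1.0 - num / den


rng = np.random.default_rng(0)
n, d = 2000, 16      # toy sizes chosen for speed (assumption)
n_train = 1500

# Hypothetical stand-ins:
#   x_in  = x-vectors given to the vocoder as input
#   x_out = x-vectors re-extracted from the vocoder output
x_in = rng.normal(size=(n, d))
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))  # assumed systematic distortion
b = rng.normal(size=d)                         # assumed constant offset
x_out = x_in @ A + b + 0.05 * rng.normal(size=(n, d))

# Vocoder drift: difference between what comes out and what went in.
drift = x_out - x_in

# Illustrative attacker: fit an affine map from x_out back to x_in
# by ordinary least squares on paired training examples.
X_train = np.hstack([x_out[:n_train], np.ones((n_train, 1))])
W, *_ = np.linalg.lstsq(X_train, x_in[:n_train], rcond=None)

# Apply the learned inverse mapping to held-out examples.
X_test = np.hstack([x_out[n_train:], np.ones((n - n_train, 1))])
x_rec = X_test @ W

before = cosine_distance(x_in[n_train:], x_out[n_train:]).mean()
after = cosine_distance(x_in[n_train:], x_rec).mean()
print(f"mean cosine distance to vocoder input, before reversal: {before:.3f}")
print(f"mean cosine distance to vocoder input, after reversal:  {after:.3f}")
```

In this toy setting the held-out distance to the vocoder input shrinks once the learned mapping is applied, which is the sense in which a learnable drift can be reversed. Real drift need not be affine, and a practical attacker would likely need a more expressive model.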


Type: Conference
City: Dublin
Date: 2023-08-20
Department: Digital Security
Eurecom Ref: 7330
Copyright:
© ISCA. Personal use of this material is permitted. The definitive version of this paper was published in INTERSPEECH 2023, 24th Conference of the International Speech Communication Association, 20-24 August 2023, Dublin, Ireland, and is available at: http://dx.doi.org/10.21437/Interspeech.2023-448

PERMALINK: https://www.eurecom.fr/publication/7330