Multi-attribute suppression for speech privacy preservation

Date: 07-2024
Reference: PhD Position – Thesis offer M/F (Reference: SN/NE/PhD/SPPR/072024)

Description

Recordings of speech contain much more than the spoken content (words). They also convey, e.g., the voice identity, sex, health and emotional state, ethnic origin, geographical background, social identity and socio-economic status. Given the potential for such personal, private information to be estimated from speech data and then used for nefarious purposes, we are in need of privacy preservation solutions tailored to the speech medium. Moving far beyond studies of voice anonymisation, which aim to obfuscate only the voice identity [1], the SpeechPrivacy project, funded by the French National Research Agency (ANR), is working to develop a flexible solution to privacy preservation based on isolated/disentangled representations and the selective obfuscation/modification of multiple attributes. Such a solution would enable, e.g., the user of a smart speech technology service to choose for themselves which privacy-sensitive attributes should or should not be provided to the service provider. The user will be able to select the attributes to be disclosed and those to be protected, so that speech recordings sanitised of the selected privacy-sensitive attributes can then be entrusted to other parties without endangering user privacy.

EURECOM offers a fully-funded PhD position to work on the development of a framework for adversarial disentanglement and multiple attribute obfuscation. The work will begin with a study of single attribute isolation and representation. Before moving to multiple attribute obfuscation, we will first need to understand the impact of single attribute obfuscation upon other, potentially entangled attributes. This preliminary work will enable us to establish upper bounds on the potential for disentanglement. We will then investigate encoder-decoder frameworks with which an input speech signal can be sanitised of information relating to a set of selected, sensitive attributes. The idea is to build a bank of workers, each tasked with learning representations for a set of privacy-sensitive attributes. Each worker can then be combined with a set of auxiliary adversarial co-workers to encourage the learning of disentangled representations, from which we can synthesise an output speech signal sanitised of the selected attributes, e.g. a new speech signal without age-related or sex-related information.

The successful candidate will join the Audio Security and Privacy Group within EURECOM’s Digital Security Department. You will work under the supervision of Profs. Nicholas Evans and Massimiliano Todisco and in collaboration with SpeechPrivacy partners, the Laboratoire Informatique d'Avignon (LIA) and the Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA).
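To give candidates a feel for the worker/co-worker idea described above, here is a minimal PyTorch sketch of one possible reading of it. It is an illustration under stated assumptions, not the project's design: the class names (`GradReverse`, `Worker`), the layer sizes, the use of fixed-size feature vectors instead of raw speech, the single binary attribute, the gradient-reversal trick and the zeroed "neutral" attribute code are all hypothetical choices made for this example.

```python
import torch
from torch import nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient is returned for the (non-tensor) scale argument.
        return -ctx.scale * grad_output, None


class Worker(nn.Module):
    """Hypothetical worker for one attribute, paired with one adversarial co-worker."""

    def __init__(self, feat_dim=40, code_dim=8, n_classes=2):
        super().__init__()
        self.attr_enc = nn.Linear(feat_dim, code_dim)    # attribute representation
        self.resid_enc = nn.Linear(feat_dim, code_dim)   # everything else
        self.decoder = nn.Linear(2 * code_dim, feat_dim)
        self.attr_clf = nn.Linear(code_dim, n_classes)   # attribute is predictable from its own code
        self.adversary = nn.Linear(code_dim, n_classes)  # co-worker probing the residual code

    def forward(self, x, grl_scale=1.0):
        attr = self.attr_enc(x)
        resid = self.resid_enc(x)
        recon = self.decoder(torch.cat([attr, resid], dim=-1))
        attr_logits = self.attr_clf(attr)
        # The adversary learns to predict the attribute from the residual code,
        # while the reversed gradient pushes the residual encoder to hide it.
        adv_logits = self.adversary(GradReverse.apply(resid, grl_scale))
        return recon, attr_logits, adv_logits

    def sanitise(self, x):
        """Decode with the attribute code replaced by zeros (an assumed 'neutral' value)."""
        resid = self.resid_enc(x)
        return self.decoder(torch.cat([torch.zeros_like(resid), resid], dim=-1))


def training_step(model, optimiser, x, labels):
    """One joint update: reconstruction + attribute classification + adversarial term."""
    recon, attr_logits, adv_logits = model(x)
    loss = (
        nn.functional.mse_loss(recon, x)
        + nn.functional.cross_entropy(attr_logits, labels)
        + nn.functional.cross_entropy(adv_logits, labels)
    )
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = Worker()
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
    features = torch.randn(16, 40)         # stand-in for speech features
    labels = torch.randint(0, 2, (16,))    # stand-in for a binary attribute label
    training_step(model, optimiser, features, labels)
    print(model.sanitise(features).shape)
```

A bank of workers in the sense of the description would hold one such module per attribute (or set of attributes), with the user's selection deciding which attribute codes are neutralised at synthesis time. How the workers are combined, and how well such a scheme disentangles real speech attributes, are exactly the open questions the thesis is meant to study.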

[1] Tomashenko, N., Srivastava, B.M.L., Wang, X., Vincent, E., Nautsch, A., Yamagishi, J., Evans, N., Patino, J., Bonastre, J.-F., Noé, P.-G., Todisco, M. (2020), “Introducing the VoicePrivacy Initiative”, in Proc. Interspeech 2020, 1693-1697, available from https://www.isca-archive.org/interspeech_2020/tomashenko20_interspeech…

Requirements

  • Education level / degree: Master’s degree
  • Field / specialty: Computer Science, Artificial Intelligence, Speech Processing
  • Technologies / languages / systems: machine learning, deep learning, Python and PyTorch
  • Other skills / specialties: strong mathematical, analytical, problem-solving, communication and writing skills
  • Other important elements: an excellent academic track record, proficiency in English

Application

The application must include:

  • A detailed curriculum vitae,
  • A motivation letter of two pages which also presents your research and education perspectives,
  • The names and addresses of three references.

Applications should be submitted by e-mail to secretariat@eurecom.fr with the reference: SN/NE/PhD/SPPR/072024

Start date: Sept./Oct. 2024
Duration: Duration of the thesis

 
