Sex conversion in speech involves privacy risks from data collection and often leaves residual sex-specific cues in outputs, even when target speaker references are unavailable. We introduce RASO (Reference-free Adversarial Sex Obfuscation). Its innovations include a sex-conditional adversarial learning framework that disentangles linguistic content from sex-related acoustic markers, and explicit regularisation that aligns fundamental frequency distributions and formant trajectories with sex-neutral characteristics learned from sex-balanced training data. RASO preserves linguistic content and, even when assessed under a semi-informed attack model, significantly outperforms a competing approach to sex obfuscation.
Reference-free adversarial sex obfuscation in speech
APSIPA 2025, 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 22-24 October 2025, Shangri-la, Singapore
Type: Conference
City: Shangri-la
Date: 2025-10-22
Department: Digital Security
Eurecom Ref: 8317
Copyright: © 2025 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Permalink: https://www.eurecom.fr/publication/8317