With the advent of mobile and smart-home devices such as Amazon Alexa or the Google Assistant providing voice-based interfaces, voice data is commonly transferred to corresponding cloud services. This is necessary to quickly and accurately perform tasks like automatic speaker verification (ASV) and speech recognition (ASR) that heavily rely on machine learning. While enabling intriguing new applications, this development also poses significant risks: Voice data is highly sensitive since it contains biometric information of the speaker as well as the spoken words. Thus, the security and privacy of billions of end-users is at stake if voice data is not protected properly. When developing privacy-preserving solutions to mitigate such risks, it is also important to keep in mind that the involved machine learning models represent intellectual property of the service providers and therefore must not be revealed to users. The contribution of our work is three-fold: First, we present an efficient architecture for privacy-preserving ASV via outsourced secure two-party computation (STPC). Compared to existing solutions based on homomorphic encryption (HE), the verification process is 4,000x faster, while retaining a high verification accuracy and guaranteeing unlinkability, irreversibility, and renewability of stored biometric data. Since cryptographic secure computation protocols currently do not scale to more involved tasks like ASR, we then present VoiceGuard, an architecture that efficiently protects speech processing inside a trusted execution environment (TEE). We provide a proof-of-concept implementation and evaluate it on speech recognition tasks isolated with Intel SGX, a widely available TEE implementation, demonstrating even real time processing capabilities. Finally, we present Offline Model Guard (OMG) to enable privacypreserving speech processing on the predominant mobile computing platform ARM even in offline scenarios. Beyond relying on the Intel SGX equivalent ARM TrustZone, we employ the security architecture SANCTUARY (NDSS’19) for strict hardware-enforced isolation from all other system components. Our prototype implementation performs privacy-preserving keyword recognition using TensorFlow Lite in real time.
Privacy-preserving speech processing via STPC and TEEs
PPML 2019, Privacy Preserving Machine Learning Workshop, CCS 2019 Workshop, November 15, 2019, London, UK
PERMALINK : https://www.eurecom.fr/publication/6098