
Collecting Speech and Telemetry Data Privately with Ali Shahin Shamsabadi

Key Details:

Time: 11.00 – 12.00

Date: Friday, 15 November

Location: Online (MS Teams)

To request a link, e-mail ix-contact@imperial.ac.uk

Registration is now closed

Speaker

Ali Shahin Shamsabadi

Ali Shahin Shamsabadi is a privacy researcher at Brave. Before joining Brave, Ali was a Research Associate at The Alan Turing Institute, and a Postdoctoral Fellow at Vector Institute. His current research interests focus on i) identifying and mitigating failure modes of AI systems; and ii) building confidential and reliable auditing frameworks. His research has been published at top-tier conferences including NeurIPS, ICLR, CVPR, CCS, USENIX Security and PETs. He recently designed and launched Nebula, a novel system for protecting users in product analytics with formal differential privacy guarantees.

Talk Title

Collecting Speech and Telemetry Data Privately

Talk Summary

The advancement of AI-driven services relies heavily on extensive data derived from our daily interactions. Specifically, speech data is collected by service providers and third-party contractors to process user queries and to train voice-based systems using real and diverse datasets. Similarly, collecting telemetry data is crucial for developers to analyze how users interact with features, enabling the optimization of web experiences.

This talk addresses three critical questions.

Q1: Why should we care about privacy? 

I will discuss the privacy risks associated with sharing speech and telemetry data.

Q2: How can we define privacy? 

I will define differential privacy as the gold standard for privacy guarantees.
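For reference, the standard textbook statement of ε-differential privacy is sketched below. This is the usual formulation rather than anything taken from the talk itself, and the symbols (the mechanism M, the datasets D and D', the output set S) are notation assumed here for illustration. A randomized mechanism M is ε-differentially private if, for every pair of datasets D and D' that differ in one individual's data and every set of outputs S:

```latex
\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
\]
```

A smaller ε means the output distribution changes less when any one person's data is added or removed, so the output reveals less about that person.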

Q3: How does differential privacy help? 

I will present two systems:

i) Differentially Private Speaker Anonymization: enables users to share their speech data with service providers for both inference and training purposes while concealing their identity.

ii) Differentially Private Histogram Estimation of Telemetry Data: allows for gaining valuable insights into product usage and user feedback from a population without revealing any individual user choices or behaviors.
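To make the histogram idea concrete, here is a minimal sketch of a differentially private histogram using the textbook Laplace mechanism. It is not the system presented in the talk, whose design is not described on this page. The function names (`laplace_noise`, `dp_histogram`) and the example data are hypothetical. The sketch assumes that each user contributes exactly one value and that neighboring datasets differ by adding or removing one user, so each count has sensitivity 1.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_histogram(choices, categories, epsilon, rng=None):
    """Return noisy per-category counts (illustrative sketch only).

    Assumptions: one value per user, add/remove-one-user neighbors,
    and a fixed, publicly known list of categories.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    counts = Counter(choices)
    scale = 1.0 / epsilon  # sensitivity 1 divided by epsilon
    return {c: counts.get(c, 0) + laplace_noise(scale, rng) for c in categories}


if __name__ == "__main__":
    # Hypothetical telemetry: which feature each user clicked first.
    reports = ["search"] * 40 + ["bookmarks"] * 25 + ["settings"] * 5
    noisy = dp_histogram(reports, ["search", "bookmarks", "settings"], epsilon=1.0)
    for category, value in noisy.items():
        print(f"{category}: {value:.1f}")
```

Noise is added to every listed category, including empty ones, so that the set of reported categories does not itself leak information. Floating-point Laplace sampling has known weaknesses, so this sketch is for illustration and not for production use.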
