Generative Models for Effective ML on Private, Decentralized Datasets

Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

Keywords: federated learning, GANs, generative models, privacy, security

Wed Session 1 (05:00-07:00 GMT)
Wed Session 4 (17:00-19:00 GMT)

Abstract: To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data—of representative samples, of outliers, of misclassifications—is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-provided labels. However, manual data inspection is risky for privacy-sensitive datasets, such as those representing the behavior of real-world individuals. Furthermore, manual data inspection is impossible in the increasingly important setting of federated learning, where raw examples are stored at the edge and the modeler may only access aggregated outputs such as metrics or model parameters. This paper demonstrates that generative models—trained using federated methods and with formal differential privacy guarantees—can be used effectively to debug data issues even when the data cannot be directly inspected. We explore these methods in applications to text with differentially private federated RNNs and to images using a novel algorithm for differentially private federated GANs.
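The core mechanism behind the paper's privacy guarantees is user-level differentially private federated averaging: each client's model update is norm-clipped to bound its influence, updates are averaged, and calibrated Gaussian noise is added to the aggregate. The sketch below is a minimal illustration of that aggregation step, not the authors' implementation; the function name, hyperparameter values, and the use of numpy arrays as stand-ins for generator (RNN or GAN) update vectors are all assumptions made for clarity.

```python
# Minimal sketch of DP-FedAvg-style aggregation for one training round.
# Client "updates" are illustrative stand-ins for the model deltas a
# federated generative model would report after local training.
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.1,
                    rng=None):
    """Aggregate one round of client model deltas with differential privacy.

    1. Clip each client's update to L2 norm <= clip_norm, bounding any
       single user's influence on the aggregate.
    2. Average the clipped updates.
    3. Add Gaussian noise scaled to clip_norm and the noise multiplier;
       composed across rounds (e.g., with a moments accountant), this
       yields a user-level differential privacy guarantee.
    """
    rng = rng or np.random.default_rng()
    clipped = []
    for delta in client_updates:
        norm = np.linalg.norm(delta)
        scale = min(1.0, clip_norm / (norm + 1e-12))
        clipped.append(delta * scale)
    mean_update = np.mean(clipped, axis=0)
    # Noise on the mean estimator: stddev = z * S / n for n clients.
    stddev = noise_multiplier * clip_norm / len(client_updates)
    noise = rng.normal(0.0, stddev, size=mean_update.shape)
    return mean_update + noise

# Example usage: 100 simulated clients, each reporting a 10-dim update.
rng = np.random.default_rng(0)
updates = [rng.normal(size=10) for _ in range(100)]
noisy_avg = dp_fedavg_round(updates, rng=rng)
print(noisy_avg.shape)  # (10,)
```

In the paper's setting this noisy aggregate would update a generative model (an RNN for text, or the generator in a federated GAN) on the server, so that samples later drawn from the model for debugging inherit the differential privacy guarantee of the training procedure.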
