Differential Privacy in Practice: Expose your Epsilons!

Differential Privacy Made Easy

2022

Data privacy has been a major issue for decades; several techniques have been developed to protect individuals' privacy, yet the world has still seen privacy failures. In 2006, Cynthia Dwork introduced the idea of Differential Privacy, which gave strong theoretical guarantees for data privacy. Many companies and research institutes have developed differential privacy libraries, but in order to obtain differentially private results, users have to tune the privacy parameters. In this paper, we minimize these tunable parameters. We develop a DP-framework which compares the differentially private results of three Python-based DP libraries. We also introduce a new, very simple DP library (GRAM-DP), so that people with no background in differential privacy can still secure the privacy of the individuals in a dataset while releasing statistical results to the public.
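The privacy parameter ε that the abstract says users must tune can be illustrated with the standard Laplace mechanism (a generic textbook sketch, not GRAM-DP's actual API; the function names here are hypothetical):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(data, epsilon: float) -> float:
    # A counting query has sensitivity 1: adding or removing one
    # individual changes the true count by at most 1, so Laplace
    # noise with scale sensitivity/epsilon gives epsilon-DP.
    sensitivity = 1.0
    return len(data) + laplace_noise(sensitivity / epsilon)

ages = [23, 45, 31, 62, 38]
# Smaller epsilon -> more noise -> stronger privacy, lower utility.
print(dp_count(ages, epsilon=0.1))   # very noisy count
print(dp_count(ages, epsilon=10.0))  # close to the true count of 5
```

This is the tuning burden the paper targets: the analyst must pick ε per query, with no universally right value.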

Differential Privacy and Machine Learning: a Survey and Review

The objective of machine learning is to extract useful information from data, while privacy is preserved by concealing information. Thus it seems hard to reconcile these competing interests. However, they frequently must be balanced when mining sensitive data. For example, medical research represents an important application where it is necessary both to extract useful information and protect patient privacy. One way to resolve the conflict is to extract general characteristics of whole populations without disclosing the private information of individuals. In this paper, we consider differential privacy, one of the most popular and powerful definitions of privacy. We explore the interplay between machine learning and differential privacy, namely privacy-preserving machine learning algorithms and learning-based data release mechanisms. We also describe some theoretical results that address what can be learned differentially privately and upper bounds of loss functions for differentia...

Differential Privacy: What is all the noise about?

arXiv (Cornell University), 2022

Differential Privacy (DP) is a formal definition of privacy that provides rigorous guarantees against risks of privacy breaches during data processing. It makes no assumptions about the knowledge or computational power of adversaries, and provides an interpretable, quantifiable and composable formalism. DP has been actively researched during the last 15 years, but it is still hard to master for many Machine Learning (ML) practitioners. This paper aims to provide an overview of the most important ideas, concepts and uses of DP in ML, with special focus on its intersection with Federated Learning (FL).
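The composability the abstract mentions can be made concrete with basic sequential composition: running k mechanisms with budgets ε₁, …, εₖ on the same data is (ε₁ + ⋯ + εₖ)-DP. A minimal budget-accounting sketch (hypothetical class and method names, not drawn from any particular library):

```python
class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        # Sequential composition: the total privacy loss is the sum
        # of the epsilons of all queries answered so far.
        if self.spent + epsilon > self.total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

budget = PrivacyBudget(total_epsilon=1.0)
budget.charge(0.4)  # first query
budget.charge(0.4)  # second query; 0.8 of 1.0 now spent
# A third charge(0.4) would raise, since 1.2 > 1.0.
```

Tighter accountants (advanced composition, Rényi DP) give smaller totals, but the additive bound above is the baseline guarantee.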

Diffprivlib: The IBM Differential Privacy Library

2019

Since its conception in 2006, differential privacy has emerged as the de-facto standard in data privacy, owing to its robust mathematical guarantees, generalised applicability and rich body of literature. Over the years, researchers have studied differential privacy and its applicability to an ever-widening field of topics. Mechanisms have been created to optimise the process of achieving differential privacy, for various data types and scenarios. Until this work, however, all previous work on differential privacy had been conducted on an ad-hoc basis, without a single, unifying codebase to implement results. In this work, we present the IBM Differential Privacy Library, a general-purpose, open-source library for investigating, experimenting with and developing differential privacy applications in the Python programming language. The library includes a host of mechanisms, the building blocks of differential privacy, alongside a number of applications to machine learning and other data anal...

OpenDP : An Open-Source Suite of Differential Privacy Tools

2019

Project Goal: We propose to lead a community effort to build a system of tools for enabling privacy-protective analysis of sensitive personal data, focused on an open-source library of algorithms for generating differentially private statistical releases. We aim for this platform, OpenDP, to become the standard body of trusted and open-source implementations of differentially private algorithms for statistical analysis and machine learning on sensitive data, and a pathway that rapidly brings the newest algorithmic developments to a wide array of practitioners.

Differential Privacy

Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - POPL '15, 2015

Differential privacy provides a way to get useful information about sensitive data without revealing much about any one individual. It enjoys many nice compositionality properties not shared by other approaches to privacy, including, in particular, robustness against side-knowledge. Designing differentially private mechanisms from scratch can be a challenging task. One way to make it easier to construct new differentially private mechanisms is to design a system which allows more complex mechanisms (programs) to be built from differentially private building blocks in a principled way, so that the resulting programs are guaranteed to be differentially private by construction. This paper is about a new accounting principle for building differentially private programs. It is based on a simple generalisation of classic differential privacy which we call Personalised Differential Privacy (PDP). In PDP, each individual has their own personal privacy level. We describe ProPer, an interactive system for implementing PDP which maintains a privacy budget for each individual. When a primitive query is made on data derived from individuals, the provenance of the involved records determines how the privacy budget of an individual is affected: the number of records derived from Alice determines the multiplier for the privacy decrease in Alice's budget. This offers some advantages over previous systems; in particular, its fine-grained character allows better utilisation of the privacy budget than mechanisms based purely on the concept of global sensitivity, and it applies naturally to the case of a live database where new individuals are added over time. We provide a formal model of the ProPer approach, prove that it provides personalised differential privacy, and describe a prototype implementation based on McSherry's PINQ system.
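The per-individual accounting described in this abstract can be sketched as follows. This is a simplified illustration of the stated multiplier rule — an individual contributing k records to a query is charged k·ε against their personal budget — and uses hypothetical names, not ProPer's actual interface:

```python
from collections import Counter

class PersonalBudgets:
    """One privacy budget per individual, as in Personalised DP."""

    def __init__(self, default_budget: float):
        self.remaining = {}
        self.default = default_budget

    def charge_query(self, provenance, epsilon: float) -> None:
        # provenance: one individual id per record the query touched.
        # An individual appearing k times is charged k * epsilon,
        # following the multiplier rule in the abstract above.
        for person, k in Counter(provenance).items():
            rem = self.remaining.get(person, self.default)
            cost = k * epsilon
            if cost > rem:
                raise RuntimeError(f"budget exhausted for {person}")
            self.remaining[person] = rem - cost

budgets = PersonalBudgets(default_budget=1.0)
# Alice contributed two records to this query, Bob one.
budgets.charge_query(["alice", "alice", "bob"], epsilon=0.2)
print(budgets.remaining)  # alice charged 0.4, bob charged 0.2
```

Individuals who contributed no records to a query are not charged at all, which is the fine-grained advantage over global-sensitivity accounting the abstract points to.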

Differential privacy under fire

2011

Anonymizing private data before release is not enough to reliably protect privacy, as Netflix and AOL have learned to their cost. Recent research on differential privacy opens a way to obtain robust, provable privacy guarantees, and systems like PINQ and Airavat now offer convenient frameworks for processing arbitrary user-specified queries in a differentially private way. However, these systems are vulnerable to a variety of covert-channel attacks that can be exploited by an adversarial querier. We describe several different kinds of attacks, all feasible in PINQ and some in Airavat. We discuss the space of possible countermeasures, and we present a detailed design for one specific solution, based on a new primitive we call predictable transactions and a simple differentially private programming language. Our evaluation, which relies on a proof-of-concept implementation based on the Caml Light runtime, shows that our design is effective against remotely exploitable covert channels, at the expense of a higher query completion time.

Differential Privacy in Privacy-Preserving Big Data and Learning: Challenge and Opportunity

SVCC 2021: Silicon Valley Cybersecurity Conference, 2021

Differential privacy (DP) has become the de facto standard of privacy preservation due to its strong protection and sound mathematical foundation, and it is widely adopted in applications such as big data analysis, graph data processing, machine learning, deep learning, and federated learning. Although DP has become an active and influential area, it is not the best remedy for all privacy problems in all scenarios. Moreover, there are also misunderstandings, misuses, and great challenges of DP in specific applications. In this paper, we point out a series of limits and open challenges in the corresponding research areas. Besides, we offer potentially new insights and avenues for combining differential privacy with other effective dimension-reduction techniques and secure multiparty computation to clearly define various privacy models.

Extending the Foundations of Differential Privacy: Robustness and Flexibility

2019

In this work, we make foundational contributions to the area of Differential Privacy (DP). Our first contribution is definitional. We define two complementary concepts that greatly enhance the applicability of DP, namely, robust privacy and flexible accuracy. Robust privacy requires that a mechanism provides the “best possible privacy” without further degrading accuracy guarantees, even if such privacy is not a priori anticipated based on input neighborhoods alone. Flexible accuracy allows small distortions in the input (e.g., dropping outliers) before measuring accuracy of the output. Along the way, we also extend the notion of DP to sampling (i.e., computation of randomized functions). Our second contribution is in establishing versatile composition theorems that relate these notions. Our third contribution is constructive: We present mechanisms that can help in achieving these notions, where previously no meaningful differentially private mechanisms were possible. In particular, ...

Explode: An Extensible Platform for Differentially Private Data Analysis

2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)

Differential privacy (DP) has emerged as a popular standard for privacy protection and received great attention from the research community. However, practitioners often find DP cumbersome to implement, since it requires additional protocols (e.g., for randomized response, noise addition) and changes to existing database systems. To avoid these issues we introduce Explode, a platform for differentially private data analysis. The power of Explode comes from its ease of deployment and use: The data owner can install Explode on top of an SQL server, without modifying any existing components. Explode then hosts a web application that allows users to conveniently perform many popular data analysis tasks through a graphical user interface, e.g., issuing statistical queries, classification, correlation analysis. Explode automatically converts these tasks to collections of SQL queries, and uses the techniques in [3] to determine the right amount of noise that should be added to satisfy DP while producing high utility outputs. This paper describes the current implementation of Explode, together with potential improvements and extensions.
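Randomized response, one of the protocols this abstract mentions, can be sketched in a few lines. This is the generic textbook version, unrelated to Explode's SQL-based implementation; all names are illustrative:

```python
import math
import random

def randomized_response(true_answer: bool, epsilon: float) -> bool:
    # Report the truth with probability e^eps / (e^eps + 1),
    # otherwise flip the answer; this is epsilon-DP for one bit.
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_answer if random.random() < p_truth else not true_answer

def debiased_proportion(reports, epsilon: float) -> float:
    # P(report yes) = p * pi + (1 - p) * (1 - pi), so invert the
    # known flipping probability to estimate the true yes-fraction pi.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

random.seed(0)
truth = [True] * 300 + [False] * 700  # true yes-rate: 0.30
reports = [randomized_response(t, epsilon=1.0) for t in truth]
print(round(debiased_proportion(reports, epsilon=1.0), 2))
```

Each respondent gets plausible deniability for their individual answer, while the aggregate estimate stays close to the true proportion for large samples.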