Nebula: Efficient, Private and Accurate Histogram Estimation
Abstract: We present \textit{Nebula}, a system for differentially private histogram estimation on data distributed among clients. \textit{Nebula} allows clients to independently decide whether to participate in the system, and locally encode their data so that an untrusted server only learns data values whose multiplicity exceeds a predefined aggregation threshold, with $(\varepsilon,\delta)$ differential privacy guarantees. Compared to existing systems, \textit{Nebula} uniquely achieves: \textit{i)} a strict upper bound on client privacy leakage; \textit{ii)} significantly higher utility than standard local differential privacy systems; and \textit{iii)} no requirement for trusted third-parties, multi-party computation, or trusted hardware. We provide a formal evaluation of \textit{Nebula}'s privacy, utility and efficiency guarantees, along with an empirical assessment on three real-world datasets. On the United States Census dataset, clients can submit their data in just 0.0036 seconds and 0.0016 MB (\textbf{efficient}), under strong $(\varepsilon=1,\delta=10^{-8})$ differential privacy guarantees (\textbf{private}), enabling \textit{Nebula}'s untrusted aggregation server to estimate histograms with over 88\% better utility than existing local differential privacy deployments (\textbf{accurate}). Additionally, we describe a variant that allows clients to submit multi-dimensional data, with similar privacy, utility, and performance. Finally, we provide an implementation of \textit{Nebula}.
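To make the two ideas named in the abstract concrete (clients independently choosing whether to participate, and the server learning only values whose multiplicity exceeds a threshold), the following is a minimal plain-Python simulation. It is an illustrative sketch, not the Nebula protocol: it omits the client-side encoding that enforces the threshold against an untrusted server, it makes no differential privacy claim on its own, and the function name `thresholded_histogram`, the parameter `participation_prob`, and the strict `>` comparison are assumptions introduced here.

```python
import random
from collections import Counter


def thresholded_histogram(values, participation_prob, threshold, rng):
    """Toy simulation: sample clients, then keep only counts above a threshold.

    Hypothetical helper for illustration; not part of any Nebula API.
    """
    # Each client independently decides whether to participate.
    submitted = [v for v in values if rng.random() < participation_prob]
    counts = Counter(submitted)
    # Keep only values whose multiplicity exceeds the aggregation threshold.
    # (Treating "exceeds" as a strict comparison is an assumption.)
    return {v: c for v, c in counts.items() if c > threshold}


# Made-up client data: one common value, one moderately common, one rare.
data = ["a"] * 500 + ["b"] * 40 + ["c"] * 3

rng = random.Random(0)  # fixed seed so the run is repeatable
hist = thresholded_histogram(data, participation_prob=0.5, threshold=10, rng=rng)

# With every client participating, the result is fully determined.
full = thresholded_histogram(data, participation_prob=1.0, threshold=10, rng=rng)
```

In this toy run the rare value `"c"` can never appear in the output, because at most three clients hold it. In the real system the abstract states that this property is obtained through local encoding rather than by trusting the server to filter.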
Submission history
From: Ali Shahin Shamsabadi
[v1] Sun, 15 Sep 2024 09:55:18 UTC (245 KB)
[v2] Thu, 3 Jul 2025 14:13:44 UTC (129 KB)