Oblivious Multi-Party Machine Learning on Trusted Processors

Privacy-Preserving Inference in Machine Learning Services Using Trusted Execution Environments

2019

This work presents Origami, which provides privacy-preserving inference for large deep neural network (DNN) models by combining enclave execution and cryptographic blinding with accelerator-based computation. Origami splits the ML model into multiple partitions. The first partition receives the encrypted user input within an SGX enclave. The enclave decrypts the input and then applies cryptographic blinding, a technique that adds noise to obfuscate data, to both the input data and the model parameters. Origami then sends the obfuscated data to an untrusted GPU/CPU for computation. The blinding and de-blinding factors are kept private by the SGX enclave, preventing any adversary from denoising the data while the computation is offloaded to the GPU/CPU. The computed output is returned to the enclave, which recovers the true result from the computation on noisy data using the unblinding factors stored privately within SGX. This process may be repeated for each...
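
As an illustration of the blinding step described above, here is a minimal NumPy sketch for a single linear layer. It assumes, for simplicity, that only the input is blinded and that the unblinding offset W @ r is precomputed inside the enclave; Origami's full scheme also blinds the model parameters, so this is a sketch of the idea, not the paper's actual protocol.

```python
import numpy as np

rng = np.random.default_rng()

def enclave_blind(x, W):
    """Inside the enclave: add a random blinding factor to the input."""
    r = rng.standard_normal(x.shape)   # blinding factor, kept private
    unblind_offset = W @ r             # precomputed inside the enclave
    return x + r, unblind_offset

def untrusted_gpu_compute(W, blinded_x):
    """On the untrusted GPU/CPU: a plain matrix product on blinded data."""
    return W @ blinded_x

def enclave_unblind(blinded_y, unblind_offset):
    """Back inside the enclave: remove the blinding from the result."""
    return blinded_y - unblind_offset

W = rng.standard_normal((4, 8))        # layer weights
x = rng.standard_normal(8)             # decrypted user input
blinded_x, offset = enclave_blind(x, W)
y = enclave_unblind(untrusted_gpu_compute(W, blinded_x), offset)
assert np.allclose(y, W @ x)           # W(x + r) - W r == W x
```

The untrusted accelerator only ever sees x + r, which is statistically masked by the private blinding factor r, yet the linearity of the layer lets the enclave recover the exact result.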

Confidential Training and Inference using Secure Multi-Party Computation on Vertically Partitioned Dataset

Scalable Computing: Practice and Experience

Digitalization across all spheres of life has given rise to issues like data ownership and privacy. Privacy-Preserving Machine Learning (PPML), an active area of research, aims to preserve privacy for machine learning (ML) stakeholders such as data owners, ML model owners, and inference users. The paper, CoTraIn-VPD, proposes private ML inference and training of models for vertically partitioned datasets using Secure Multi-Party Computation (SMPC) and Differential Privacy (DP) techniques. The proposed approach addresses complications linked with the privacy of the various ML stakeholders dealing with vertically partitioned datasets. The technique is implemented in Python using open-source libraries such as SyMPC (SMPC functions), PyDP (DP aggregations), and CrypTen (secure and private training). The paper uses information privacy measures, including mutual information and KL-divergence, across different privacy budgets to empirically demonstrate privacy preservation with high ML accuracy and...
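
To make the combination of SMPC and DP concrete, the following toy sketch mimics both ingredients with plain NumPy rather than the SyMPC/CrypTen/PyDP libraries the paper actually uses; the feature split, weights, mask, sensitivity, and epsilon values are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng()

# Vertical partitioning: party A holds the first features, party B the rest.
x_a, x_b = np.array([1.0, 2.0]), np.array([3.0, 4.0])
w = np.array([0.5, -0.5, 0.25, 0.25])   # jointly trained model weights

# Each party shares only a masked partial dot product, never its raw data.
partial_a = x_a @ w[:2]
partial_b = x_b @ w[2:]
mask = rng.standard_normal()             # one-time additive mask
msg_a = partial_a + mask                 # what party A reveals
msg_b = partial_b - mask                 # what party B reveals
score = msg_a + msg_b                    # masks cancel; equals x @ w

# DP release: Laplace noise calibrated to an assumed sensitivity of 1.0.
epsilon = 1.0
noisy_score = score + rng.laplace(scale=1.0 / epsilon)
print(score, noisy_score)
```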

Confidential machine learning on untrusted platforms: a survey

Cybersecurity, 2021

With the ever-growing data and the need for developing powerful machine learning models, data owners increasingly depend on various untrusted platforms (e.g., public clouds, edges, and machine learning service providers) for scalable processing or collaborative learning. Thus, sensitive data and models are in danger of unauthorized access, misuse, and privacy compromises. A relatively new body of research confidentially trains machine learning models on protected data to address these concerns. In this survey, we summarize notable studies in this emerging area of research. With a unified framework, we highlight the critical challenges and innovations in outsourcing machine learning confidentially. We focus on the cryptographic approaches for confidential machine learning (CML), primarily on model training, while also covering other directions such as perturbation-based approaches and CML in the hardware-assisted computing environment. The discussion will take a holistic approach to consi...

Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning

Proceedings 2020 Network and Distributed System Security Symposium, 2020

Machine learning has started to be deployed in fields such as healthcare and finance, which involve handling large amounts of sensitive data. This has propelled the need for, and growth of, privacy-preserving machine learning (PPML). We propose an actively secure four-party protocol (4PC) and a framework for PPML, showcasing its applications on four of the most widely known machine learning algorithms: Linear Regression, Logistic Regression, Neural Networks, and Convolutional Neural Networks.

FLASH: Fast and Robust Framework for Privacy-preserving Machine Learning

Proceedings on Privacy Enhancing Technologies, 2020

Privacy-preserving machine learning (PPML) via Secure Multi-party Computation (MPC) has gained momentum in the recent past. Assuming a minimal network of pair-wise private channels, we propose FLASH, an efficient four-party PPML framework over rings ℤ_{2^ℓ}, the first of its kind in the PPML regime to achieve the strongest security notion of Guaranteed Output Delivery (all parties obtain the output irrespective of the adversary's behaviour). State-of-the-art ML frameworks such as ABY3 by Mohassel et al. (ACM CCS'18) and SecureNN by Wagh et al. (PETS'19) operate in the setting of 3 parties with one malicious corruption but achieve only the weaker security guarantee of abort. We demonstrate PPML with real-time efficiency using the following custom-made tools that overcome the limitations of the aforementioned state of the art: (a) a dot product whose cost is independent of the vector size, unlike the state-of-the-art ABY3, SecureNN, and ASTRA by Chaudhari et al. (ACM CCSW'19), all of wh...
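
A toy sketch of additive secret sharing over the ring ℤ_{2^ℓ} (ℓ = 64) may help illustrate why a dot-product reconstruction can cost a single ring element per party regardless of vector length. FLASH's actual maliciously secure 4PC protocol is far more involved; the public weight vector below is a simplifying assumption to keep the dot product local.

```python
import secrets

ELL = 64
MASK = (1 << ELL) - 1                  # reduction modulo 2^ell

def share(x, n_parties=4):
    """Split integer x into n additive shares over Z_{2^ell}."""
    shares = [secrets.randbits(ELL) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) & MASK)
    return shares

def reconstruct(shares):
    return sum(shares) & MASK

# Each party holds a share vector of the secret input x; the weight vector w
# is public here. Every party computes a *local* dot product of its share
# vector with w, and only the single resulting ring element is exchanged.
x = [3, 1, 4, 1, 5]
w = [2, 7, 1, 8, 2]
share_vectors = list(zip(*(share(xi) for xi in x)))   # one vector per party
partials = [sum(si * wi for si, wi in zip(sv, w)) & MASK
            for sv in share_vectors]
assert reconstruct(partials) == sum(xi * wi for xi, wi in zip(x, w)) & MASK
```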

Efficient Secure Building Blocks With Application to Privacy Preserving Machine Learning Algorithms

IEEE Access, 2021

Nowadays, different entities (such as hospitals, cyber security companies, banks, etc.) collect data of the same nature but often with different statistical properties. It has been shown that if these entities combine their privately collected datasets to train a machine learning model, they end up with a trained model that often outperforms the human experts of the corresponding field(s) in terms of classification accuracy. However, due to legal, privacy, and cost reasons, no entity is willing to share its data with others. The same problem arises during the classification (inference) stage: the user does not want to reveal any information about their query or its final classification, while the owner of the trained model wants to keep the model private. In this article we overcome these drawbacks by first introducing novel, efficient, general-purpose secure building blocks, which can also be used to build privacy-preserving machine learning algorithms for both training and classification (inference) purposes under strict privacy and security requirements. Our theoretical analysis and experimental results show that our building blocks (and hence also the privacy-preserving algorithms built on top of them) are more efficient than most (if not all) state-of-the-art schemes in terms of computation and communication cost, as well as security characteristics, in the semi-honest model. Furthermore, and to the best of our knowledge, for the Naïve Bayes model we extend this efficiency for the first time to also deal with actively malicious users, who arbitrarily deviate from the protocol.
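
For a flavor of what a general-purpose secure building block looks like, the sketch below implements the classic two-party Beaver-triple multiplication over a ring. This is a standard textbook construction in the semi-honest model, shown here for orientation only; it is not one of the paper's own building blocks.

```python
import secrets

MOD = 1 << 32

def share(v):
    """Split v into two additive shares over Z_{2^32}."""
    r = secrets.randbelow(MOD)
    return r, (v - r) % MOD

# Offline phase: a dealer prepares a multiplication triple c = a * b.
a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
c = (a * b) % MOD
a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

# Online phase: parties hold shares of the secrets x and y.
x, y = 12345, 678
x0, x1 = share(x); y0, y1 = share(y)

# Parties open d = x - a and e = y - b; these reveal nothing about x, y
# because a and b are uniformly random one-time masks.
d = (x0 - a0 + x1 - a1) % MOD
e = (y0 - b0 + y1 - b1) % MOD

# Local share computation; party 0 alone adds the public term d * e.
z0 = (d * e + d * b0 + e * a0 + c0) % MOD
z1 = (d * b1 + e * a1 + c1) % MOD
assert (z0 + z1) % MOD == (x * y) % MOD   # shares of the product x * y
```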

New Directions in Efficient Privacy-Preserving Machine Learning

2020

Applications of machine learning have become increasingly common in recent years. For instance, navigation systems like Google Maps use machine learning to better predict traffic patterns; Facebook, LinkedIn, and other social media platforms use machine learning to customize users' news feeds. Central to all these systems is user data. However, the sensitive nature of the collected data has also led to a number of privacy concerns. Privacy-preserving machine learning enables systems that can

Confidential Machine Learning Computation in Untrusted Environments: A Systems Security Perspective

IEEE Access, 2021

As machine learning (ML) technologies and applications are rapidly changing many domains of computing, security issues associated with ML are also emerging. In the domain of systems security, many endeavors have been made to ensure ML model and data confidentiality. ML computations are often inevitably performed in untrusted environments and entail complex multi-party security requirements. Hence, researchers have leveraged Trusted Execution Environments (TEEs) to build confidential ML computation systems. This paper conducts a systematic and comprehensive survey by classifying attack vectors and mitigations in TEE-protected confidential ML computation in untrusted environments, analyzes the multi-party ML security requirements, discusses related engineering challenges, and suggests future research directions.

Privacy-Preserving Distributed Support Vector Machines

Heterogeneous Data Management, Polystores, and Analytics for Healthcare, 2021

Federated machine learning is a promising paradigm allowing organizations to collaborate toward the training of a joint model without the need to explicitly share sensitive or business-critical datasets. Previous works demonstrated that this paradigm alone is not sufficient to preserve the confidentiality of the training data, even from honest participants. In this work, we extend a well-known framework for training sparse Support Vector Machines in a distributed setting while preserving data confidentiality by means of a novel non-interactive secure multiparty computation engine. We formally demonstrate the security properties of the engine and, by means of extensive empirical evaluation, assess the performance of the extended framework in terms of both accuracy and execution time.

PRICURE: Privacy-Preserving Collaborative Inference in a Multi-Party Setting

2021

When multiple parties that deal with private data aim for a collaborative prediction task such as medical image classification, they are often constrained by data protection regulations and lack of trust among collaborating parties. If done in a privacy-preserving manner, predictive analytics can benefit from the collective prediction capability of multiple parties holding complementary datasets on the same machine learning task. This paper presents PRICURE, a system that combines complementary strengths of secure multi-party computation (SMPC) and differential privacy (DP) to enable privacy-preserving collaborative prediction among multiple model owners. SMPC enables secret-sharing of private models and client inputs with non-colluding secure servers to compute predictions without leaking model parameters and inputs. DP masks true prediction results via noisy aggregation so as to deter a semi-honest client who may mount membership inference attacks. We evaluate PRICURE on neural ne...
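
The DP half of this design can be sketched in a few lines: each model owner contributes a one-hot class vote, and the aggregate is perturbed with Laplace noise before the client sees it. The SMPC secret-sharing layer is elided, and epsilon and the sensitivity bound are illustrative assumptions rather than PRICURE's actual configuration.

```python
import numpy as np

rng = np.random.default_rng()

def noisy_argmax(vote_vectors, epsilon=1.0):
    """Aggregate per-party class-vote vectors and add Laplace noise.

    Each party contributes one one-hot vote, so changing a single party's
    vote shifts the aggregate by at most 2 in L1 norm (sensitivity = 2).
    """
    totals = np.sum(vote_vectors, axis=0).astype(float)
    noise = rng.laplace(loc=0.0, scale=2.0 / epsilon, size=totals.shape)
    return int(np.argmax(totals + noise))

# Three model owners voting over four classes (one-hot votes).
votes = np.array([[0, 1, 0, 0],
                  [0, 1, 0, 0],
                  [1, 0, 0, 0]])
print(noisy_argmax(votes, epsilon=0.5))
```

Returning only a noisy winner, rather than exact per-model scores, is what blunts the membership inference attacks by a semi-honest client that the abstract mentions.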