Simeon - Secure Federated Machine Learning Through Iterative Filtering
Related papers
Byzantine-Robust Federated Machine Learning through Adaptive Model Averaging
arXiv, 2019
Federated learning enables training collaborative machine learning models at scale with many participants whilst preserving the privacy of their datasets. Standard federated learning techniques are vulnerable to Byzantine failures, biased local datasets, and poisoning attacks. In this paper we introduce Adaptive Federated Averaging, a novel algorithm for robust federated learning that is designed to detect failures, attacks, and bad updates provided by participants in a collaborative model. We propose a Hidden Markov Model to model and learn the quality of model updates provided by each participant during training. In contrast to existing robust federated learning schemes, we propose a robust aggregation rule that detects and discards bad or malicious local model updates at each training iteration. This includes a mechanism that blocks unwanted participants, which also increases the computational and communication efficiency. Our experimental evaluation on 4 real datasets shows that ...
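The filtering loop the abstract describes (score each incoming update, maintain a per-client quality belief, drop or block low-quality contributors) can be sketched as follows. This is a minimal illustration in the spirit of the abstract, not the paper's Hidden Markov Model: the cosine-similarity test, the moving-average belief update, and both thresholds are assumptions.

```python
# Sketch of per-round filtering in the spirit of Adaptive Federated Averaging.
# The paper models update quality with a Hidden Markov Model; here a simple
# running reliability score per client stands in for that belief.
import numpy as np

def robust_round(updates, reliability, sim_threshold=0.0, block_threshold=0.2):
    """updates: dict client_id -> flat np.ndarray of model deltas.
    reliability: dict client_id -> score in [0, 1], carried across rounds."""
    # Reference direction: the median update is hard for a minority to poison.
    ref = np.median(np.stack(list(updates.values())), axis=0)
    accepted = []
    for cid, u in updates.items():
        cos = u @ ref / (np.linalg.norm(u) * np.linalg.norm(ref) + 1e-12)
        good = cos > sim_threshold
        # Exponential moving average stands in for the HMM belief update.
        reliability[cid] = 0.8 * reliability[cid] + 0.2 * float(good)
        # Clients whose reliability collapses are blocked in later rounds,
        # saving computation and communication, as the abstract describes.
        if good and reliability[cid] >= block_threshold:
            accepted.append(u)
    return (np.mean(accepted, axis=0) if accepted else ref), reliability
```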
Efficient Verifiable Protocol for Privacy-Preserving Aggregation in Federated Learning
IEEE Transactions on Information Forensics and Security
Federated learning has gained extensive interest in recent years owing to its ability to update model parameters without obtaining raw data from users, which makes it a viable privacy-preserving machine learning model for collaborative distributed learning among various devices. However, because adversaries can track and deduce private information about users from shared gradients, federated learning is vulnerable to numerous security and privacy threats. In this work, a communication-efficient protocol for secure aggregation of model parameters in a federated learning setting is proposed where training is done on user devices while the aggregated trained model could be constructed on the server side without revealing the raw data of users. The proposed protocol is robust against users' dropouts, and it enables each user to independently validate the aggregated result supplied by the server. The suggested protocol is secure in an honest-but-curious environment, and privacy is maintained even if the majority of parties are in collusion. A practical scenario for the proposed setting is discussed. Additionally, a simulation of the protocol is evaluated, and results demonstrate that it outperforms one of the state-of-the-art protocols, especially when the number of dropouts increases.
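The core trick behind secure aggregation protocols of this kind is that pairwise masks cancel in the sum, so the server learns only the aggregate. A minimal sketch of that cancellation, leaving out the paper's dropout handling and verifiability machinery entirely:

```python
# Minimal sketch of mask-cancelling secure aggregation: each pair of users
# shares a mask that one adds and the other subtracts, so the server's sum
# reveals only the aggregate. In practice masks are derived from shared
# seeds; here they are just random vectors.
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 5
x = rng.normal(size=(n, d))            # private model updates

masked = x.copy()
for i in range(n):
    for j in range(i + 1, n):
        m = rng.normal(size=d)         # pairwise mask m_ij = -m_ji
        masked[i] += m
        masked[j] -= m

# Each masked[i] looks random on its own, but the masks cancel in the sum.
assert np.allclose(masked.sum(axis=0), x.sum(axis=0))
```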
Shielding Federated Learning: Robust Aggregation with Adaptive Client Selection
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Federated learning (FL) enables multiple clients to collaboratively train an accurate global model while protecting clients' data privacy. However, FL is susceptible to Byzantine attacks from malicious participants. Although the problem has gained significant attention, existing defenses have several flaws: the server irrationally chooses malicious clients for aggregation even after they have been detected in previous rounds; the defenses perform ineffectively against sybil attacks or in the heterogeneous data setting. To overcome these issues, we propose MAB-RFL, a new method for robust aggregation in FL. By modelling the client selection as an extended multi-armed bandit (MAB) problem, we propose an adaptive client selection strategy to choose honest clients that are more likely to contribute high-quality updates. We then propose two approaches to identify malicious updates from sybil and non-sybil attacks, based on which rewards for each client selection decision can be accur...
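To make the bandit framing concrete, here is a generic UCB-style client selector. The paper's extended MAB formulation and its reward design for detecting sybil and non-sybil attacks are more involved, so everything below (the UCB rule, the exploration constant c) is an assumption:

```python
# Generic upper-confidence-bound client selection, loosely in the spirit of
# MAB-RFL's bandit view of choosing honest clients. Illustrative only.
import math

def select_clients(stats, t, k, c=1.0):
    """stats: dict cid -> (pulls, mean_reward); t: round index; k: picks."""
    def ucb(cid):
        pulls, mean = stats[cid]
        if pulls == 0:
            return float("inf")        # explore unseen clients first
        return mean + c * math.sqrt(math.log(t + 1) / pulls)
    return sorted(stats, key=ucb, reverse=True)[:k]

def update_stats(stats, cid, reward):
    """Incremental mean update after observing a reward for client cid."""
    pulls, mean = stats[cid]
    stats[cid] = (pulls + 1, mean + (reward - mean) / (pulls + 1))
```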
Privacy-Preserving Decentralized Aggregation for Federated Learning
IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2021
Federated learning is a promising framework for learning over decentralized data spanning multiple regions. This approach avoids expensive central training data aggregation cost and can improve privacy because distributed sites do not have to reveal privacy-sensitive data. In this paper, we develop a privacy-preserving decentralized aggregation protocol for federated learning. We formulate the distributed aggregation protocol with the Alternating Direction Method of Multiplier (ADMM) and examine its privacy weakness. Unlike prior work that uses Differential Privacy or homomorphic encryption for privacy, we develop a protocol that controls communication among participants in each round of aggregation to minimize privacy leakage. We establish its privacy guarantee against an honest-but-curious adversary. We also propose an efficient algorithm to construct such a communication pattern, inspired by combinatorial block design theory. Our secure aggregation protocol based on this novel group communication pattern design leads to an efficient algorithm for federated training with privacy guarantees. We evaluate our federated training algorithm on image classification and next-word prediction applications over benchmark datasets with 9 and 15 distributed sites. Evaluation results show that our algorithm performs comparably to the standard centralized federated learning method while preserving privacy; the degradation in test accuracy is only up to 0.73%.
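As a reference point for what the protocol computes, here is plain consensus ADMM recovering the average of local vectors. This centralized-view sketch deliberately omits the paper's block-design communication pattern and its privacy controls:

```python
# Consensus ADMM computing the average of local vectors, the aggregation
# primitive the paper builds on. The global consensus step would be realized
# through the paper's group communication pattern; here it is done directly.
import numpy as np

def admm_average(a, rho=1.0, iters=50):
    """a: (n, d) array of local updates; returns consensus ~= a.mean(axis=0)."""
    n, d = a.shape
    x = a.copy()                          # local primal variables
    u = np.zeros((n, d))                  # scaled dual variables
    for _ in range(iters):
        z = (x + u).mean(axis=0)          # consensus step (needs communication)
        x = (a + rho * (z - u)) / (1 + rho)   # local proximal step
        u += x - z                        # dual update
    return z

a = np.arange(12, dtype=float).reshape(4, 3)
assert np.allclose(admm_average(a), a.mean(axis=0))
```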
Preserving Privacy and Security in Federated Learning
IEEE/ACM Transactions on Networking
Federated learning is known to be vulnerable to security and privacy issues. Existing research has focused either on preventing poisoning attacks from users or on protecting user privacy of model updates. However, integrating these two lines of research remains a crucial challenge since they often conflict with one another with respect to the threat model. In this work, we develop a framework to combine secure aggregation with defense mechanisms against poisoning attacks from users, while maintaining their respective privacy guarantees. We leverage zero-knowledge proof protocol to let users run the defense mechanisms locally and attest the result to the central server without revealing any information about their model updates. Furthermore, we propose a new secure aggregation protocol for federated learning using homomorphic encryption that is robust against malicious users. Our framework enables the central server to identify poisoned model updates without violating the privacy guarantees of secure aggregation. Finally, we analyze the computation and communication complexity of our proposed solution and benchmark its performance.
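A toy version of the homomorphic-encryption leg of such a design can be written with the python-paillier library (`pip install phe`). The client-side norm clipping below merely stands in for the paper's zero-knowledge-attested defense, and all parameters are illustrative:

```python
# Sketch of additively homomorphic aggregation with python-paillier.
# The paper combines this idea with zero-knowledge proofs that each client
# ran the poisoning defense locally; the norm clip below is a stand-in.
import numpy as np
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

def client_update(update, bound=10.0):
    norm = np.linalg.norm(update)
    if norm > bound:                      # defense executed client-side
        update = update * bound / norm    # here: simple norm clipping
    return [public_key.encrypt(float(v)) for v in update]

clients = [np.array([0.5, -1.0]), np.array([30.0, 40.0]), np.array([1.0, 2.0])]
encrypted = [client_update(u) for u in clients]

# Server adds ciphertexts coordinate-wise without seeing any single update.
agg = [sum(coords) for coords in zip(*encrypted)]
print([private_key.decrypt(c) for c in agg])
```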
Client-specific Property Inference against Secure Aggregation in Federated Learning
arXiv, 2023
Federated learning has become a widely used paradigm for collaboratively training a common model among different participants with the help of a central server that coordinates the training. Although only the model parameters or other model updates are exchanged during the federated training instead of the participants' data, many attacks have shown that it is still possible to infer sensitive information or to reconstruct participant data. Although differential privacy is considered an effective solution to protect against privacy attacks, it is also criticized for its negative effect on utility. Another possible defense is to use secure aggregation, which allows the server to only access the aggregated update instead of each individual one, and it is often more appealing because it does not degrade the model quality. However, combining only the aggregated updates, which are generated by a different composition of clients in every round, may still allow the inference of some client-specific information. In this paper, we show that simple linear models can effectively capture client-specific properties only from the aggregated model updates due to the linearity of aggregation. We formulate an optimization problem across different rounds in order to infer a tested property of every client from the output of the linear models, for example, whether they have a specific sample in their training data (membership inference) or whether they misbehave and attempt to degrade the performance of the common model by poisoning attacks. Our reconstruction technique is completely passive and undetectable. We demonstrate the efficacy of our approach on several scenarios, showing that secure aggregation provides very limited privacy guarantees in practice. The source code is available at https://github.com/raouf-kerkouche/PROLIN.
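The linearity the attack exploits is easy to demonstrate: if round t's aggregate is agg_t = sum_i P[t, i] * u_i, then observing enough rounds with varying participation lets an attacker de-aggregate per-client components by least squares. A toy reconstruction follows (PROLIN's cross-round optimization is more elaborate than this):

```python
# Toy demonstration that aggregation's linearity leaks per-client components:
# with a known participation matrix P and observed aggregates, least squares
# recovers each client's contribution once enough rounds are seen.
import numpy as np

rng = np.random.default_rng(1)
n_rounds, n_clients, d = 40, 8, 3
U = rng.normal(size=(n_clients, d))                 # per-client components
P = (rng.random((n_rounds, n_clients)) < 0.5).astype(float)  # participation
agg = P @ U                                          # all the server reveals

U_hat, *_ = np.linalg.lstsq(P, agg, rcond=None)
print(np.max(np.abs(U_hat - U)))                     # near zero: de-aggregated
```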
Median-Krum: A Joint Distance-Statistical Based Byzantine-Robust Algorithm in Federated Learning
Proceedings of the Int'l ACM Symposium on Mobility Management and Wireless Access
The wide spread of Artificial Intelligence-based services in recent years has encouraged research into new Machine Learning paradigms. Federated Learning (FL) represents a new distributed approach capable of achieving higher privacy and security guarantees than other methodologies, since it allows multiple users to collaboratively train a global model without sharing their local training data. In this paper, an analysis of the characteristics of Federated Learning is therefore carried out, with a particular focus on security aspects. In detail, currently known vulnerabilities and their respective countermeasures are investigated, focusing on aggregation algorithms that provide robustness against Byzantine failures. Following this direction, Median-Krum is proposed as a new aggregation algorithm, combining the distance-based Krum approach with the statistical strategy of median-based aggregation; its validity is observed on a set of simulations that recreate realistic scenarios, in the absence and presence of Byzantine adversaries. Achieved results demonstrate the functionality of the proposed solution in terms of accuracy and convergence rounds in comparison with the FedAvg, Krum, Multi-Krum and Fed-Median FL approaches, under both correct and incorrect estimates of the number of attackers.
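One plausible reading of the combination, hedged because the paper's exact definitions are not reproduced here, is to shortlist clients by Krum's distance score and then take the coordinate-wise median over the shortlist:

```python
# Sketch combining Krum's distance scores with median aggregation.
# The score definition follows the standard Krum rule; the shortlist size m
# and its interaction with f are assumptions about the paper's design.
import numpy as np

def krum_scores(updates, f):
    """updates: (n, d) array; f: assumed number of Byzantine clients."""
    n = len(updates)
    d2 = ((updates[:, None, :] - updates[None, :, :]) ** 2).sum(-1)
    scores = []
    for i in range(n):
        # Krum score: sum of squared distances to the n - f - 2 nearest peers.
        nearest = np.sort(np.delete(d2[i], i))[: n - f - 2]
        scores.append(nearest.sum())
    return np.array(scores)

def median_krum(updates, f, m):
    keep = np.argsort(krum_scores(updates, f))[:m]   # m lowest-score clients
    return np.median(updates[keep], axis=0)          # coordinate-wise median
```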
A Four-Pronged Defense Against Byzantine Attacks in Federated Learning
Proceedings of the 31st ACM International Conference on Multimedia
Federated learning (FL) is a nascent distributed learning paradigm to train a shared global model without violating users' privacy. FL has been shown to be vulnerable to various Byzantine attacks, where malicious participants could independently or collusively upload well-crafted updates to deteriorate the performance of the global model. However, existing defenses could only mitigate part of Byzantine attacks, without providing an all-sided shield for FL. It is difficult to simply combine them as they rely on totally contradictory assumptions. In this paper, we propose FPD, a four-pronged defense against both non-colluding and colluding Byzantine attacks. Our main idea is to utilize absolute similarity to filter updates rather than the relative similarity used in existing works. To this end, we first propose a reliable client selection strategy to prevent the majority of threats in the bud. Then we design a simple but effective score-based detection method to mitigate colluding attacks. Third, we construct an enhanced spectral-based outlier detector to accurately discard abnormal updates when the training data is not independent and identically distributed ...
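The spectral prong lends itself to a short sketch: poisoned or colluding updates tend to stand out along the top singular direction of the centered update matrix. The thresholding rule below is an assumption, not the paper's enhanced detector:

```python
# Sketch of spectral outlier filtering over client updates: project the
# centered update matrix onto its top singular direction and keep the
# least-outlying updates. The keep fraction is an illustrative choice.
import numpy as np

def spectral_filter(updates, keep_fraction=0.8):
    """updates: (n, d) array; returns indices of updates judged benign."""
    centered = updates - updates.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = np.abs(centered @ vt[0])      # magnitude along top direction
    k = int(np.ceil(keep_fraction * len(updates)))
    return np.argsort(scores)[:k]
```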
Dynamic Defense Against Byzantine Poisoning Attacks in Federated Learning
2020
Federated learning, a distributed learning paradigm that conducts training on local devices without accessing the training data, is vulnerable to Byzantine poisoning adversarial attacks. We argue that the federated learning model has to defend against such adversarial attacks by filtering out the adversarial clients through the federated aggregation operator. We propose a dynamic federated aggregation operator that dynamically discards adversarial clients and thereby prevents the corruption of the global learning model. We assess it as a defense against adversarial attacks by deploying a deep learning classification model in a federated learning setting on the Fed-EMNIST Digits, Fashion MNIST and CIFAR-10 image datasets. The results show that the dynamic selection of the clients to aggregate enhances the performance of the global learning model and discards the adversarial clients and those with low-quality models.
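A minimal sketch of such a dynamic aggregation operator, with the filtering criterion (validation accuracy on a small server-held set) and the cutoff as assumptions rather than the paper's exact operator:

```python
# Sketch of dynamic client filtering inside the aggregation operator:
# clients whose local models score poorly on a server-held validation set
# are dropped before averaging. Criterion and cutoff are illustrative.
import numpy as np

def dynamic_aggregate(client_weights, evaluate, min_acc=None):
    """client_weights: list of model parameter arrays;
    evaluate: callable(params) -> validation accuracy in [0, 1]."""
    accs = np.array([evaluate(w) for w in client_weights])
    if min_acc is None:
        min_acc = np.median(accs) - 2 * accs.std()   # adaptive cutoff
    keep = [w for w, a in zip(client_weights, accs) if a >= min_acc]
    return np.mean(keep, axis=0)                     # aggregate the survivors
```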
Towards Federated Learning with Byzantine-Robust Client Weighting
Applied Sciences
Federated learning (FL) is a distributed machine learning paradigm where data are distributed among clients who collaboratively train a model in a computation process coordinated by a central server. By assigning a weight to each client based on the proportion of data instances it possesses, the rate of convergence to an accurate joint model can be greatly accelerated. Some previous works studied FL in a Byzantine setting, in which a fraction of the clients may send arbitrary or even malicious information regarding their model. However, these works either ignore the issue of data unbalancedness altogether or assume that client weights are a priori known to the server, whereas, in practice, it is likely that weights will be reported to the server by the clients themselves and therefore cannot be relied upon. We address this issue for the first time by proposing a practical weight-truncation-based preprocessing method and demonstrating empirically that it is able to strike a good bala...
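The preprocessing step is simple to illustrate: self-reported example counts are capped at a bound U before normalization, limiting the influence any single client can claim. Choosing U is the crux the paper studies; the value below is arbitrary:

```python
# Sketch of weight-truncation preprocessing for Byzantine-robust client
# weighting: reported dataset sizes are capped before being normalized into
# aggregation weights, so an inflated claim cannot dominate the average.
import numpy as np

def truncated_weights(reported_counts, U):
    w = np.minimum(np.asarray(reported_counts, dtype=float), U)
    return w / w.sum()

print(truncated_weights([10, 20, 10_000], U=100))  # the 10_000 claim is capped
```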