Provably Personalized and Robust Federated Learning (original) (raw)

Towards Federated Learning with Byzantine-Robust Client Weighting

Applied Sciences

Federated learning (FL) is a distributed machine learning paradigm where data are distributed among clients who collaboratively train a model in a computation process coordinated by a central server. By assigning a weight to each client based on the proportion of data instances it possesses, the rate of convergence to an accurate joint model can be greatly accelerated. Some previous works studied FL in a Byzantine setting, in which a fraction of the clients may send arbitrary or even malicious information regarding their model. However, these works either ignore the issue of data unbalancedness altogether or assume that client weights are a priori known to the server, whereas, in practice, it is likely that weights will be reported to the server by the clients themselves and therefore cannot be relied upon. We address this issue for the first time by proposing a practical weight-truncation-based preprocessing method and demonstrating empirically that it is able to strike a good bala...

Straggler-Resilient Personalized Federated Learning

Cornell University - arXiv, 2022

Federated Learning is an emerging learning paradigm that allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions. Despite its success, federated learning faces several challenges related to its decentralized nature. In this work, we develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles, namely (i) data heterogeneity, i.e., data distributions can vary substantially across clients, and (ii) system heterogeneity, i.e., the computational power of the clients could differ significantly. Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client. Furthermore, our method mitigates the effects of stragglers by adaptively selecting clients based on their computational characteristics and statistical significance, thus achieving, for the first time, near optimal sample complexity and provable logarithmic speedup. Experimental results support our theoretical findings showing the superiority of our method over alternative personalized federated schemes in system and data heterogeneous environments.

Exploration and Exploitation in Federated Learning to Exclude Clients with Poisoned Data

2022 International Wireless Communications and Mobile Computing (IWCMC), 2022

Federated Learning (FL) is one of the hot research topics, and it utilizes Machine Learning (ML) in a distributed manner without directly accessing private data on clients. However, FL faces many challenges, including the difficulty to obtain high accuracy, high communication cost between clients and the server, and security attacks related to adversarial ML. To tackle these three challenges, we propose an FL algorithm inspired by evolutionary techniques. The proposed algorithm groups clients randomly in many clusters, each with a model selected randomly to explore the performance of different models. The clusters are then trained in a repetitive process where the worst performing cluster is removed in each iteration until one cluster remains. In each iteration, some clients are expelled from clusters either due to using poisoned data or low performance. The surviving clients are exploited in the next iteration. The remaining cluster with surviving clients is then used for training the best FL model (i.e., remaining FL model). Communication cost is reduced since fewer clients are used in the final training of the FL model. To evaluate the performance of the proposed algorithm, we conduct a number of experiments using FEMNIST dataset and compare the result against the random FL algorithm. The experimental results show that the proposed algorithm outperforms the baseline algorithm in terms of accuracy, communication cost, and security.

Network-Level Adversaries in Federated Learning

arXiv (Cornell University), 2022

Federated learning is a popular strategy for training models on distributed, sensitive data, while preserving data privacy. Prior work identified a range of security threats on federated learning protocols that poison the data or the model. However, federated learning is a networked system where the communication between clients and server plays a critical role for the learning task performance. We highlight how communication introduces another vulnerability surface in federated learning and study the impact of network-level adversaries on training federated learning models. We show that attackers dropping the network traffic from carefully selected clients can significantly decrease model accuracy on a target population. Moreover, we show that a coordinated poisoning campaign from a few clients can amplify the dropping attacks. Finally, we develop a server-side defense which mitigates the impact of our attacks by identifying and up-sampling clients likely to positively contribute towards target accuracy. We comprehensively evaluate our attacks and defenses on three datasets, assuming encrypted communication channels and attackers with partial visibility of the network.

FedDec: Peer-to-peer Aided Federated Learning

arXiv (Cornell University), 2023

Federated learning (FL) has enabled training machine learning models exploiting the data of multiple agents without compromising privacy. However, FL is known to be vulnerable to data heterogeneity, partial device participation, and infrequent communication with the server, which are nonetheless three distinctive characteristics of this framework. While much of the recent literature has tackled these weaknesses using different tools, only a few works have explored the possibility of exploiting interagent communication to improve FL's performance. In this work, we present FedDec, an algorithm that interleaves peer-to-peer communication and parameter averaging (similar to decentralized learning in networks) between the local gradient updates of FL. We analyze the convergence of FedDec under the assumptions of non-iid data distribution, partial device participation, and smooth and strongly convex costs, and show that inter-agent communication alleviates the negative impact of infrequent communication rounds with the server by reducing the dependence on the number of local updates H from O(H 2) to O(H). Furthermore, our analysis reveals that the term improved in the bound is multiplied by a constant that depends on the spectrum of the inter-agent communication graph, and that vanishes quickly the more connected the network is. We confirm the predictions of our theory in numerical simulations, where we show that FedDec converges faster than FedAvg, and that the gains are greater as either H or the connectivity of the network increase.

SignGuard: Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering

ArXiv, 2021

Gradient-based training in federated learning is known to be vulnerable to faulty/malicious worker nodes, which are often modeled as Byzantine clients. To this end, previous work either makes use of auxiliary data at parameter server to verify the received gradients (e.g., by computing validation error rate) or leverages statistic-based methods (e.g. median and Krum) to identify and remove malicious gradients from Byzantine clients. In this paper, we acknowledge that auxiliary data may not always be available in practice and focus on the statistic-based approach. However, recent work on model poisoning attacks have shown that well-crafted attacks can circumvent most of existing medianand distance-based statistical defense methods, making malicious gradients indistinguishable from honest ones. To tackle this challenge, we show that the element-wise sign of gradient vector can provide valuable insight in detecting model poisoning attacks. Based on our theoretical analysis of stateof-t...

Byzantine-Robust Federated Machine Learning through Adaptive Model Averaging

ArXiv, 2019

Federated learning enables training collaborative machine learning models at scale with many participants whilst preserving the privacy of their datasets. Standard federated learning techniques are vulnerable to Byzantine failures, biased local datasets, and poisoning attacks. In this paper we introduce Adaptive Federated Averaging, a novel algorithm for robust federated learning that is designed to detect failures, attacks, and bad updates provided by participants in a collaborative model. We propose a Hidden Markov Model to model and learn the quality of model updates provided by each participant during training. In contrast to existing robust federated learning schemes, we propose a robust aggregation rule that detects and discards bad or malicious local model updates at each training iteration. This includes a mechanism that blocks unwanted participants, which also increases the computational and communication efficiency. Our experimental evaluation on 4 real datasets show that ...

Adaptive Personalized Federated Learning

ArXiv, 2020

Investigation of the degree of personalization in federated learning algorithms has shown that only maximizing the performance of the global model will confine the capacity of the local models to personalize. In this paper, we advocate an adaptive personalized federated learning (APFL) algorithm, where each client will train their local models while contributing to the global model. Information theoretically, we prove that the mixture of local and global models can reduce the generalization error. We also propose a communication-reduced bilevel optimization method, which reduces the communication rounds to O(sqrtT)O(\sqrt{T})O(sqrtT) and show that under strong convexity and smoothness assumptions, the proposed algorithm can achieve a convergence rate of O(1/T)O(1/T)O(1/T) with some residual error. The residual error is related to the gradient diversity among local models, and the gap between optimal local and global models. The extensive experiments demonstrate the effectiveness of our personalization, as ...

A Four-Pronged Defense Against Byzantine Attacks in Federated Learning

Proceedings of the 31st ACM International Conference on Multimedia

Federated learning (FL) is a nascent distributed learning paradigm to train a shared global model without violating users' privacy. FL has been shown to be vulnerable to various Byzantine attacks, where malicious participants could independently or collusively upload well-crafted updates to deteriorate the performance of the global model. However, existing defenses could only mitigate part of Byzantine attacks, without providing an all-sided shield for FL. It is difficult to simply combine them as they rely on totally contradictory assumptions. In this paper, we propose FPD, a four-pronged defense against both non-colluding and colluding Byzantine attacks. Our main idea is to utilize absolute similarity to filter updates rather than relative similarity used in existingI works. To this end, we first propose a reliable client selection strategy to prevent the majority of threats in the bud. Then we design a simple but effective score-based detection method to mitigate colluding attacks. Third, we construct an enhanced spectral-based outlier detector to accurately discard abnormal updates when the training data is not independent and

Shielding Federated Learning: Robust Aggregation with Adaptive Client Selection

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Federated learning (FL) enables multiple clients to collaboratively train an accurate global model while protecting clients' data privacy. However, FL is susceptible to Byzantine attacks from malicious participants. Although the problem has gained significant attention, existing defenses have several flaws: the server irrationally chooses malicious clients for aggregation even after they have been detected in previous rounds; the defenses perform ineffectively against sybil attacks or in the heterogeneous data setting. To overcome these issues, we propose MAB-RFL, a new method for robust aggregation in FL. By modelling the client selection as an extended multi-armed bandit (MAB) problem, we propose an adaptive client selection strategy to choose honest clients that are more likely to contribute high-quality updates. We then propose two approaches to identify malicious updates from sybil and non-sybil attacks, based on which rewards for each client selection decision can be accur...