Praneeth Vepakomma | Massachusetts Institute of Technology (MIT)
Papers by Praneeth Vepakomma
Performing computations while maintaining privacy is an important problem in today's distributed machine learning solutions. Consider the following two setups between a client and a server. In setup i), the client has a public data vector x, the server has a large private database of data vectors B, and the client wants to find the inner products ⟨x, y_k⟩ for all y_k ∈ B. The client does not want the server to learn x, while the server does not want the client to learn the records in its database. This is in contrast to setup ii), where the client would like to perform an operation solely on its own data, such as computing the inverse of its data matrix M, but would like to use the superior computing ability of the server to do so without having to leak M to the server. We present a stochastic scheme for splitting the client data into privatized shares that are transmitted to the server in such settings. The server performs the requested operations on these shares instead of on the raw client data. The intermediate results are sent back to the client, which assembles them to obtain the final result.
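As a hedged illustration of the general share-splitting idea (this classic two-party additive secret sharing is not the paper's stochastic single-server scheme; all names and sizes below are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0])        # client's query vector
B = rng.normal(size=(5, 3))          # server-side database rows y_k

# Split x into two additive shares; each share alone looks like noise
# (over a finite field this would be perfectly hiding; real-valued
# masking here is only illustrative).
share1 = rng.normal(size=x.shape)
share2 = x - share1

# Each (non-colluding) party computes inner products on its share only.
partial1 = B @ share1
partial2 = B @ share2

# The client assembles the partial results into the true inner products.
assert np.allclose(partial1 + partial2, B @ x)
```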
We survey distributed deep learning models for training or inference without accessing raw data from clients. These methods aim to protect confidential patterns in data while still allowing servers to train models. The distributed deep learning methods of federated learning, split learning, and large-batch stochastic gradient descent are compared, in addition to the private and secure approaches of differential privacy, homomorphic encryption, oblivious transfer, and garbled circuits, in the context of neural networks. We study their benefits, limitations, and trade-offs with regard to computational resources, data leakage, and communication efficiency, and also share our anticipated future trends.
In this paper we provide a survey of various libraries for homomorphic encryption. We describe key features and trade-offs that should be considered while choosing the right approach for secure computation. We then present a comparison of six commonly available homomorphic encryption libraries - SEAL, HElib, TFHE, Paillier, ElGamal and RSA - across these identified features. Support for different languages and real-life applications is also elucidated.
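As a hedged, self-contained illustration of the additive homomorphism that Paillier-style schemes provide (a toy with tiny hard-coded primes, deliberately insecure and not any library's actual API):

```python
import math, random

# Toy Paillier: multiplying ciphertexts adds the underlying plaintexts.
p, q = 293, 433                       # toy primes; real schemes use ~2048-bit moduli
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = (p - 1) * (q - 1)               # a multiple of Carmichael's lambda(n)
L = lambda u: (u - 1) // n
mu = pow(L(pow(g, lam, n2)), -1, n)   # modular inverse (Python 3.8+)

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 17, 25
assert decrypt(encrypt(a) * encrypt(b) % n2) == a + b   # homomorphic addition
```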
Can health entities collaboratively train deep learning models without sharing sensitive raw data? This paper proposes several configurations of a distributed deep learning method called SplitNN to facilitate such collaborations. SplitNN does not share raw data or model details with collaborating institutions. The proposed configurations of SplitNN cater to practical settings of i) entities holding different modalities of patient data, ii) centralized and local health entities collaborating on multiple tasks, and iii) learning without sharing labels. We compare the performance and resource efficiency trade-offs of SplitNN with other distributed deep learning methods such as federated learning and large-batch synchronous stochastic gradient descent, and show highly encouraging results for SplitNN.
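A minimal single-process sketch of the split learning idea behind SplitNN (layer sizes, the cut location, and the optimizer are assumptions for illustration; the real configurations communicate over a network):

```python
import torch
import torch.nn as nn

client_net = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # held by the data owner
server_net = nn.Sequential(nn.Linear(64, 2))              # held by the server
opt = torch.optim.SGD(
    list(client_net.parameters()) + list(server_net.parameters()), lr=0.1)

x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
smashed = client_net(x)                        # only this activation crosses the cut
detached = smashed.detach().requires_grad_()   # the server never sees raw x
loss = nn.functional.cross_entropy(server_net(detached), y)

opt.zero_grad()
loss.backward()                    # server-side backward fills detached.grad
smashed.backward(detached.grad)    # gradient at the cut returned to the client
opt.step()
```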
In this paper we show that the negative sample distance covariance function is a quasi-concave set function of samples of random variables that are not statistically independent. We use these properties to propose greedy algorithms to combinatorially optimize some diversity (low statistical dependence) promoting functions of distance covariance. Our greedy algorithm obtains all the inclusion-minimal maximizers of this diversity-promoting objective. Inclusion-minimal maximizers are globally optimal maximizing sets that do not properly contain any other maximizing set in the solution set. We present results from applying this approach to obtain diverse features (covariates/variables/predictors) in a feature selection setting for regression (or classification) problems. We also combine our diverse feature selection algorithm with the distance covariance based relevant feature selection algorithm of [7] to produce subsets of covariates that are relevant and also ordered in non-increasing levels of diversity.
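A hedged sketch of the flavor of greedy diversity-promoting selection (a plain forward-greedy heuristic with an arbitrary seed feature, not the paper's enumeration of all inclusion-minimal maximizers):

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance between two 1-D samples."""
    def centered(v):
        d = np.abs(v[:, None] - v[None, :])   # pairwise distance matrix
        return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()
    return (centered(x) * centered(y)).mean()

def greedy_diverse(X, k):
    """Greedily grow a feature set by adding, at each step, the feature with
    the least total distance-covariance dependence on those already chosen."""
    chosen = [0]                              # arbitrary seed feature
    while len(chosen) < k:
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        score = {j: sum(dcov2(X[:, j], X[:, c]) for c in chosen) for j in rest}
        chosen.append(min(score, key=score.get))
    return chosen

X = np.random.default_rng(0).normal(size=(100, 10))
print(greedy_diverse(X, 3))
```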
This technical report examines the process by which sensor data from a region suspected to contain landmines is used to determine a set of alarm sites, including the manner in which the alarm set provided is scored against competing algorithms. The work and recommendations based on it were developed during the Mathematical Problems in Industry Workshop, held at Duke University, June 13-17, 2016.
Electronic Journal of Statistics, 2018
In our work, we propose a novel formulation for supervised dimensionality reduction based on a nonlinear dependency criterion called statistical distance correlation [Székely et al., 2007]. We propose an objective which is free of distributional assumptions on the regression variables and of regression model assumptions. Our formulation is based on learning a low-dimensional feature representation z which maximizes the squared sum of distance correlations between the low-dimensional features z and the response y, and also between the features z and the covariates x. We propose a novel algorithm to optimize this objective using the Generalized Majorization-Minimization method of Parizi et al. [2015]. We show superior empirical results on multiple datasets, demonstrating the effectiveness of our approach over several relevant state-of-the-art supervised dimensionality reduction methods.
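Both this paper and the next center on the sample distance correlation of Székely et al.; here is a minimal reference computation of it (naive O(n²) memory, for illustration only):

```python
import numpy as np

def dcor(X, Y):
    """Sample distance correlation between data matrices X, Y (rows = samples)."""
    def centered(M):
        D = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=-1)
        return D - D.mean(axis=0) - D.mean(axis=1, keepdims=True) + D.mean()
    A, B = centered(X), centered(Y)
    dcov2 = (A * B).mean()                    # squared distance covariance
    return np.sqrt(dcov2) / ((A * A).mean() * (B * B).mean()) ** 0.25

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
print(dcor(x, x ** 2))                        # high: nonlinear dependence
print(dcor(x, rng.normal(size=(200, 1))))     # near zero: independence
```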
In a regression setting we propose algorithms that reduce the dimensionality of the features while simultaneously maximizing a statistical measure of dependence known as distance correlation between the low-dimensional features and a response variable. This helps in solving the prediction problem with a low-dimensional set of features. Our setting is different from subset-selection algorithms, where the problem is to choose the best subset of features for regression. Instead, we attempt to generate a new set of low-dimensional features, as in a feature-learning setting. We keep our proposed approach model-free: our algorithm does not assume the application of any specific regression model in conjunction with the low-dimensional features that it learns. The algorithm is iterative and is formulated as a combination of the majorization-minimization and concave-convex optimization procedures. We also present spectral-radius-based convergence results for the proposed iterations.
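A hedged toy of the feature-learning setup: learn a linear map W that increases dCor(XW, y) by plain gradient ascent (the paper's algorithm is the majorization-minimization / concave-convex iteration above, not gradient ascent; shapes, seed, and step size are assumptions):

```python
import torch

def dcor2(X, Y):
    """Differentiable squared sample distance correlation (toy, O(n^2))."""
    def cent(M):
        d2 = (M[:, None, :] - M[None, :, :]).pow(2).sum(-1)
        D = torch.sqrt(d2 + 1e-9)            # epsilon keeps gradients finite
        return D - D.mean(dim=0) - D.mean(dim=1, keepdim=True) + D.mean()
    A, B = cent(X), cent(Y)
    return (A * B).mean() / torch.sqrt((A * A).mean() * (B * B).mean())

torch.manual_seed(0)
X = torch.randn(100, 10)
y = X[:, :1] ** 2 + 0.1 * torch.randn(100, 1)   # nonlinear response
W = torch.randn(10, 2, requires_grad=True)
opt = torch.optim.Adam([W], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    (-dcor2(X @ W, y)).backward()               # ascend the dependence measure
    opt.step()
```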
In this paper we propose an algorithm for non-linear embedding of affinity tensors obtained by measuring higher-order similarities between high-dimensional points. We achieve this by preserving the original triadic similarities using another triadic similarity function, obtained as a sum of squares of dyadic similarities in a low dimension. We show that this formulation reduces to solving for the nonlinear embedding of a graph with a specific kind of graph Laplacian. We provide an iterative algorithm for minimizing the loss, and also propose a simple linear constraint that rules out the trivial zero solution of the embedding problem, unlike the existing variants of quadratic orthonormality constraints used in the literature, which require eigendecompositions to solve for the embedding.
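One plausible rendering of the stated construction in symbols (the notation here is assumed, not taken from the paper):

```latex
% Preserve each measured triadic affinity T_{ijk} with a low-dimensional
% triadic similarity built from squared dyadic similarities s of the
% embedded points z_i, z_j, z_k:
\min_{z_1,\dots,z_n}\; \sum_{i,j,k}
  \Big( T_{ijk} - \big[\, s(z_i,z_j)^2 + s(z_j,z_k)^2 + s(z_i,z_k)^2 \,\big] \Big)^2
% For suitable choices of s, the abstract states that this reduces to a
% graph embedding problem governed by a specific graph Laplacian.
```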
Nonlinear dimensionality reduction techniques of today are highly sensitive to outliers. Almost all of them are spectral methods and differ from each other in their treatment of the neighborhood similarities computed amongst the high-dimensional input data points. These techniques aim to preserve this similarity structure in the low-dimensional output. The presence of unwanted outliers in the data directly affects the preservation of the neighborhood similarities amongst the majority of the non-outlier data, since the non-outlier points must simultaneously satisfy the similarities they form with the outliers and the similarity structure they form with the other non-outlier data. This disrupts the intrinsic structure of the manifold on which the majority of the non-outlier data lies when it is preserved via a homeomorphism onto a low-dimensional manifold. In this paper we present an iterative algorithm that analytically solves for a non-linear embedding with monotonic improvement after each iteration. As an application of this iterative manifold learning algorithm, we develop a framework that decomposes the pairwise error observed between all pairs of points and dynamically updates the neighborhood similarity matrix to downplay the effect of the outliers on the majority of the non-outlier data being embedded into a lower dimension.
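A hedged, generic sketch of such a reweighting loop (the embedding routine, the error measure, and the exponential downweighting rule are all placeholders, not the paper's analytic decomposition and updates):

```python
import numpy as np

def robust_embed(W, embed_fn, n_iter=5, temp=1.0):
    """Alternate embedding with downweighting of high-error pairs.
    `embed_fn` maps a similarity matrix to an (n, d) embedding and stands
    in for any spectral embedding routine."""
    W = W.copy()
    for _ in range(n_iter):
        Z = embed_fn(W)                                   # current embedding
        D = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
        err = (W - np.exp(-D ** 2)) ** 2                  # per-pair mismatch
        W = W * np.exp(-err / temp)                       # shrink outlier-driven pairs
    return embed_fn(W)
```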
We present a fast manifold learning algorithm by formulating a new linear constraint that we use to replace the weighted orthonormality constraints within Laplacian Eigenmaps, a popular manifold learning algorithm. We thereby convert a quadratically constrained quadratic optimization problem into a simpler formulation: a linearly constrained quadratic optimization problem. We show that solving this problem is equivalent to solving a symmetric diagonally dominant (SDD) linear system, which can be solved very fast using a combinatorial multigrid (CMG) solver. We also suggest another method that exploits any sparsity within the graph Laplacian matrix via a fast sparse Cholesky decomposition to produce an alternative to the SDD-based solution. We compare the run-time improvements of both our SDD-system-based method and our fast sparse Cholesky based method against the well-known Nyström-method-based fast manifold learning and present competitive results.
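A hedged sketch of the core reduction (the constraint vector c and the small regularization are stand-ins; the paper derives a specific constraint and uses a CMG solver rather than a generic sparse solve):

```python
import numpy as np
from scipy.sparse import csgraph, identity
from scipy.sparse.linalg import spsolve
from sklearn.neighbors import kneighbors_graph

# Replacing orthonormality with one linear constraint c^T z = 1 turns
# min_z z^T L z into a single sparse linear solve (per embedding
# coordinate) instead of an eigendecomposition.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
W = kneighbors_graph(X, n_neighbors=10, mode='connectivity')
W = 0.5 * (W + W.T)                          # symmetrize the kNN graph
L = csgraph.laplacian(W)

c = rng.normal(size=200)                     # stand-in constraint vector
z = spsolve((L + 1e-6 * identity(200)).tocsc(), c)  # L is singular; regularize
z /= c @ z                                   # enforce the constraint c^T z = 1
```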
In this work we present A-Wristocracy, a novel framework for recognizing very fine-grained and complex in-home activities of human users (particularly elderly people) with wrist-worn device sensing. Our A-Wristocracy system improves upon state-of-the-art work on in-home activity recognition using wearables. Existing works are mostly able to detect coarse-grained ADLs (Activities of Daily Living) but not a large number of fine-grained and complex IADLs (Instrumental Activities of Daily Living), and they are also unable to distinguish similar activities with different context (such as sit on floor vs. sit on bed vs. sit on sofa). Our solution enables accurate detection of in-home ADLs/IADLs and contextual activities, which are all critically important for remote elderly care in tracking physical and cognitive capabilities. A-Wristocracy makes it feasible to classify a large number of fine-grained and complex activities through deep learning based data analytics and multi-modal sensing on a wrist-worn device. It exploits minimal functionality from very light additional infrastructure (only a few Bluetooth beacons) for coarse-level location context. A-Wristocracy preserves direct user privacy by excluding camera/video imaging on the wearable or infrastructure. The classification procedure consists of practical feature-set extraction from multi-modal wearable sensor suites, followed by a deep learning based supervised fine-level classification algorithm. We have collected exhaustive home-based ADL and IADL data from multiple users. Our classifier is validated to recognize 22 very fine-grained, complex daily activities (a much larger number than the 6-12 activities detected by state-of-the-art works using wearables and no camera/video) with high average test accuracies of 90% or more for two users in two different home environments.
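As a hedged sketch of the classification stage only (the feature dimension, architecture, and hyperparameters are assumptions; the paper's extracted feature set and network are its own):

```python
import torch
import torch.nn as nn

# Toy stand-in for the supervised fine-level classifier: concatenated
# multi-modal wearable features in, one of 22 activity classes out.
N_FEATURES, N_CLASSES = 64, 22
model = nn.Sequential(
    nn.Linear(N_FEATURES, 128), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(128, N_CLASSES),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

X = torch.randn(256, N_FEATURES)              # stand-in feature vectors
y = torch.randint(0, N_CLASSES, (256,))       # stand-in activity labels
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(X), y)
    loss.backward()
    opt.step()
```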
Drafts by Praneeth Vepakomma
Differential privacy offers strong guarantees, such as privacy that is immune to post-processing. Thus it is often looked to as a solution for learning on scattered and isolated data. This work focuses on supervised manifold learning, a paradigm that can generate fine-tuned manifolds for a target use case. Our contributions are twofold: 1) we present a novel differentially private method, PrivateMail, for supervised manifold learning, the first of its kind to our knowledge; 2) we provide a novel private geometric embedding scheme for our experimental use case. We experiment on private "content-based image retrieval" - embedding and querying the nearest neighbors of images in a private manner - and show extensive privacy-utility trade-off results, as well as the computational efficiency and practicality of our methods.
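A hedged illustration of the generic pattern of releasing an embedding under differential privacy via the Gaussian mechanism (the sensitivity, ε, and δ below are placeholders; PrivateMail's actual mechanism is calibrated to its own supervised manifold learning objective, which this sketch does not reproduce):

```python
import numpy as np

def gaussian_mechanism(Z, sensitivity, eps, delta, rng=None):
    """Add Gaussian noise calibrated for (eps, delta)-DP (valid for eps <= 1)."""
    rng = rng or np.random.default_rng()
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps
    return Z + rng.normal(0.0, sigma, size=Z.shape)

Z = np.random.default_rng(0).normal(size=(100, 2))   # some learned embedding
Z_priv = gaussian_mechanism(Z, sensitivity=1.0, eps=1.0, delta=1e-5)
```

Thanks to immunity to post-processing, anything computed from Z_priv (such as nearest-neighbor queries) retains the same privacy guarantee.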
We propose an improved private count-mean-sketch data structure and show its applicability to differentially private contact tracing. Our proposed scheme, Diversified Averaging for Meta-estimation of Sketches (DAMS), provides a better trade-off between true positive rates and false positive rates while maintaining differential privacy (a widely accepted formal standard for privacy). We show its relevance to the social-good application of private digital contact tracing for COVID-19 and beyond. The scheme involves one-way locally differentially private uploads from the infected client devices to a server, which, after post-processing, obtains a private aggregated histogram of locations traversed by all the infected clients within a time period of interest. The private aggregated histogram is then downloaded by any querying client, which compares it with its own data on-device to determine whether it has come into close proximity with any infected client. We present empirical experiments that show a substantial improvement in performance for this particular application. We also prove theoretical variance-reduction guarantees for the estimates obtained through our scheme and verify these findings via experiments.
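A hedged, non-private sketch of the underlying count-mean-sketch data structure (toy sizes and hashing; the local-DP randomization and the diversified averaging that DAMS adds are omitted):

```python
import numpy as np

k, m = 16, 1024                       # hash functions x table width (toy sizes)
seeds = np.random.default_rng(0).integers(0, 2**31, size=k)
table = np.zeros((k, m))

def h(item, i):
    return hash((int(seeds[i]), item)) % m

def add(item):                        # e.g. a visited-location identifier
    for i in range(k):
        table[i, h(item, i)] += 1

def estimate(item, n_total):
    # Each row's counter holds the true count plus ~n_total/m hash
    # collisions on average; subtracting that and rescaling debiases the
    # counter, and averaging the k rows reduces variance.
    est = [(table[i, h(item, i)] - n_total / m) / (1 - 1 / m) for i in range(k)]
    return float(np.mean(est))

for loc in ['a', 'b', 'a', 'c', 'a']:
    add(loc)
print(round(estimate('a', n_total=5)))   # ~3
```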