Haroon Raja - Academia.edu

Papers by Haroon Raja

Tensor Regression Using Low-Rank and Sparse Tucker Decompositions

SIAM Journal on Mathematics of Data Science, 2020

Cloud K-SVD: A Collaborative Dictionary Learning Algorithm for Big, Distributed Data

This paper studies the problem of data-adaptive representations for big, distributed data. It is assumed that a number of geographically distributed, interconnected sites have massive local data and are interested in collaboratively learning a low-dimensional geometric structure underlying these data. In contrast to previous works on subspace-based data representations, this paper focuses on the geometric structure of a union of subspaces (UoS). In this regard, it proposes a distributed algorithm, termed cloud K-SVD, for collaborative learning of a UoS structure underlying distributed data of interest. The goal of cloud K-SVD is to learn a common overcomplete dictionary at each individual site such that every sample in the distributed data can be represented through a small number of atoms of the learned dictionary. Cloud K-SVD accomplishes this goal without requiring exchange of individual samples between sites. This makes it suitable for applications where sharing of raw da...

Through-the-wall radar imaging using a distributed Quasi-Newton method

2017 51st Asilomar Conference on Signals, Systems, and Computers, 2017

Recent developments in distributed dictionary learning

2017 51st Annual Conference on Information Sciences and Systems (CISS), 2017

Most of the research on dictionary learning has focused on developing algorithms under the assumption that data is available at a centralized location. But often the data is not available at a centralized location due to practical constraints like data aggregation costs, privacy concerns, etc. Using centralized dictionary learning algorithms may not be the optimal choice in such settings. This motivates the design of dictionary learning algorithms that consider the distributed nature of data as one of the problem variables. Just as in centralized settings, the distributed dictionary learning problem can be posed in more than one way depending on the problem setup. The most notable distinguishing features are the online versus batch nature of data and the representative versus discriminative nature of the dictionaries. In this paper, several distributed dictionary learning algorithms that are designed to tackle different problem setups are reviewed. One of these algorithms is cloud K-SVD, which s...

Scaling-Up Distributed Processing of Data Streams for Machine Learning

Proceedings of the IEEE, 2020

Emerging applications of machine learning in numerous areas, including online social networks, remote sensing, internet-of-things systems, smart grids, and more, involve continuous gathering of and learning from streams of data samples. Real-time incorporation of streaming data into the learned machine learning models is essential for improved inference in these applications. Further, these applications often involve data that are either inherently gathered at geographically distributed entities due to physical reasons (e.g., internet-of-things systems and smart grids) or that are intentionally distributed across multiple computing machines for memory, storage, computational, and/or privacy reasons. Training of machine learning models in this distributed, streaming setting requires solving stochastic optimization problems in a collaborative manner over communication links between the physical entities. When the streaming data rate is high compared to the processing capabilities of individual computing entities and/or the rate of the communications links, this poses a challenging question: how can one best leverage the incoming data for distributed training of machine learning models under constraints on computing capabilities and/or communications rate? A large body of research in distributed online optimization has emerged in recent decades to tackle this and related problems. This paper reviews recently developed methods that focus on large-scale distributed stochastic optimization in the compute- and bandwidth-limited regime, with an emphasis on convergence analysis that explicitly accounts for the mismatch between computation, communication and streaming rates, and that provides sufficient conditions for order-optimum convergence. In particular, it focuses on methods that solve: (i) distributed stochastic convex problems, and (ii) distributed principal component analysis, which is a nonconvex problem with geometric structure that permits global convergence. For such methods, the paper discusses recent advances in terms of distributed algorithmic designs when faced with high-rate streaming data. Further, it reviews theoretical guarantees underlying these methods, which show there exist regimes in which systems can learn from distributed processing of streaming data at order-optimal rates, nearly as fast as if all the data were processed at a single super-powerful machine.

Distributed Stochastic Algorithms for High-rate Streaming Principal Component Analysis

This paper considers the problem of estimating the principal eigenvector of a covariance matrix from independent and identically distributed data samples in streaming settings. The streaming rate of data in many contemporary applications can be high enough that a single processor cannot finish an iteration of existing methods for eigenvector estimation before a new sample arrives. This paper formulates and analyzes a distributed variant of the classical Krasulina's method (D-Krasulina) that can keep up with the high streaming rate of data by distributing the computational load across multiple processing nodes. The analysis shows that, under appropriate conditions, D-Krasulina converges to the principal eigenvector in an order-wise optimal manner; i.e., after receiving M samples across all nodes, its estimation error can be O(1/M). In order to reduce the network communication overhead, the paper also develops and analyzes a mini-batch extension of D-Krasulina, which is term...
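
As a rough illustration of the idea in this abstract, the sketch below applies the classical Krasulina update with the per-sample update directions averaged over several nodes in each round. The averaging scheme, the step-size schedule c / (t + t0), and all names (`krasulina_direction`, `distributed_krasulina`, `draw_sample`) are assumptions made for illustration; the abstract does not spell out the exact algorithm, so this should not be read as the paper's implementation.

```python
import math
import random


def krasulina_direction(w, x):
    """Krasulina update direction for one sample x at the current estimate w."""
    xw = sum(xi * wi for xi, wi in zip(x, w))   # inner product x^T w
    ww = sum(wi * wi for wi in w)               # squared norm ||w||^2
    # x (x^T w) - ((x^T w)^2 / ||w||^2) w, which is orthogonal to w
    return [xw * xi - (xw * xw / ww) * wi for xi, wi in zip(x, w)]


def distributed_krasulina(draw, dim, num_nodes, num_rounds, c=1.0, t0=10, seed=0):
    """Estimate the principal eigenvector from streaming samples.

    Assumption: in every round each of the num_nodes nodes receives one new
    sample, computes its local direction, and the directions are averaged
    exactly (e.g., by an aggregation step) before the shared estimate moves.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(dim)]
    for t in range(1, num_rounds + 1):
        directions = [krasulina_direction(w, draw(rng)) for _ in range(num_nodes)]
        avg = [sum(d[j] for d in directions) / num_nodes for j in range(dim)]
        step = c / (t + t0)                     # hypothetical decaying step size
        w = [wi + step * ai for wi, ai in zip(w, avg)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]              # unit-norm estimate


def draw_sample(rng):
    """Toy zero-mean data with covariance diag(4, 1, 0.25)."""
    return [2.0 * rng.gauss(0, 1), rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)]


estimate = distributed_krasulina(draw_sample, dim=3, num_nodes=4, num_rounds=2000)
```

For this toy covariance the principal eigenvector is the first coordinate axis, so `estimate` should align with it up to sign. With 4 nodes and 2000 rounds the sketch consumes 8000 samples while taking only 2000 sequential update steps, which is the sense in which the computational load is spread across nodes.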

Fast and Communication-efficient Distributed PCA

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

This paper focuses on principal components analysis (PCA), which involves estimating the principal subspace of a data covariance matrix, in the age of big data. Massively large datasets often require storage across multiple machines, which precludes the use of centralized PCA solutions. While a number of distributed solutions to the PCA problem have been proposed recently, convergence guarantees and/or communications overhead of these solutions remain a concern. With an eye towards communications efficiency, this paper introduces two variants of a distributed PCA algorithm termed distributed Sanger’s algorithm (DSA). Principal subspace estimation using both variants of DSA is communication efficient because of its one time-scale nature. In addition, theoretical guarantees are provided for the asymptotic convergence of basic DSA to the principal subspace, while its "accelerated" variant is numerically shown to have faster convergence than the state-of-the-art.

Detecting national political unrest on Twitter

2016 IEEE International Conference on Communications (ICC), 2016

Parametric dictionary learning for TWRI using distributed particle swarm optimization

2016 IEEE Radar Conference (RadarConf), 2016

This paper considers a distributed network of through-the-wall radars for accurate indoor scene reconstruction in the presence of multipath propagation. A sparsity-based method is proposed for eliminating ghost targets under imperfect knowledge of interior wall locations. Instead of aggregating and processing the observations at a central fusion station, joint scene reconstruction and estimation of interior wall locations is carried out in a distributed manner across the network. More specifically, an alternating minimization approach is utilized to solve the associated non-convex optimization problem, wherein the sparse scene is reconstructed using the recently proposed modified distributed orthogonal matching pursuit algorithm while the wall location estimates are obtained with a novel distributed particle swarm optimization algorithm (D-PSO) proposed in this paper. Existing literature on averaging consensus is leveraged to derive the D-PSO algorithm. The efficacy of the proposed approach is demonstrated using numerical simulation.

Learning overcomplete representations from distributed data: a brief review

Compressive Sensing V: From Diverse Modalities to Big Data Analytics, 2016

Most of the research on dictionary learning has focused on developing algorithms under the assumption that data is available at a centralized location. But often the data is not available at a centralized location due to practical constraints like data aggregation costs, privacy concerns, etc. Using centralized dictionary learning algorithms may not be the optimal choice in such settings. This motivates the design of dictionary learning algorithms that consider the distributed nature of data as one of the problem variables. Just as in centralized settings, the distributed dictionary learning problem can be posed in more than one way depending on the problem setup. The most notable distinguishing features are the online versus batch nature of data and the representative versus discriminative nature of the dictionaries. In this paper, several distributed dictionary learning algorithms that are designed to tackle different problem setups are reviewed. One of these algorithms is cloud K-SVD, which solves the dictionary learning problem for batch data in distributed settings. One distinguishing feature of cloud K-SVD is that it has been shown to converge to its centralized counterpart, namely, the K-SVD solution. On the other hand, no such guarantees are provided for other distributed dictionary learning algorithms. Convergence of cloud K-SVD to the centralized K-SVD solution means that problems solvable by K-SVD in centralized settings can now be solved in distributed settings with similar performance. Finally, cloud K-SVD is used as an example to show the advantages that are attainable by deploying distributed dictionary learning algorithms for real-world distributed datasets.

Rate-distortion optimized transcoder selection for multimedia transmission in heterogeneous networks

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012

In this paper we propose a solution for the selection of appropriate transcoding nodes in a network operating in ad-hoc mode. The heterogeneity present in today's networked devices necessitates different quality of video for different end users. One possible solution for this heterogeneity is to transcode the video stream as per user demand. In this work, we define significant parameters to facilitate the decision on the selection of transcoding nodes within a wireless access network. We formulate the problem as a rate-distortion optimization to balance the conflicting objectives of high quality and minimum time of delivery to an end user. Unlike past works, which have focused on developing efficient distributed transcoders, our aim is to come up with methods for placement of these parallel transcoding nodes in a heterogeneous network, keeping in view the constraints of timely delivery of video and minimal distortion.

Throughput enhancement by cross-layer header compression in WLANs

2010 16th Asia-Pacific Conference on Communications (APCC), 2010

A major limiting factor in increasing the throughput of wireless networks has been the bottleneck of prohibitive signaling overhead. A number of header compression schemes have been proposed to solve this particular problem. These, however, come with their set ...

A convergence analysis of distributed dictionary learning based on the K-SVD algorithm

2015 IEEE International Symposium on Information Theory (ISIT), 2015

Cloud K-SVD: A Collaborative Dictionary Learning Algorithm for Big, Distributed Data

IEEE Transactions on Signal Processing, 2015

Dictionary learning based nonlinear classifier training from distributed data

2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2014

Cloud K-SVD: Computing data-adaptive representations in the cloud

2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2013

Performance Analysis of WiMAX Best Effort and ertPS Service Classes for Video Transmission

Lecture Notes in Computer Science, 2012

To support different types of data such as HTTP, real-time audio and video, VoIP, and FTP, the WiMAX system provides various service classes. In this work, we analyze the performance when multimedia content is transmitted over a WiMAX network. Due to the stringent delay requirement of real-time multimedia data, a separate class, i.e., rtPS, is allocated for it. Our objective is thus to find out how much advantage we gain by transmitting multimedia over this separate class. This requires a thorough analysis that considers all the scenarios. Our contribution in this paper is to build an initial framework for answering this question. The Network Simulator (ns-2), a popular tool for the simulation of computer networks, has been used to produce the results. Standard-compliant implementations have been used to validate the results.

Detecting National Political Unrest on Twitter

The popular uprisings in a number of countries in the Middle East and North Africa in the Spring of 2011 were broadcast live and enabled by local populations' access to social networking services such as Twitter and Facebook. The goal of this paper is to study the characteristics of the information flow of these broadcasts on Twitter. We have used language-independent features of Twitter traffic to identify differences in information flows on Twitter mentioning countries experiencing some form of unrest, compared to traffic mentioning countries with peaceful political situations. We used these features to identify countries with politically unstable situations. For empirical analysis, we collected several data sets of countries that were experiencing political unrest, as well as a set of countries in a control group that were not subject to such socio-political conditions. Several different methods are used to model the flow of information between Twitter users in data sets as graphs, called information cascades. By using the dynamic properties of information cascades, naïve Bayes and SVM classifiers both achieve true positive rates of 100%, with false positive rates of 3% and 0%, respectively.
