Ramamohanarao Kotagiri - Academia.edu
Papers by Ramamohanarao Kotagiri
Journal of Biomedical Informatics, 2017
IEEE Conference Proceedings, 2016
Lecture Notes in Computer Science, Jul 15, 2009
Providing fault tolerance for message-passing parallel applications in a distributed environment is a rule rather than an exception. A node failure can cause the whole computation to stop, and it has to be restarted from the beginning if no fault tolerance is available. However, introducing fault tolerance adds overhead that reduces the achievable speedup. In this paper, we introduce a new technique called replication with cross-over packets to improve reliability and fault tolerance over Very Large Scale Grids (VLSG). This technique has the two-pronged effect of avoiding both a single point of failure and a single link of failure. We incorporate this new technique into the L-BSP model and show the possible speedup of a parallel process. We also derive the achievable speedup for some fundamental parallel algorithms using this technique.
Proceedings of the ... AAAI Conference on Artificial Intelligence, Feb 13, 2017
k-fold cross-validation is commonly used to evaluate the effectiveness of SVMs with the selected hyper-parameters. SVM k-fold cross-validation is known to be expensive, since it requires training k SVMs. However, little work has explored reusing the h-th SVM when training the (h+1)-th SVM to improve the efficiency of k-fold cross-validation. In this paper, we propose three algorithms that reuse the h-th SVM to make training the (h+1)-th SVM more efficient. Our key idea is to efficiently identify the support vectors of the next SVM, and to accurately estimate their associated weights (also called alpha values), by using the previous SVM. Our experimental results show that our algorithms are several times faster than k-fold cross-validation that does not make use of the previously trained SVM. Moreover, our algorithms produce the same results (hence the same accuracy) as k-fold cross-validation that does not make use of the previously trained SVM.
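The motivation for reusing the previous SVM can be made concrete by counting how much training data two consecutive rounds share. The sketch below is not one of the paper's three algorithms; it only builds plain k-fold splits and measures the overlap. The function names (`k_fold_indices`, `training_set`, `shared_fraction`) and the contiguous-fold layout are assumptions made for illustration.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def training_set(folds, h):
    """Training indices for round h: every fold except the held-out fold h."""
    return {i for j, fold in enumerate(folds) if j != h for i in fold}


def shared_fraction(n, k, h):
    """Fraction of round h+1's training set that round h already trained on."""
    folds = k_fold_indices(n, k)
    current, nxt = training_set(folds, h), training_set(folds, h + 1)
    return len(current & nxt) / len(nxt)


if __name__ == "__main__":
    # With equal-sized folds the shared fraction is (k - 2) / (k - 1).
    print(shared_fraction(n=100, k=10, h=0))
```

With equal-sized folds, rounds h and h+1 differ only in the two folds that swap roles, so a fraction (k-2)/(k-1) of the next training set was already seen by the previous model. That overlap is what makes warm-starting from the previous SVM plausible.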
arXiv (Cornell University), Jun 29, 2011
A large spectrum of applications, such as location-based services and environmental monitoring, demand efficient query processing on uncertain databases. In this paper, we propose the probabilistic Voronoi diagram (PVD) for processing moving nearest neighbor queries on uncertain data, namely probabilistic moving nearest neighbor (PMNN) queries. A PMNN query continuously finds the most probable nearest neighbor of a moving query point. To process PMNN queries efficiently, we provide two techniques: a pre-computation approach and an incremental approach. In the pre-computation approach, we develop an algorithm to efficiently evaluate PMNN queries based on the pre-computed PVD for the entire data set. In the incremental approach, we propose an incremental probabilistic safe-region-based technique that does not require pre-computing the whole PVD to answer the PMNN query. In this incremental approach, we exploit the knowledge of a known region to compute a lower bound on the probability of an object being the nearest neighbor. Experimental results show that our approaches outperform a sampling-based approach by orders of magnitude in terms of I/O, query processing time, and communication overheads.
IEEE Conference Proceedings, 2020
IEEE Transactions on Knowledge and Data Engineering, 2019
Recently, multi-view features have significantly promoted the performance of image re-ranking by providing complementary image descriptions. Without loss of generality, in multi-view re-ranking, multiple heterogeneous visual features of high dimensionality are projected onto a low-dimensional subspace, and the resulting latent representation can then be used for the subsequent similarity-based ranking. Albeit effective, this standard mechanism underplays the intrinsic structure underlying the latent subspace and does not take into account the substantial noise in the original multi-view feature spaces. In this paper, we propose a robust multi-view feature learning strategy for accurate image re-ranking. Due to the dramatic variability in visual appearance across different target images, it is necessary to uncover the shared components underlying those query-related instances that are visually dissimilar in order to improve the re-ranking accuracy. Consequently, it is reasonable to assume that the latent subspace enjoys the low-rank property, so that subspace recovery can be achieved via low-rank modeling. In addition, real-world data are usually partially contaminated, and the sample-dependent data noise must be modeled appropriately. To this end, we employ an ℓ2,1-norm-based sparsity constraint to model the sample-specific mapping noise and enhance model robustness. In order to produce discriminative representations, we encode a similarity-preserving term in our multi-view embedding framework. As a result, sample separability is maximally maintained in the latent subspace with sufficient discriminative power. Extensive experimental evaluations on public landmark benchmarks reveal that our approach achieves performance superior to the state of the art, which demonstrates the efficacy of the proposed method.
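For reference, the sparsity constraint named in the abstract can be written out as follows. The abstract does not give the formula, so this is the standard definition under an assumed convention: the noise matrix is called E (a hypothetical symbol), has d feature rows and n sample columns, and e_j denotes its j-th column.

```latex
\|E\|_{2,1} \;=\; \sum_{j=1}^{n} \|e_j\|_2
            \;=\; \sum_{j=1}^{n} \sqrt{\sum_{i=1}^{d} E_{ij}^{2}},
\qquad E \in \mathbb{R}^{d \times n}.
```

Because the outer sum is an ℓ1-type penalty over per-column ℓ2 norms, minimizing it tends to drive entire columns to zero, which is why it suits noise that affects some samples and not others. If samples are stored as rows instead, the same definition applies to the transpose.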
Lecture Notes in Computer Science, 2007
arXiv (Cornell University), Feb 16, 2015
Robustness to particular transformations is a desired property in many classification tasks. For example, in image classification tasks the predictions should be invariant to variations in location, size, angle, brightness, etc. Standard neural networks do not have this property. We propose an extension of the backpropagation algorithm that trains a neural network to be robust to variations and noise in the feature vector. This extension consists of an additional forward pass performed on the derivatives that are obtained at the end of the backward pass. We perform a theoretical and experimental comparison with standard BP and the two most similar approaches (Tangent BP and Adversarial Training). As a result, we show how both of them can be sped up by approximately 20%. We evaluate our algorithm on a collection of datasets for image classification, confirm its theoretically established properties, and demonstrate an improvement in classification accuracy over the competing algorithms in the majority of cases.
2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2018
In this paper, we present a novel parallel implementation for training Gradient Boosting Decision Trees (GBDTs) on Graphics Processing Units (GPUs). Thanks to the wide use of the open-source XGBoost library, GBDTs have become very popular in recent years and have won many awards in machine learning and data mining competitions. Although GPUs have demonstrated their success in accelerating many machine learning applications, developing a GPU-based GBDT algorithm poses a series of key challenges, including irregular memory accesses, many small sorting operations, and varying data-parallel granularities in tree construction. To tackle these challenges on GPUs, we propose various novel techniques, including run-length encoding compression, dynamic allocation of thread/block workloads, and reuse of intermediate training results for efficient gradient computation. Our experimental results show that our algorithm, named GPU-GBDT, is often 10 to 20 times faster than the sequential version of XGBoost, and achieves a 1.5 to 2 times speedup over a 40-threaded XGBoost running on a relatively high-end workstation with 20 CPU cores. Moreover, GPU-GBDT outperforms its CPU counterpart by 2 to 3 times in terms of performance-price ratio.
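Run-length encoding, one of the techniques listed above, can be illustrated on the CPU in a few lines. This is a generic sketch of the compression scheme only, not the GPU-GBDT implementation; the function names and the example column of sorted feature values are assumptions for illustration.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            # Same value as the current run: extend it by one.
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            # New value: start a fresh run of length one.
            runs.append((v, 1))
    return runs


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [v for v, count in runs for _ in range(count)]


if __name__ == "__main__":
    sorted_feature_column = [0.5, 0.5, 0.5, 1.2, 1.2, 3.0]
    encoded = rle_encode(sorted_feature_column)
    print(encoded)
    print(rle_decode(encoded) == sorted_feature_column)
```

The scheme pays off when a sorted feature column contains many repeated values, since each run is stored once regardless of its length.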
International Journal of Microsimulation, 2014
The paper presents the main characteristics of BETAMOD, a static microsimulation model that reproduces the Italian personal income tax (IRPEF), as well as local income taxes, namely the regional and municipal surtaxes, building on a detailed reconstruction of tax legislation. With respect to the vast majority of existing tax microsimulation models, the peculiarities of BETAMOD concern two aspects: the inclusion of a detailed set of tax expenditures, and the estimation of individual-specific tax evasion rates, which account for the total individual income level, its composition in terms of income sources, and the geographical area of residence.
Cognitive Computation, 2018
Background and Introduction: Tactile recognition enables robots to identify target objects or environments from tactile sensory readings. Recent advances in deep learning and biological tactile sensing inspire us to propose an end-to-end architecture, ROTConvPCE-mv, that performs tactile recognition using residual orthogonal tiling and a pyramid convolution ensemble. Methods: Our approach uses stacks of raw frames and tactile flow as dual input, and incorporates the strength of multi-layer OTConvs (orthogonal tiling convolutions) organized in a residual learning paradigm. We empirically demonstrate that OTConvs have adjustable invariance to different input transformations such as translation, rotation, and scaling. To effectively capture multi-scale global context, a pyramid convolution structure is attached to the concatenated output of two residual OTConv pathways. Results and Conclusions: Extensive experimental evaluations show that ROTConvPCE-mv outperforms several state-of-the-art methods by a large margin.
Eighth International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2007), 2007
Providing fault tolerance for message-passing parallel applications in a distributed environment is a rule rather than an exception. A node failure can cause the whole computation to stop, and it has to be restarted from the beginning if no fault tolerance is available. However, introducing fault tolerance adds overhead that reduces the achievable speedup. In this paper, we introduce a new technique called replication with cross-over packets to improve reliability and fault tolerance over Very Large Scale Grids (VLSG). This technique has the two-pronged effect of avoiding both a single point of failure and a single link of failure. We incorporate this new technique into the L-BSP model and show the possible speedup of a parallel process. We also derive the achievable speedup for some fundamental parallel algorithms using this technique.
IEEE Transactions on Knowledge and Data Engineering, 2011
Web link analysis methods such as PageRank, HITS, and SALSA have focused on obtaining the global popularity or authority of the set of Web pages in question. Although global popularity is useful for general queries, we find that it is less useful for queries about which the global population has less knowledge. By examining the many different communities that appear within a Web page graph, we are able to compute popularity or authority with respect to a specific community. Multiresolution popularity lists allow us to observe the popularity of Web pages with respect to communities at different resolutions within the Web. Multiresolution popularity lists have been shown to have high potential when compared against PageRank. In this paper, we generalize the multiresolution popularity analysis to use any form of Web page link relations. We provide results for both the PageRank relations and the In-degree relations. By utilizing the multiresolution popularity lists, we achieve a 13 percent and 25 percent improvement in mean average precision over In-degree and PageRank, respectively.
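The two global baselines named in the abstract, In-degree and PageRank, can be sketched as follows. This shows only the global scores that the paper compares against; it does not implement the multiresolution community analysis. The adjacency-dictionary format, the function names, the damping factor of 0.85, and the toy graph are all assumptions for illustration, and every link target is assumed to appear as a key.

```python
def in_degree(graph):
    """Count incoming links for each page."""
    scores = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            scores[target] += 1
    return scores


def pagerank(graph, damping=0.85, iterations=100):
    """Power iteration; rank held by pages without out-links is spread uniformly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not graph[u])
        # Teleportation share plus an equal share of the dangling rank.
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            for v in graph[u]:
                new_rank[v] += damping * rank[u] / len(graph[u])
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page graph: page -> pages it links to.
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    print(in_degree(web))
    print(pagerank(web))
```

Both scores are computed over the whole graph, which is the "global popularity" the abstract contrasts with community-specific popularity.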
International Journal of Intelligent Systems, 2006
Future Generation Computer Systems, 2018
Highlights
• An anomaly detection framework for scientific workflows is presented.
• HTM is used to detect anomalies on a stream of resource consumption time series data.
• The HTM-based model is unsupervised and learns incrementally.
• The framework is platform-agnostic and can be deployed on different infrastructures.
• Detected anomalies can trigger scheduling and resource provisioning actions.
Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12, 2012
In large and complex graphs of social, chemical/biological, or other relations, frequent substructures are commonly shared by different graphs or by graphs evolving through different time periods. Tensors are natural representations of these complex time-evolving graph data. A factorization of a tensor provides a high-quality low-rank compact basis for each dimension of the tensor, which facilitates the interpretation of frequent substructures of the original graphs. However, the high computational cost of tensor factorization makes it infeasible for conventional tensor factorization methods to handle large graphs that evolve frequently with time. To address this problem, in this paper we propose a novel iterative tensor factorization (ITF) method whose time complexity is linear in the cardinalities of all dimensions of a tensor. This low time complexity means that when using tensors to represent dynamic graphs, the computational cost of ITF is linear in the size (number of edges/vertices) of graphs and is also linear in the number of time periods over which the graph evolves. More importantly, an error estimation of ITF suggests that its factorization correctness is comparable to that of the standard factorization method. We empirically evaluate our method on publication networks and chemical compound graphs, and demonstrate that ITF is an order of magnitude faster than the conventional method and at the same time preserves factorization quality. To the best of our knowledge, this research is the first work that uses important frequent substructures to speed up tensor factorizations for mining dynamic graphs.
NOMS 2002. IEEE/IFIP Network Operations and Management Symposium. 'Management Solutions for the New Communications World' (Cat. No.02CH37327)
... However, in our implementation we have used a more sophisticated approach that takes into consideration how unusual a ... The second major source of packet trace data that we have used is the Auckland II dataset from the ...