permGPU: Using graphics processing units in RNA microarray association studies

CARAT-GxG: CUDA-Accelerated Regression Analysis Toolkit for Large-Scale Gene-Gene Interaction with GPU Computing System

Cancer informatics, 2014

In genome-wide association studies (GWAS), regression analysis has been most commonly used to establish an association between a phenotype and genetic variants, such as single nucleotide polymorphisms (SNPs). However, most applications of regression analysis have been restricted to the investigation of a single marker because of the large computational burden. Thus, there have been limited applications of regression analysis to multiple SNPs, including gene-gene interaction (GGI), in large-scale GWAS data. To overcome this limitation, we propose CARAT-GxG, a GPU computing system-oriented toolkit for performing regression analysis with GGI using CUDA (Compute Unified Device Architecture). Compared to other methods, CARAT-GxG achieved an almost 700-fold increase in execution speed and delivered highly reliable results through our GPU-specific optimization techniques. In addition, it was possible to achieve almost-linear speed acceleration with the application of a GPU computing system, which is...
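The regression with a gene-gene interaction term mentioned in this abstract can be sketched as follows. This is a minimal, CPU-only Python illustration and not CARAT-GxG's CUDA implementation: the additive 0/1/2 genotype coding, the ordinary least squares fit via normal equations, and all function names are assumptions made for the example.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_interaction(y, g1, g2):
    """Fit y = b0 + b1*g1 + b2*g2 + b3*g1*g2 by least squares.

    Returns [b0, b1, b2, b3]; b3 is the interaction coefficient.
    """
    rows = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def scan_pairs(y, genotypes):
    """Fit the interaction model for every SNP pair (hypothetical helper).

    genotypes maps a SNP name to its list of 0/1/2 genotype codes.
    """
    return {
        (a, b): fit_interaction(y, genotypes[a], genotypes[b])
        for a, b in combinations(sorted(genotypes), 2)
    }
```

Because `scan_pairs` fits one model per pair, the number of fits grows quadratically with the number of SNPs, which is the computational burden the abstract refers to.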

DSIMBench: A Benchmark for Microarray Data Using R

Lecture Notes in Computer Science, 2014

Parallel computing in R has been widely used to analyse microarray data. We have seen various applications using various data distribution and calculation approaches. Newer data storage systems, such as MySQL Cluster and HBase, have been proposed for R data storage, while parallel computation frameworks, including MPI and MapReduce, have been applied to R computation. Thus, it is difficult to understand which analysis workflows and toolkits are suited to a specific environment. In this paper we propose DSIMBench, a benchmark containing two classic microarray analysis functions with eight different parallel R workflows, and evaluate the benchmark in the IC Cloud testbed platform.

AMDA: an R package for the automated microarray data analysis

BMC bioinformatics, 2006

Microarrays are routinely used to assess mRNA transcript levels on a genome-wide scale. A large number of microarray datasets are now available in several databases, and new experiments are constantly being performed. In spite of this, few and limited tools exist for quickly and easily analyzing the results. Microarray analysis can be challenging for researchers without the necessary training, and it can be time-consuming for service providers with many users. To address these problems we have developed automated microarray data analysis (AMDA) software, which provides scientists with an easy and integrated system for the analysis of Affymetrix microarray experiments. AMDA is free and is available as an R package. It is based on the Bioconductor project, which provides a number of powerful bioinformatics and microarray analysis tools. This automated pipeline integrates different functions available in the R and Bioconductor projects with newly developed functions. AMDA covers ...

GPU acceleration for statistical gene classification

2010

The use of bioinformatic tools in routine clinical diagnostics still faces a number of issues. The more complex and advanced bioinformatic tools become, the more performance is required of the computing platforms. Unfortunately, the cost of parallel computing platforms is usually prohibitive for both public and small private medical practices. This paper presents a successful experience in using the parallel processing capabilities of Graphics Processing Units (GPUs) to speed up bioinformatic tasks such as statistical classification of gene expression profiles. The results show that using open source CUDA programming libraries makes it possible to obtain a significant increase in performance and thereby narrow the gap between advanced bioinformatic tools and real medical practice.

AMDA 2.13: A major update for automated cross-platform microarray data analysis

BioTechniques, 2012

Microarray platforms require analytical pipelines with modules for data pre-processing including data normalization, statistical analysis for identification of differentially expressed genes, cluster analysis, and functional annotation. We previously developed the Automated Microarray Data Analysis (AMDA, version 2.3.5) pipeline to process Affymetrix 3' IVT GeneChips. The availability of newer technologies that demand open-source tools for microarray data analysis has impelled us to develop an updated multi-platform version, AMDA 2.13. It includes additional quality control metrics, annotation-driven (annotation grade of Affymetrix NetAffx) and signal-driven (Inter-Quartile Range) gene filtering, and approaches to experimental design. To enhance understanding of biological data, differentially expressed genes have been mapped into KEGG pathways. Finally, a more stable and user-friendly interface was designed to integrate the requirements for different platforms. AMDA 2.13 allows...
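The signal-driven (Inter-Quartile Range) gene filtering mentioned in this abstract can be illustrated with a short Python sketch. It is not AMDA's R code: the function names, the data layout (gene name mapped to expression values across samples), and the cutoff passed as `min_iqr` are hypothetical choices made for the example.

```python
from statistics import quantiles


def iqr(values):
    """Inter-quartile range (Q3 - Q1) of one gene's expression values."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def filter_by_iqr(expression, min_iqr):
    """Keep only genes whose IQR across samples is at least min_iqr.

    expression maps a gene name to its expression values across samples;
    min_iqr is a user-chosen cutoff (an assumption, not an AMDA default).
    """
    return {
        gene: values
        for gene, values in expression.items()
        if iqr(values) >= min_iqr
    }
```

Genes that barely vary across samples carry little information for differential expression, so a filter of this kind reduces the number of tests performed downstream.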

Bioinformatics Tools Enabling U-Statistics for Microarrays

2006

It is rare that a single gene is sufficient to represent all aspects of genomic activity. Similarly, most common diseases cannot be explained by a mutation at a single locus. Since complex systems tend to be neither linear nor hierarchical in nature, but to have correlated components of unknown relative importance, the assumptions of traditional (parametric) multivariate statistical methods can rarely be justified on theoretical grounds. Empirical "validation" is not only problematic, but also time-consuming. Here we demonstrate how bioinformatics tools, ranging from spreadsheets to grids, can enable u-statistics as a non-parametric alternative for scoring multivariate ordinal data. Applications are shown to improve assessment of genetic risk factors, quality control of microarrays and signal value estimation, scoring genomic profiles that best correlate with complex risk factors (cardiovascular diseases), and complex responses to an intervention (treatment of psoriasis).
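One common way to score multivariate ordinal data with u-statistics is to count, for each profile, how many other profiles it dominates and how many dominate it under a partial ordering. The Python sketch below assumes that formulation; the abstract does not spell out the exact scoring rule, so the dominance definition and the function names are assumptions made for the example.

```python
def dominates(a, b):
    """True if profile a is >= b in every variable and > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def u_scores(profiles):
    """Score each profile as (# profiles it dominates) - (# that dominate it).

    Pairs that are neither dominated nor dominating (incomparable or tied)
    contribute nothing to either count.
    """
    scores = []
    for a in profiles:
        below = sum(dominates(a, b) for b in profiles)
        above = sum(dominates(b, a) for b in profiles)
        scores.append(below - above)
    return scores
```

The scores depend only on the ordering within each variable, so no weights or linear combination of the variables need to be chosen.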

Comparison and meta-analysis of microarray data: from the bench to the computer desk

Trends in Genetics, 2003

The upcoming availability of public microarray repositories and of large compendia of gene expression information opens up a new realm of possibilities for microarray data analysis. An essential challenge is the efficient integration of microarray data generated by different research groups on different array platforms. This review focuses on the problems associated with this integration, which are: (1) the efficient access to and exchange of microarray data; (2) the validation and comparison of data from different platforms (cDNA and short and long oligonucleotides); and (3) the integrated statistical analysis of multiple data sets.

HDBStat!: a platform-independent software suite for statistical analysis of high dimensional biology data

BMC bioinformatics, 2005

Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for analysis of high dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists with a flexible and easy-to-use interface to analyze complex microarray data using a variety of methods for data preprocessing, quality control analysis and hypothesis testing. Results generated from data preprocessing, quality control analysis and hypothesis testing methods are output in the form of Excel CSV tables, graphs and an HTML report summarizing the data analysis. HDBStat! is platform-independent software that is freely available to academic institutions and non-profit organization...

CIDA: An integrated software for the design, characterisation and global comparison of microarrays

Journal of Integrative Bioinformatics

Microarray technology has had a significant impact on the field of systems biology, which investigates the biological systems that regulate human life. Identifying genes of significant interest within any given disease on an individual basis is time-consuming and inefficient given the complexity of the human genome. Thus, the genetic profiling of the entire human genome in a single experiment has resulted in microarray technology becoming a widely used experimental tool. However, without tools for several aspects of microarray data analysis, the technology is limited. To date, no tool has been developed that allows the integration of numerous microarray results from different research laboratories as well as the design of customised gene chips in a cost-effective manner. In light of this, we have designed the first integrated and automated software called Chip Integration, Design and Annotation (CIDA) for the cross comparison, design a...