Deep convolutional neural networks for annotating gene expression patterns in the mouse brain

Convolutional Neural Network Approach to Predict Tumor Samples Using Gene Expression Data

2021

Cancer threatens millions of people each year, and its early diagnosis remains a challenging task. Early diagnosis is one of the main ways to tackle the disease and lower the mortality rate. Advancements in deep learning approaches and the availability of biological data offer applications that can facilitate the diagnosis and characterization of cancer. Here, we aimed to provide a new perspective on cancer diagnosis using a deep learning approach on gene expression data. In this study, RNA-Seq data from patients with approximately 30 different types of cancer in The Cancer Genome Atlas (TCGA) study and normal tissue RNA-Seq data from GTEx were used. The input data for training were transformed to RGB format, and training was carried out with a Convolutional Neural Network (CNN). The trained model predicts cancer with 97% accuracy using gene expression data. In conclusion, our study shows that the deep learning approach and biological data have great potential in the diagnosis and identification of tumor samples.

Deep Learning for Multi-Tissue Cancer Classification of Gene Expressions (GeneXNet)

IEEE Access

Cancer classification using gene expressions is extremely challenging given the complexity and high dimensionality of the data. Current classification methods typically rely on samples collected from a single tissue type and perform a prerequisite of gene feature selection to avoid processing the full set of genes. These methods fall short in taking advantage of genome-wide next generation sequencing technologies which provide a snapshot of the whole transcriptome rather than a predetermined subset of genes. We propose a deep learning framework for cancer diagnosis by developing a multi-tissue cancer classifier based on whole-transcriptome gene expressions collected from multiple tumor types covering multiple organ sites. We introduce a new Convolutional Neural Network architecture called Gene eXpression Network (GeneXNet), which is specifically designed to address the complex nature of gene expressions. Our proposed GeneXNet provides capabilities of detecting genetic alterations driving cancer progression by learning genomic signatures across multiple tissue types without requiring the prerequisite of gene feature selection. Our model achieves 98.9% classification accuracy on human samples representing 33 different cancer tumor types across 26 organ sites. We demonstrate how our model can be used for transfer learning to build classifiers for tumors lacking sufficient samples to be trained independently. We introduce visualization procedures to provide biological insight on how our model is performing classification across multiple tumors.

Learning & Visualizing Genomic Signatures of Cancer Tumors using Deep Neural Networks

2020

Deep learning for medical diagnosis using genomics is extremely challenging given the high dimensionality of the data and lack of sufficient patient samples. Another challenge is that deep models are conceived as black boxes without much interpretation on how these complex models make predictions. We propose a deep transfer learning framework for cancer diagnosis with the capability of learning the sequence of DNA and RNA in cancer cells and identifying genetic changes that alter cell behavior and cause uncontrollable growth and malignancy. We design a new Convolutional Neural Network architecture with capabilities of learning the genomic signatures of whole-transcriptome gene expressions collected from multiple tumor types covering multiple organ sites. We demonstrate how our trained model can function as a comprehensive multi-tissue cancer classifier by using transfer learning to build classifiers for tumors lacking sufficient human samples to be trained independently. We introduce visualization procedures to provide more biological insight on how our model is learning genomic signatures and accurately making predictions across multiple cancer tissue types.

Learning & Visualizing Genomic Signatures of Cancer Tumors using Deep Neural Networks

2020 International Joint Conference on Neural Networks (IJCNN)

Gene Expression Classification Based on Deep Learning

IEEE, 2020

One of the most significant research topics in bioinformatics is the classification of gene expression. Gene expression data commonly have a large number of features and a small number of samples. The datasets also differ greatly from one another; this variation, together with the large number of features, makes classifying gene expression data challenging. In this study, we assessed the classification accuracy of several powerful deep learning algorithms, namely a Deep Neural Network (DNN), a Recurrent Neural Network, a Convolutional Neural Network, and an improved DNN with a preprocessing technique. The DNN was improved by adding Dropout, which overcame the overfitting problem. Our results showed that the proposed improved DNN outperforms the other algorithms on all datasets used.
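As a rough illustration of the Dropout idea mentioned in this abstract, the following is a minimal NumPy sketch of "inverted" dropout applied to a layer's activations. It is not the authors' implementation; the function name `dropout`, the 0.5 rate, and the array of ones standing in for hidden activations are all assumptions chosen for illustration.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest.

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged,
    so no extra scaling is needed at inference time.
    """
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    # Boolean mask: True for units that are kept in this forward pass.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
hidden = np.ones((4, 1000))  # hypothetical batch of hidden-layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)
```

With few samples and many gene features, randomly silencing units during training discourages the network from memorizing individual samples, which is the overfitting problem the abstract refers to.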

Supervised and Unsupervised End-to-End Deep Learning for Gene Ontology Classification of Neural In Situ Hybridization Images

Entropy, 2019

In recent years, large datasets of high-resolution mammalian neural images have become available, which has prompted active research on the analysis of gene expression data. Traditional image processing methods are typically applied for learning functional representations of genes, based on their expressions in these brain images. In this paper, we describe a novel end-to-end deep learning-based method for generating compact, translation-invariant representations of in situ hybridization (ISH) images. In contrast to traditional image processing methods, our method relies instead on deep convolutional denoising autoencoders (CDAE) for processing raw pixel inputs and generating the desired compact image representations. We provide an in-depth description of our deep learning-based approach, and present extensive experimental results, demonstrating that representations extracted by CDAE can help learn features of functional gene ontology categories for their classificat...

Identification of Differentially Expressed Genes Using Deep Learning in Bioinformatics

Advances in Intelligent Systems and Computing, 2020

Bioinformatics data can be used for the prediction of diseases in different organisms. Microarray technology produces a 2D representation of genomic data characterized by an enormous number of genes across a handful of samples. The analysis of this data involves extraction or selection of the relevant genes from this vast amount of irrelevant and redundant data. These genes can then be used to predict classes of unknown samples. In this work, we have implemented two popular deep learning segmentation architectures, namely, SegNet and U-Net. These techniques have been applied to the microarray dataset of colon cancer (typically containing tumour and normal tissue samples) to extract the responsible genes. The performance of the reduced set formed from these genes has been compared across different classifiers using different existing methods of feature selection. It is found that both deep learning based approaches outperform the other methods. Lastly, the biological significance of the genes has also been verified using ontological tools, and the results are significant.

Deep Learning Based Tumor Type Classification Using Gene Expression Data

Differential analysis occupies the most significant portion of standard RNA-Seq analysis practice. However, the conventional method matches tumor samples to normal samples from the same tumor type. The output of such a method fails to differentiate tumor types because it lacks knowledge from other tumor types. The Pan-Cancer Atlas provides abundant information on 33 prevalent tumor types, which can be used as prior knowledge to generate tumor-specific biomarkers. In this paper, we embedded the high-dimensional RNA-Seq data into 2-D images and used a convolutional neural network to classify the 33 tumor types. The final accuracy was 95.59%, higher than that of another paper applying the GA/KNN method to the same dataset. Based on the idea of Guided Grad-CAM, we generated a significance heat-map over all genes for each class. By doing functional analysis on the genes with high intensities in the heat-maps, we validated ...

Deep Learning-Based Pan-Cancer Classification Model Reveals Tissue-of-Origin Specific Gene Expression Signatures

Cancers, 2022

Cancer tissue-of-origin specific biomarkers are needed for effective diagnosis, monitoring, and treatment of cancers. In this study, we analyzed transcriptomics data from 37 cancer types provided by The Cancer Genome Atlas (TCGA) to identify cancer tissue-of-origin specific gene expression signatures. We developed a deep neural network model to classify cancers based on gene expression data. The model achieved a predictive accuracy of >97% across cancer types indicating the presence of distinct cancer tissue-of-origin specific gene expression signatures. We interpreted the model using Shapley additive explanations to identify specific gene signatures that significantly contributed to cancer-type classification. We evaluated the model and the validity of gene signatures using an independent test data set from the International Cancer Genome Consortium. In conclusion, we present a robust neural network model for accurate classification of cancers based on gene expression data and a...
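To make the Shapley-based interpretation step more concrete, here is a toy sketch that computes exact Shapley values by enumerating feature subsets. It does not use the SHAP library and is not the authors' pipeline: the three-feature linear `toy_model`, its `weights`, the sample `x`, and the `baseline` are hypothetical, and exhaustive enumeration is only feasible for a handful of features, not for a transcriptome.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


weights = [2.0, -1.0, 0.5]  # hypothetical per-gene weights


def toy_model(features):
    return sum(w * f for w, f in zip(weights, features))


x = [1.0, 3.0, 4.0]         # hypothetical expression values for one sample
baseline = [0.0, 1.0, 2.0]  # hypothetical reference expression values
phi = shapley_values(toy_model, x, baseline)
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction at `x` and at `baseline`. That additivity is what allows per-gene contributions to be ranked into a signature.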

Artificial image objects for classification of breast cancer biomarkers with transcriptome sequencing data and convolutional neural network algorithms

Breast Cancer Research, 2021

Background: Transcriptome sequencing has been broadly available in clinical studies. However, it remains a challenge to utilize these data effectively for clinical applications due to the high dimension of the data and the highly correlated expression between individual genes. Methods: We proposed a method to transform RNA sequencing data into artificial image objects (AIOs) and applied convolutional neural network (CNN) algorithms to classify these AIOs. With the AIO technique, we considered each gene as a pixel in an image and its expression level as pixel intensity. Using the GSE96058 (n = 2976), GSE81538 (n = 405), and GSE163882 (n = 222) datasets, we created AIOs for the subjects and designed CNN models to classify biomarker Ki67 and Nottingham histologic grade (NHG). Results: With fivefold cross-validation, we accomplished a classification accuracy and AUC of 0.821 ± 0.023 and 0.891 ± 0.021 for Ki67 status. For NHG, the weighted average of categorical accuracy was 0.820 ± 0.012, ...
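The gene-as-pixel idea described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the published AIO procedure: the row-major gene ordering, the min-max scaling to 8-bit intensities, the zero padding, and the name `expression_to_image` are all choices made here for the example.

```python
import math

import numpy as np


def expression_to_image(expression, side=None):
    """Map one sample's expression vector to a square grayscale image.

    Each gene becomes one pixel; its min-max scaled expression level
    becomes the pixel intensity (0-255). Unused pixels are zero padded.
    """
    expression = np.asarray(expression, dtype=float)
    n_genes = expression.size
    if side is None:
        side = math.ceil(math.sqrt(n_genes))
    if side * side < n_genes:
        raise ValueError("image is too small to hold every gene")
    lo, hi = expression.min(), expression.max()
    scale = hi - lo if hi > lo else 1.0  # avoid dividing by zero
    pixels = np.round(255 * (expression - lo) / scale).astype(np.uint8)
    image = np.zeros(side * side, dtype=np.uint8)
    image[:n_genes] = pixels
    return image.reshape(side, side)


sample = np.array([0.0, 2.5, 5.0, 10.0, 7.5])  # hypothetical expression levels
image = expression_to_image(sample)
```

A stack of such images, one per subject, could then be passed to any standard image CNN. Because neighboring pixels only carry meaning if neighboring genes are related, the gene ordering is a real design decision that this sketch leaves arbitrary.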