TISSUES 2.0: an integrative web resource on mammalian tissue expression

Oana Palasca et al. Database (Oxford). 2018.

Abstract

Physiological and molecular similarities between organisms make it possible to translate findings from simpler experimental systems (model organisms) into more complex ones, such as human. This translation facilitates the understanding of biological processes under normal or disease conditions. Researchers aiming to identify the similarities and differences between organisms at the molecular level need resources that collect multi-organism tissue expression data. We have developed a database of gene–tissue associations in human, mouse, rat and pig by integrating multiple sources of evidence: transcriptomics covering all four species, proteomics (human only), manual curation and text mining of the scientific literature. A scoring scheme makes these associations comparable across all sources of evidence and across organisms, and assigns a confidence score to each association. The TISSUES database (version 2.0) is publicly accessible through a user-friendly web interface and as part of the STRING app for Cytoscape. In addition, we analyzed the agreement between datasets, across and within organisms, and found that it depends mainly on the quality of the datasets rather than on the technologies used or the organisms compared.

Database URL: http://tissues.jensenlab.org/

Figures

Figure 1.

Schematic representation of how fold enrichment is calculated and fitted to raw expression values. For a given dataset, we select gene–tissue pairs and their raw expression scores, restricted to the tissues common to all datasets and to the genes shared between the dataset and the gold standard. Next, we sort these gene–tissue pairs by raw expression value and traverse them in sliding windows of a pre-defined size. The enrichment for each window is then calculated as the fraction of gene–tissue pairs in that window found in the gold standard, divided by the fraction expected by chance. Finally, we use appropriate functions (see text) to fit the relationship between fold enrichment values and the mean raw expression values in the corresponding windows.
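The windowed calculation described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `windowed_fold_enrichment`, its parameters, and the choice of the overall gold-standard hit rate as the fraction "expected by chance" are assumptions made for illustration, and the curve-fitting step is left out because the caption does not specify the functions used.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (gene, tissue)


def windowed_fold_enrichment(
    scores: Dict[Pair, float],
    gold: Set[Pair],
    window: int = 100,
    step: int = 1,
) -> List[Tuple[float, float]]:
    """Return (mean raw expression, fold enrichment) for each sliding window."""
    if not 0 < window <= len(scores):
        raise ValueError("window must be between 1 and the number of pairs")
    # Sort gene-tissue pairs by raw expression value.
    ranked = sorted(scores.items(), key=lambda item: item[1])
    hits = [1 if pair in gold else 0 for pair, _ in ranked]
    values = [value for _, value in ranked]
    # Assumption: the fraction expected by chance is the overall hit rate.
    background = sum(hits) / len(hits)
    if background == 0:
        raise ValueError("no gold-standard pairs among the scored pairs")
    points = []
    for start in range(0, len(ranked) - window + 1, step):
        end = start + window
        observed = sum(hits[start:end]) / window
        mean_value = sum(values[start:end]) / window
        points.append((mean_value, observed / background))
    return points


# Toy example with hypothetical genes g1..g4 in a single tissue.
toy_scores = {
    ("g1", "liver"): 1.0,
    ("g2", "liver"): 2.0,
    ("g3", "liver"): 3.0,
    ("g4", "liver"): 4.0,
}
toy_gold = {("g3", "liver"), ("g4", "liver")}
toy_points = windowed_fold_enrichment(toy_scores, toy_gold, window=2)
print(toy_points)  # [(1.5, 0.0), (2.5, 1.0), (3.5, 2.0)]
```

In the toy data the enrichment rises with raw expression, which is the kind of relationship the fitted function is meant to capture.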

Figure 2.

Summary of tissues present in each dataset. We mapped the newly integrated datasets from mouse, rat and pig, as well as the already existing human datasets, to 21 major tissues of interest. This figure shows which of these tissues are covered by which datasets.

Figure 3.

To obtain confidence scores for each gene–tissue pair, we assess the relationship between raw expression values and fold enrichment, defined as the agreement between each dataset and the gold standard dataset specific to the organism. The gold standard datasets are based on the UniProtKB protein tissue annotations in human, filtered for 1-to-1 orthologs between human and each of the other three organisms. The _x_-axis shows raw expression values for gene–tissue pairs, in units specific to the type of experiment or data processing (e.g. intensity units for microarray studies, FPKMs for RNA-seq), averaged across bins of 100 pairs.

Figure 4.

(A) Pearson’s correlation coefficients between final confidence scores of gene–tissue associations across datasets. For each pair of datasets, we considered the set of common genes (genes expressed in at least one tissue in each dataset) and the tissues common to the two datasets. (B) True-positive and false-positive rates for each dataset over the common tissues, showing that the correlation between datasets is influenced mainly by quality rather than by organism or technology.
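The comparison in panel (A) can be sketched in Python as follows. This is an illustrative sketch only: the function name `pearson_on_common` is hypothetical, and treating a gene–tissue pair that is absent from one dataset as having a score of 0 is an assumption, since the caption does not say how missing pairs are handled.

```python
import math
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (gene, tissue)


def pearson_on_common(a: Dict[Pair, float], b: Dict[Pair, float]) -> float:
    """Pearson correlation of confidence scores over common genes and tissues."""
    # Common genes: genes with a score in at least one tissue in each dataset.
    genes = {gene for gene, _ in a} & {gene for gene, _ in b}
    tissues = {tissue for _, tissue in a} & {tissue for _, tissue in b}
    xs, ys = [], []
    for gene in sorted(genes):
        for tissue in sorted(tissues):
            # Assumption: a pair missing from a dataset counts as score 0.
            xs.append(a.get((gene, tissue), 0.0))
            ys.append(b.get((gene, tissue), 0.0))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two common gene-tissue pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores are constant in at least one dataset")
    return cov / math.sqrt(var_x * var_y)


# Toy datasets with hypothetical genes; "heart" is not common and is ignored.
dataset_a = {
    ("g1", "liver"): 1.0,
    ("g1", "brain"): 2.0,
    ("g2", "liver"): 3.0,
    ("g2", "brain"): 4.0,
    ("g2", "heart"): 5.0,
}
dataset_b = {
    ("g1", "liver"): 2.0,
    ("g1", "brain"): 4.0,
    ("g2", "liver"): 6.0,
    ("g2", "brain"): 8.0,
}
r = pearson_on_common(dataset_a, dataset_b)
print(round(r, 6))
```

Because the second toy dataset is a scaled copy of the first over the common tissues, the coefficient is 1 up to floating-point rounding.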

Figure 5.

Summary figures for all the covered organisms. For each organism, the web interface provides a comprehensive figure summarizing the tissue associations of the queried gene. This example shows the tissue expression profile of microtubule-associated protein tau (MAPT), which is known to be related to Alzheimer’s disease (57). The ortholog–paralog table provides information about homologous proteins and the correlation of their tissue expression with that of the query protein.

Figure 6.

STRING app in Cytoscape with tissue information. (A) The newly developed stringApp for Cytoscape gives users access to all the STRING functionality within Cytoscape and allows expression evidence from TISSUES to be visualized on the network (in this example, for liver). (B) The stringApp also shows the evidence score for each of the major tissues in the node attributes table.

References

    1. Hunter P. (2008) The paradox of model organisms. The use of model organisms in research will continue despite their shortcomings. EMBO Rep., 9, 717–720.
    2. Aitman T.J., Boone C., Churchill G.A., et al. (2011) The future of model organisms in human disease research. Nat. Rev. Genet., 12, 575–582.
    3. Mackay T.F.C. (2014) Epistasis and quantitative traits: using model organisms to study gene-gene interactions. Nat. Rev. Genet., 15, 22–33.
    4. Greaves P., Williams A., Eve M. (2004) First dose of potential new medicines to humans: how animals help. Nat. Rev. Drug Disc., 3, 226–236.
    5. Boverhof D.R., Chamberlain M.P., Elcombe C.R., et al. (2011) Transgenic animal models in toxicology: historical perspectives and future outlook. Toxicol. Sci., 121, 207–233.
