THINK Back: KNowledge-based Interpretation of High Throughput data

Fernando Farfán et al. BMC Bioinformatics. 2012.

Abstract

Results of high throughput experiments can be challenging to interpret. Current approaches have relied on bulk processing of the set of expression levels, in conjunction with easily obtained external evidence, such as co-occurrence. While such techniques can be used to reason probabilistically, they are not designed to shed light on what any individual gene, or a network of genes acting together, may be doing. Our belief is that today we have the information extraction ability and the computational power to perform more sophisticated analyses that consider the individual situation of each gene. The use of such techniques should lead to qualitatively superior results. The specific aim of this project is to develop computational techniques to generate a small number of biologically meaningful hypotheses based on observed results from high throughput microarray experiments, gene sequences, and next-generation sequences. Through the use of relevant known biomedical knowledge, as represented in published literature and public databases, we can generate meaningful hypotheses that will aid biologists in interpreting their experimental data. We are currently developing novel approaches that exploit the rich information encapsulated in biological pathway graphs. Our methods perform a thorough and rigorous analysis of biological pathways, using complex factors such as the topology of the pathway graph and the frequency with which genes appear on different pathways. Compared to existing methods that consider only part of the information captured by biological pathways, they provide more meaningful hypotheses to describe the biological phenomena captured by high throughput experiments.

Figures

Figure 1

Example of density analysis on biological pathways. Two example pathways with differentially expressed genes appearing in different configurations. A pathway in which the differentially expressed genes appear tightly clustered in one portion of the graph is considered more significant than a pathway in which the differentially expressed genes appear spread out.
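The intuition in this caption can be illustrated with a small sketch. The abstract and caption do not define the density score actually used, so the measure below (mean pairwise shortest-path distance between differentially expressed genes) is only an assumed stand-in, and the toy pathway, gene labels, and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Return hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_pairwise_distance(graph, de_genes):
    """Average shortest-path length over all pairs of DE genes.

    Smaller values mean the DE genes sit closer together. Returns None
    when fewer than two DE genes are given or some pair is disconnected.
    This is an assumed proxy, not the score defined by the authors.
    """
    genes = sorted(de_genes)
    if len(genes) < 2:
        return None
    dist_from = {g: bfs_distances(graph, g) for g in genes}
    total = 0
    pairs = 0
    for a, b in combinations(genes, 2):
        if b not in dist_from[a]:
            return None
        total += dist_from[a][b]
        pairs += 1
    return total / pairs


# Hypothetical undirected pathway: a chain g1 - g2 - g3 - g4 - g5 - g6.
chain = {
    "g1": ["g2"],
    "g2": ["g1", "g3"],
    "g3": ["g2", "g4"],
    "g4": ["g3", "g5"],
    "g5": ["g4", "g6"],
    "g6": ["g5"],
}

clustered = mean_pairwise_distance(chain, {"g1", "g2", "g3"})
spread = mean_pairwise_distance(chain, {"g1", "g3", "g6"})
```

With the same number of differentially expressed genes in both cases, the clustered configuration yields the smaller mean distance, which is the property the figure describes qualitatively.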

Figure 2

Distribution of appearance frequency of genes in KEGG Pathway database. The x axis indicates the appearance frequency of each gene in KEGG Pathways. The y axis shows the proportion of genes with indicated appearance frequency out of all KEGG Pathway genes. The total number of genes at each appearance count is indicated.
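A distribution of this kind can be tabulated from any gene-to-pathway membership table. The sketch below assumes pathways are given as a plain dictionary of gene lists; the pathway names and gene labels are invented for illustration and are not KEGG data.

```python
from collections import Counter


def appearance_frequency(pathways):
    """Map each gene to the number of pathways that contain it."""
    counts = Counter()
    for genes in pathways.values():
        # set() ensures a gene listed twice in one pathway counts once.
        counts.update(set(genes))
    return counts


def frequency_distribution(pathways):
    """Return {frequency: (number_of_genes, proportion_of_all_genes)}."""
    gene_counts = appearance_frequency(pathways)
    genes_per_frequency = Counter(gene_counts.values())
    total = len(gene_counts)
    return {
        freq: (n, n / total)
        for freq, n in sorted(genes_per_frequency.items())
    }


# Hypothetical membership table (not real KEGG content).
toy_pathways = {
    "pathway_A": ["g1", "g2", "g3"],
    "pathway_B": ["g2", "g3", "g4"],
    "pathway_C": ["g2", "g5"],
}

distribution = frequency_distribution(toy_pathways)
for freq, (n_genes, proportion) in distribution.items():
    print(f"appears in {freq} pathway(s): {n_genes} genes ({proportion:.0%})")
```

Each key corresponds to a position on the x axis, the proportion to the y axis, and the gene count to the number annotated at each appearance count.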

Figure 3

THINK-Back Software Architecture. The THINK-Back suite of tools has been developed to allow scientists worldwide to use our gene set enrichment testing methods and to combine them with previously developed methods as adjustment tools. These tools can be accessed via web services and an application programming interface (API), and soon we will be providing a web-based user interface as well.

Figure 4

THINK-Back Web Service Call Diagram. The sequence diagram shows the steps to call the THINK-Back Web service, request the job status, and finally obtain the results. After submitting the job for execution, the web service returns a unique identifier that should be used to check the job status. Once the status returned is DONE, the user can check for the results of the execution.
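The submit, poll, and fetch sequence described in this caption can be sketched as follows. The real THINK-Back endpoints and message formats are not given here, so the example uses an in-memory stand-in; the class, its method names, the `RUNNING` status, and the result fields are all hypothetical. Only the `DONE` status and the use of a unique job identifier come from the caption.

```python
import time


class FakeEnrichmentService:
    """In-memory stand-in for a job-based web service (hypothetical API)."""

    def __init__(self, polls_until_done=2):
        self._polls_until_done = polls_until_done
        self._jobs = {}

    def submit_job(self, gene_list):
        """Register a job and return its unique identifier."""
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"genes": list(gene_list), "polls": 0}
        return job_id

    def get_status(self, job_id):
        """Report DONE once the job has been polled enough times."""
        job = self._jobs[job_id]
        job["polls"] += 1
        if job["polls"] >= self._polls_until_done:
            return "DONE"
        return "RUNNING"  # assumed name for the not-yet-finished state

    def get_results(self, job_id):
        job = self._jobs[job_id]
        if job["polls"] < self._polls_until_done:
            raise RuntimeError("results requested before the job finished")
        return {"job_id": job_id, "n_genes": len(job["genes"])}


def run_job(service, gene_list, poll_interval=0.0, max_polls=10):
    """Submit a job, poll its status until DONE, then fetch the results."""
    job_id = service.submit_job(gene_list)
    for _ in range(max_polls):
        if service.get_status(job_id) == "DONE":
            return service.get_results(job_id)
        time.sleep(poll_interval)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")


result = run_job(FakeEnrichmentService(), ["g1", "g2", "g3"])
```

Bounding the number of polls keeps a client from waiting indefinitely if a job never reaches `DONE`; a real client would likely use a nonzero `poll_interval`.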
