HCLSIG BioRDF Subgroup/QueryFederation2 - W3C Wiki

Microarray Use Case

Editors: Kei Cheung, M. Scott Marshall

Introduction

This use case explores the federation of microarray data and related data using the Semantic Web. Some areas of interest include the following:

Currently, we are identifying the types of metadata (contextual information) needed to describe the experiments, samples, and gene lists in a way that is useful to domain scientists. In doing so, we are also identifying ontologies that contain the relevant terms and relationships. This use case may provide an opportunity to facilitate interaction and collaboration among different communities (e.g., Semantic Web, ontology, neuroscience, and microarray).

Examples

Below are the citations and abstracts of four microarray experiments found in the NIH Neuroscience Microarray Consortium.

Dunckley T, Beach TG, Ramsey KE, Grover A, Mastroeni D, Walker DG, Lafleur BJ, Coon KD, Brown KM, Caselli R, Kukull W, Higdon R, McKeel D, Morris JC, Hulette C, Schmechel D, Reiman EM, Rogers J, Stephan DA., Gene expression correlates of neurofibrillary tangles in Alzheimer's disease; Neurobiol Aging, 2005, 27(10):1359-71. (pubmed link; Gene List)

Neurofibrillary tangles (NFT) constitute one of the cardinal histopathological features of Alzheimer's disease (AD). To explore in vivo molecular processes involved in the development of NFTs, we compared gene expression profiles of NFT-bearing entorhinal cortex neurons from 19 AD patients, adjacent non-NFT-bearing entorhinal cortex neurons from the same patients, and non-NFT-bearing entorhinal cortex neurons from 14 non-demented, histopathologically normal controls (ND). Of the differentially expressed genes, 225 showed progressively increased expression (AD NFT neurons > AD non-NFT neurons > ND non-NFT neurons) or progressively decreased expression (AD NFT neurons < AD non-NFT neurons < ND non-NFT neurons), raising the possibility that they may be related to the early stages of NFT formation. Immunohistochemical studies confirmed that many of the implicated proteins are dysregulated and preferentially localized to NFTs, including apolipoprotein J, interleukin-1 receptor-associated kinase 1, tissue inhibitor of metalloproteinase 3, and casein kinase 2, beta. Functional validation studies are underway to determine which candidate genes may be causally related to NFT neuropathology, thus providing therapeutic targets for the treatment of AD.

Liang WS, Dunckley T, Beach TG, Grover A, Mastroeni D, Walker DG, Caselli RJ, Kukull WA, McKeel D, Morris JC, Hulette C, Schmechel D, Alexander GE, Reiman EM, Rogers J, Stephan DA, Gene expression profiles in anatomically and functionally distinct regions of the normal aged brain; Physiological Genomics, 2007, 28(3):311-322. (pubmed link)

In this article, we have characterized and compared gene expression profiles from laser capture microdissected neurons in six functionally and anatomically distinct regions from clinically and histopathologically normal aged human brains. These regions, which are also known to be differentially vulnerable to the histopathological and metabolic features of Alzheimer's disease (AD), include the entorhinal cortex and hippocampus (limbic and paralimbic areas vulnerable to early neurofibrillary tangle pathology in AD), posterior cingulate cortex (a paralimbic area vulnerable to early metabolic abnormalities in AD), temporal and prefrontal cortex (unimodal and heteromodal sensory association areas vulnerable to early neuritic plaque pathology in AD), and primary visual cortex (a primary sensory area relatively spared in early AD). These neuronal profiles will provide valuable reference information for future studies of the brain, in normal aging, AD and other neurological and psychiatric disorders.

Liang WS, Reiman EM, Valla J, Dunckley T, Beach TG, Grover A, Niedzielko TL, Schneider LE, Mastroeni D, Caselli R, Kukull W, Morris JC, Hulette CM, Schmechel D, Rogers J, Stephan DA., Alzheimer's disease is associated with reduced expression of energy metabolism genes in posterior cingulate neurons; Proc Natl Acad Sci U S A, 2008, 105(11):4441-6. (pubmed link)

Alzheimer's disease (AD) is associated with regional reductions in fluorodeoxyglucose positron emission tomography (FDG PET) measurements of the cerebral metabolic rate for glucose, which may begin long before the onset of histopathological or clinical features, especially in carriers of a common AD susceptibility gene. Molecular evaluation of cells from metabolically affected brain regions could provide new information about the pathogenesis of AD and new targets at which to aim disease-slowing and prevention therapies. Data from a genome-wide transcriptomic study were used to compare the expression of 80 metabolically relevant nuclear genes from laser-capture microdissected non-tangle-bearing neurons from autopsy brains of AD cases and normal controls in posterior cingulate cortex, which is metabolically affected in the earliest stages; other brain regions metabolically affected in PET studies of AD or normal aging; and visual cortex, which is relatively spared. Compared with controls, AD cases had significantly lower expression of 70% of the nuclear genes encoding subunits of the mitochondrial electron transport chain in posterior cingulate cortex, 65% of those in the middle temporal gyrus, 61% of those in hippocampal CA1, 23% of those in entorhinal cortex, 16% of those in visual cortex, and 5% of those in the superior frontal gyrus. Western blots confirmed underexpression of those complex I-V subunits assessed at the protein level. Cerebral metabolic rate for glucose abnormalities in FDG PET studies of AD may be associated with reduced neuronal expression of nuclear genes encoding subunits of the mitochondrial electron transport chain.

Greene JG, Dingledine R, Greenamyre JT, Gene expression profiling of rat midbrain dopamine neurons: implications for selective vulnerability in parkinsonism; Neurobiol Dis, 2005, 18(1):19-31. (pubmed link)

To elucidate factors related to selective dopamine neuron degeneration in Parkinson's disease (PD), we have defined gene expression profiles of discrete dopamine neuron subpopulations in the rat using immunofluorescent laser capture microscopy and microarray analysis. Although profiles were remarkably similar, there are concerted categorical differences in gene expression between dopamine neurons that might explain their differential susceptibility. As a group, energy metabolism transcripts are more highly expressed in substantia nigra (SN) dopamine neurons, an intriguing result considering previous evidence for a mitochondrial defect in idiopathic PD and the greater susceptibility of SN dopamine neurons to damage by mitochondrial poisons. Examination of putative transcription factor binding sites suggests that these concerted differences may be related to differential activity of specific transcription factors. These results provide the first large scale description of gene expression profiles of dopamine neurons and suggest several avenues for investigation into dopaminergic neuroprotective therapy for PD.

Concepts/Terms

Below are some representative concepts/terms that are found in the above examples.

These concepts/terms are defined in different ontologies/vocabularies. Using these concepts/terms and their relationships, one can discover semantically (biologically) related experiments so that integrative analyses of the associated datasets can be performed. Researchers may also want to know which genes (or which types of genes) were found to be significantly expressed under certain experimental conditions. Below are several example queries:

Additional example queries (2010-April-4) include:

[1]

Example Queries
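As a minimal sketch of the kind of query this use case has in mind, the following Python snippet stores a few facts from the four example abstracts as subject-predicate-object triples and finds experiments that share a disease and a brain region. The `ex:` identifiers, the predicate names, and the `match` and `experiments_with` helpers are hypothetical stand-ins, not terms from an agreed ontology; a real deployment would more likely pose an equivalent SPARQL query against an RDF store.

```python
# Hypothetical triples summarising the four example experiments.
# All resource and predicate names are illustrative assumptions.
TRIPLES = [
    ("ex:Dunckley2005", "ex:disease", "ex:AlzheimersDisease"),
    ("ex:Dunckley2005", "ex:brainRegion", "ex:EntorhinalCortex"),
    ("ex:Liang2007", "ex:brainRegion", "ex:EntorhinalCortex"),
    ("ex:Liang2007", "ex:brainRegion", "ex:PosteriorCingulateCortex"),
    ("ex:Liang2008", "ex:disease", "ex:AlzheimersDisease"),
    ("ex:Liang2008", "ex:brainRegion", "ex:PosteriorCingulateCortex"),
    ("ex:Liang2008", "ex:brainRegion", "ex:EntorhinalCortex"),
    ("ex:Greene2005", "ex:disease", "ex:ParkinsonsDisease"),
    ("ex:Greene2005", "ex:brainRegion", "ex:SubstantiaNigra"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def experiments_with(triples, disease, region):
    """Experiments annotated with both the given disease and brain region."""
    by_disease = {s for s, _, _ in match(triples, p="ex:disease", o=disease)}
    by_region = {s for s, _, _ in match(triples, p="ex:brainRegion", o=region)}
    return sorted(by_disease & by_region)


related = experiments_with(TRIPLES, "ex:AlzheimersDisease", "ex:EntorhinalCortex")
print(related)  # ['ex:Dunckley2005', 'ex:Liang2008']
```

Experiments returned together by such a query are candidates for integrative analysis of their associated datasets.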

RDF Structure

We aim to have a simple ontology to describe the example gene lists published at:

This is a tentative first draft of a candidate gene list template as represented in RDF.

//Eric's example - a doggy with bad breath

provenir:has_parameter . provenir:derivedFrom  ; biordf:disease :Alzheimer's ; biordf:diseaseStage :earlyAlzheimer's .

//As was agreed, disease should be a subClassOf healthState; also, biordf:disease may have stages, for example a cancer stage

biordf:disease rdfs:subClassOf biordf:healthState .
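One practical consequence of the subclass statement is that a query for a health state should also match anything typed as a disease. The sketch below illustrates this by following `rdfs:subClassOf` links transitively in plain Python. The `ex:AlzheimersDisease` class and the `superclasses` and `is_a` helpers are assumptions added for illustration; an RDFS-aware store would normally perform this inference itself.

```python
# Direct rdfs:subClassOf links, child -> set of parents.
SUBCLASS_OF = {
    "biordf:disease": {"biordf:healthState"},      # stated in the draft
    "ex:AlzheimersDisease": {"biordf:disease"},    # hypothetical example class
}


def superclasses(cls, subclass_of):
    """All classes reachable from cls via rdfs:subClassOf (transitive)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(cls, ancestor, subclass_of):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    return cls == ancestor or ancestor in superclasses(cls, subclass_of)


print(is_a("ex:AlzheimersDisease", "biordf:healthState", SUBCLASS_OF))  # True
```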

//The differencialGeneList should derive from the raw data files generated as a result of experiment1:

provenir:data_collection . biordf:computed_from . members (, ) .

//Annotation of RawArrayFilesExp1; relevant when the array platform used in each raw data file is not the same within the same experiment

members (<ABC1.CEL>, <ABC2.CEL>, ... ) .
<ABC1.CEL> mged:ArrayGroup affy:U133A .
<ABC2.CEL> mged:ArrayGroup affy:U133plus2.0 .
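To show why per-file platform annotation is useful, the following sketch groups raw files by their `mged:ArrayGroup` value and flags an experiment whose files span more than one platform. It uses only the two files named above; the tuple representation and the `files_by_platform` helper are assumptions for illustration, not part of the draft template.

```python
from collections import defaultdict

# Per-file platform annotations, mirroring the two statements in the draft.
ARRAY_GROUP = [
    ("ABC1.CEL", "mged:ArrayGroup", "affy:U133A"),
    ("ABC2.CEL", "mged:ArrayGroup", "affy:U133plus2.0"),
]


def files_by_platform(triples):
    """Map each array platform to the raw files annotated with it."""
    groups = defaultdict(list)
    for raw_file, predicate, platform in triples:
        if predicate == "mged:ArrayGroup":
            groups[platform].append(raw_file)
    return dict(groups)


platforms = files_by_platform(ARRAY_GROUP)
# More than one platform means the raw files cannot be treated as one batch.
mixed = len(platforms) > 1
print(platforms, mixed)
```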

//Annotation of individual genes in the gene list; since this is in the context of an experiment, it is important to indicate whether they were over- or underexpressed

biordf:geneLabel "UBC" ;
biordf:gene_annotation_link http://www.ncbi.nlm.nih.gov/gene/7316 ;
biordf:expressedHumanProtein http://www.uniprot.org/uniprot/P63279, http://www.uniprot.org/uniprot/P61081 ;
biordf:gene_expression_value_context :overexpressed ;
biordf:associatedDisease http://trustworthy_known_disease_database/Alzheimer .

biordf:geneLabel "VDAC3" ; biordf:gene_annotation_link http://www.ncbi.nlm.nih.gov/gene/7419 .
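Once genes carry an expression context, a gene list can be filtered by it. The sketch below models the two entries above as Python records and selects the overexpressed ones; it also reports entries that lack a context, as the VDAC3 entry does in this draft. The `GeneEntry` class and both helper functions are hypothetical and only mirror the properties shown in the template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeneEntry:
    """One gene in a gene list, mirroring a subset of the draft properties."""
    label: str                                 # biordf:geneLabel
    annotation_link: str                       # biordf:gene_annotation_link
    expression_context: Optional[str] = None   # biordf:gene_expression_value_context


GENE_LIST = [
    GeneEntry("UBC", "http://www.ncbi.nlm.nih.gov/gene/7316", "overexpressed"),
    GeneEntry("VDAC3", "http://www.ncbi.nlm.nih.gov/gene/7419"),
]


def genes_with_context(entries, context):
    """Labels of genes whose expression context equals the given value."""
    return [e.label for e in entries if e.expression_context == context]


def missing_context(entries):
    """Labels of genes that have no expression context recorded."""
    return [e.label for e in entries if e.expression_context is None]


print(genes_with_context(GENE_LIST, "overexpressed"))  # ['UBC']
print(missing_context(GENE_LIST))                      # ['VDAC3']
```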

TODO Items

Old version of this use case