Chris Evelo | Maastricht University
Papers by Chris Evelo
Journal of Intellectual Disability Research, 2017
Nature Communications, 2019
The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated to provide a comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationale for synergy predictions are identified, including ADAM17 inhibit...
Archives of Toxicology, 1992
Monitoring activities have an important role in occupational and environmental practice. Techniques to detect and to control chemical exposure in order to protect people from environmental and occupational illness are rapidly expanding. Most obvious are the methods related to the evaluation of the presence of xenobiotic chemical agents either in the contaminated environment or in the exposed organism. In the first case the procedures are indicated as environmental monitoring (EM). In inhalatory exposure, estimations can be ...
F1000Research, 2015
We describe a new national organisation in scientific research that facilitates life scientists with technologies and technological expertise in an era where new projects often are data-intensive, multi-disciplinary, and multi-site. The Dutch Techcentre for Life Sciences (DTL, www.dtls.nl) is run as a lean not-for-profit organisation of which research organisations (both academic and industrial) are paying members. The small staff of the organisation undertakes a variety of tasks that are necessary to perform or support modern academic research, but that are not easily undertaken in a purely academic setting. DTL also represents the Netherlands in the ELIXIR ESFRI, and the office supports this task. The organisation is still being fine-tuned and this will probably continue over time, as it is crucial for this kind of organisation to adapt to a constantly changing environment. However, already being underway for several years on the path to professionalisation, our experiences can be...
Bioinformatics, 2014
Motivation: The field of toxicogenomics (the application of ‘-omics’ technologies to risk assessment of compound toxicities) has expanded in the last decade, partly driven by new legislation, aimed at reducing animal testing in chemical risk assessment but mainly as a result of a paradigm change in toxicology towards the use and integration of genome wide data. Many research groups worldwide have generated large amounts of such toxicogenomics data. However, there is no centralized repository for archiving and making these data and associated tools for their analysis easily available. Results: The Data Infrastructure for Chemical Safety Assessment (diXa) is a robust and sustainable infrastructure storing toxicogenomics data. A central data warehouse is connected to a portal with links to chemical information and molecular and phenotype data. diXa is publicly available through a user-friendly web interface. New data can be readily deposited into diXa using guidelines and templates ava...
Toxicology in Vitro, 1992
The effects were studied of improved oxygen supply on the integrity and metabolic activity towards dimethylacetamide of the isolated perfused rat liver. Improvement of oxygen supply by increased medium oxygenation or addition of chemical oxygen carriers (perfluortributylamine) or erythrocytes led to increased bile secretion. Leakage of lactate dehydrogenase and aspartate aminotransferase could be prevented during a 1-hr perfusion when either chemical oxygen carriers or erythrocytes were added. Improved medium oxygenation alone was not sufficient to prevent high enzyme leakage during the second half of the perfusion period. Histological evaluation confirmed the conclusion that less damage occurred when erythrocytes or perfluortributylamine were added to the perfusion medium. The metabolic clearance of dimethylacetamide by the perfused rat liver was not significantly improved when erythrocytes were added to the medium. The results show that addition of perfluortributylamine, or erythrocytes at a level of 4 g haemoglobin/litre, is necessary to maintain liver integrity for at least 1 hr in the liver perfusion system used in this study.
Physiological Genomics, 2011
Obesity frequently leads to insulin resistance and the development of hepatic steatosis. To characterize the molecular changes that promote hepatic steatosis, transcriptomics, proteomics, and metabolomics technologies were applied to liver samples from C57BL/6J mice obtained from two independent intervention trials. After 12 wk of high-fat feeding the animals became obese, hyperglycemic, and insulin resistant, had elevated levels of blood cholesterol and VLDL, and developed hepatic steatosis. Nutrigenomic analysis revealed alterations of key metabolites and enzyme transcript levels of hepatic one-carbon metabolism and related pathways. The hepatic oxidative capacity and the lipid milieu were significantly altered, which may play a key role in the development of insulin resistance. Additionally, high choline levels were observed after the high-fat diet. Previous studies have linked choline levels with insulin resistance and hepatic steatosis in conjunction with changes of certain met...
The Journal of Pathology, 2006
Recently, we showed that cathepsin K deficiency reduces atherosclerotic plaque progression, induces plaque fibrosis, but aggravates macrophage foam cell formation in the ApoE-/- mouse. To obtain more insight into the molecular mechanisms by which cathepsin K disruption evokes the observed phenotypic changes, we used microarray analysis for gene expression profiling of aortic arches of CatK-/-/ApoE-/- and ApoE-/- mice on a mouse oligo microarray. Out of 20 280 reporters, 444 were significantly differentially expressed (p-value < 0.05, fold change ≥ 1.4 or ≤ -1.4, and intensity value > 2.5 times background in at least one channel). Ingenuity Pathway Analysis and GenMAPP revealed upregulation of genes involved in lipid uptake, trafficking, and intracellular storage, including caveolin-1, -2, -3 and CD36, and profibrotic genes involved in transforming growth factor beta (TGFbeta) signalling, including TGFbeta2, latent TGFbeta binding protein-1 (LTBP1), and secreted protein, acidic and rich in cysteine (SPARC), in CatK-/-/ApoE-/- mice. Differential gene expression was confirmed at the mRNA and protein levels. In vitro modified low density lipoprotein (LDL) uptake assays, using bone marrow derived macrophages preincubated with caveolae and scavenger receptor inhibitors, confirmed the importance of caveolins and CD36 in increasing modified LDL uptake in the absence of cathepsin K. In conclusion, we suggest that cathepsin K deficiency alters plaque phenotype not only by decreasing proteolytic activity, but also by stimulating TGFbeta signalling. Besides this profibrotic effect, cathepsin K deficiency has a lipogenic effect owing to increased lipid uptake mediated by CD36 and caveolins.
European Respiratory Journal, 1999
Changes in levels of catalase and glutathione in erythrocytes of patients with stable asthma, treated with beclomethasone dipropionate.
European Journal of Gastroenterology & Hepatology, 2006
European Journal of Clinical Investigation, 2006
CMLS Cellular and Molecular Life Sciences, 2005
The increased incidence of obesity and related disorders in Western societies requires a thorough understanding of the adipogenic process. Data at the protein level of this process are scarce. Therefore we performed a proteome analysis of differentiating and starving 3T3-L1 cells using two-dimensional gel electrophoresis combined with mass spectrometry. Effects of different starvation conditions were examined by subjecting 3T3-L1 adipocytes to caloric restriction, either in the absence or the presence of the lipolysis inducer tumor necrosis factor-α. Ninety-three differentially expressed proteins were
BMC Systems Biology, 2011
Background: Complex phenotypes such as insulin resistance involve different biological pathways that may interact and influence each other. Interpretation of related experimental data would be facilitated by identifying relevant pathway interactions in the context of the dataset. Results: We developed an analysis approach to study interactions between pathways by integrating gene and protein interaction networks, biological pathway information and high-throughput data. This approach was applied to a transcriptomics dataset to investigate pathway interactions in insulin resistant mouse liver in response to a glucose challenge. We identified regulated pathway interactions at different time points following the glucose challenge and also studied the underlying protein interactions to find possible mechanisms and key proteins involved in pathway cross-talk. A large number of pathway interactions were found for the comparison between the two diet groups at t = 0. The initial response to the glucose challenge (t = 0.6) was typed by an acute stress response and pathway interactions showed large overlap between the two diet groups, while the pathway interaction networks for the late response were more dissimilar. Conclusions: Studying pathway interactions provides a new perspective on the data that complements established pathway analysis methods such as enrichment analysis. This study provided new insights in how interactions between pathways may be affected by insulin resistance. In addition, the analysis approach described here can be generally applied to different types of high-throughput data and will therefore be useful for analysis of other complex datasets as well.
5th ArrayNL meeting, January 22, 2002, Utrecht, The Netherlands, 2002
Hidden treasures, Data treatment necessary to find patterns in... (2002).
International Journal of Obesity, 2004
Transcriptional effects of exogenous leptin administration in... (2004). Open access.
Abstract. Within complex scientific domains such as pharmacology, operational equivalence between two concepts is often context-, user- and task-specific. Existing Linked Data integration procedures and equivalence services do not take the context and task of the user into account. We present a vision for enabling users to control the notion of operational equivalence by applying scientific lenses over Linked Data. The scientific lenses vary the links that are activated between the datasets, which affects the data returned to the user.
An integrated bioinformatics approach to evaluate genomics responses in liver model systems used to study drug metabolism.
Chris T Evelo, Department of Bioinformatics – BiGCaT, Maastricht University, The Netherlands
Large-scale genomics studies can assess different levels of gene expression regulation, such as epigenetic regulation through DNA methylation, transcription factor binding through chromatin immunoprecipitation (ChIP), and direct mRNA expression measurements. For each of these processes, microarray and high-throughput sequencing techniques are now readily available. Genomics experiments of this kind typically yield long lists of affected genes, often thousands of genes long, that are hard to understand. This also happens in genotoxicity studies using metabolically active liver-based test systems. Looking only at the genes that show the largest changes means that the research focuses on just the tip of the iceberg.
Pathway analysis is one of the most fruitful approaches in situations where genomics studies yield numbers of affected genes that are hard to interpret directly. It can be used to identify the processes that are affected most (1), and it also helps to address the typical oversampling problem in genome-wide studies. The reasoning is that, while many false positives can occur in genome-wide studies because of the sheer number of genes studied, it is still statistically very unlikely that many such false-positive genes occur in the same biological pathway by chance.
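The statistical argument above can be illustrated with a small over-representation calculation. The sketch below is only an illustration and not the method used by any of the tools named in this abstract: it assumes a simple hypergeometric model in which flagged genes are spread at random over the genome, and the function name and all numbers in the usage note are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes_total, n_flagged_total, pathway_size, n_flagged_in_pathway):
    """Chance of seeing at least `n_flagged_in_pathway` flagged genes in one pathway.

    Assumes (for illustration only) that the flagged genes are a random draw
    from all measured genes, i.e. an upper-tail hypergeometric probability.
    """
    all_draws = comb(n_genes_total, n_flagged_total)
    most_possible = min(pathway_size, n_flagged_total)
    favourable = sum(
        # i flagged genes inside the pathway, the rest outside it
        comb(pathway_size, i) * comb(n_genes_total - pathway_size, n_flagged_total - i)
        for i in range(n_flagged_in_pathway, most_possible + 1)
    )
    return favourable / all_draws


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes measured, 1,000 flagged,
    # and a 50-gene pathway that contains 10 of the flagged genes.
    print(enrichment_p_value(20000, 1000, 50, 10))
```

With these hypothetical numbers, chance alone would put about 2.5 flagged genes in a 50-gene pathway, so finding 10 there is very unlikely under the random model even though each flagged gene on its own could be a false positive.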
To enable pathway analysis we of course need the biological pathways themselves in such a format that they can be used by pathway analysis tools. This means for instance that the genes and metabolites in the pathways must be annotated with gene and metabolite identifiers that can be understood by such programs. We developed wikipathways.org (2) as a community curation site where experts, and in principle anyone else, can create and improve pathways.
Despite the availability of large amounts of biological information on xenobiotic biotransformation, the number of available biotransformation pathway maps that can easily be used for visualization of multiple omics data is limited. We created integrated biotransformation pathway maps suitable for multiple omics analysis using PathVisio (3). The ease of visualizing data on these maps was demonstrated by using published microarray data from human hepatocyte-like cell models. Since intact metabolic activity is a prerequisite for a well suited biotransformation model, this example shows how the biotransformation pathway maps can be used for model selection (4).
1. Thomas Kelder, Bruce R Conklin, Chris T Evelo, Alexander R Pico (2010). Finding the right questions: Exploratory Pathway Analysis to Enhance Biological Discovery in Large Datasets. PLoS Biol 8: 8. e1000472 Sept. http://dx.doi.org/10.1371/journal.pbio.1000472
2. Alexander R Pico, Thomas Kelder, Martijn P van Iersel, Kristina Hanspers, Bruce R Conklin, Chris Evelo (2008). WikiPathways: pathway editing for the people. PLoS Biol 6: 7. e184. http://dx.doi.org/10.1371/journal.pbio.0060184
3. Martijn P van Iersel, Thomas Kelder, Alexander R Pico, Kristina Hanspers, Susan Coort, Bruce R Conklin, Chris Evelo (2008). Presenting and exploring biological pathways with PathVisio. BMC Bioinformatics 9: 399 Sep. http://dx.doi.org/10.1186/1471-2105-9-399
4. D G Jennen, S Gaj, P J Giesbertz, J H van Delft, C T Evelo, J C Kleinjans (2010). Biotransformation pathway maps in WikiPathways enable direct visualization of drug metabolism related expression changes. Drug Discovery Today 15: 19, 20. 851-858 Oct. http://dx.doi.org/10.1016/j.drudis.2010.08.002
Integrative Systems Biology: How to Deal with Large Scale Genetics Data.
Chris T Evelo; Department of Bioinformatics – BiGCaT; Maastricht University, The Netherlands
Until recently analysis of genomics results was mainly about transcriptomics related data. That already confronted us with overwhelming analytical problems. We learned to mathematically and statistically treat genome wide expression studies and studies directed to gene expression regulation. Genomics researchers had to become bilingual, speaking English and R*, and learned to think about co-expression, clusters and false discovery rates. The latter in fact proved to be a trap. Removing all the false positives made us lose the information we were really interested in. To understand the results of our genomics experiments we often had to confront what we were measuring with what we already knew. After all, false positives are not likely to all be related to the same meaningful biological process. That called for the development of new analytical tools like Cytoscape for network analysis and PathVisio (1) for pathway analysis. More importantly, we had to structure what we know. Text mining and data mining helped us to do that, but what was really needed was mobilization of all the knowledge that is present in the heads of the scientific community. WikiPathways was our contribution to the rapidly emerging field of community curation (2).
Today the story repeats itself. Genome wide genetics is becoming real. We can do Genome Wide Association Studies; even individuals can have their genome evaluated for a million SNPs, and soon we will be able to sequence individual genomes in relation to phenotypic responses. Then what? How can we deal with that new avalanche of data? The oversampling problems will be a few orders of magnitude larger; after all, there can be hundreds of SNPs in every gene. There will just be too many to understand which SNPs are important from the data alone. We will again have to relate them to the biological processes. But is that enough? I think not. We will only understand the outcome of those large scale genetics studies if we not only attribute the SNPs to genes and thereby to pathways, but also consider the actual sequences and see what functional effect each SNP causes. Is it likely to influence transcription factor binding, miRNA effects, or protein-protein interactions? This calls for new types of data integration, for which luckily we already have the tools.
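The first step named above, attributing SNPs to genes and thereby to pathways, can be sketched as a pair of lookups. This is only an illustration under stated assumptions: the SNP, gene and pathway names are made-up placeholders, and the two mapping tables stand in for whatever SNP annotation resource and pathway collection (for example WikiPathways) would be used in practice. It does not address the sequence-level functional effects discussed in the text.

```python
from collections import defaultdict

# Hypothetical toy annotations (assumption): real mappings would come from
# a SNP annotation resource and a pathway collection.
SNP_TO_GENE = {
    "snp_a": "GENE1",
    "snp_b": "GENE1",
    "snp_c": "GENE2",
    "snp_d": "GENE3",
}
GENE_TO_PATHWAYS = {
    "GENE1": ["Pathway X"],
    "GENE2": ["Pathway X", "Pathway Y"],
    "GENE3": ["Pathway Y"],
}


def snps_per_pathway(snps, snp_to_gene, gene_to_pathways):
    """Group SNPs by pathway via the gene each SNP is attributed to."""
    grouped = defaultdict(set)
    for snp in snps:
        gene = snp_to_gene.get(snp)
        if gene is None:
            continue  # SNP cannot be attributed to a gene in this table
        for pathway in gene_to_pathways.get(gene, []):
            grouped[pathway].add(snp)
    # Sort the SNPs so the result is deterministic
    return {pathway: sorted(hits) for pathway, hits in grouped.items()}


# SNPs flagged by a (hypothetical) association study
associated = ["snp_a", "snp_c", "snp_d", "snp_unknown"]
result = snps_per_pathway(associated, SNP_TO_GENE, GENE_TO_PATHWAYS)
for pathway, hits in sorted(result.items()):
    print(pathway, hits)
```

Grouping per pathway rather than per gene is the point of the exercise: several weakly associated SNPs in different genes become visible once they land in the same biological process.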
Next to that we of course need to know what the polymorphisms really do. What is for instance the association between specific genetic variations, diet and phenotypic outcome? Large initiatives like the micronutrient genomics program and the human variome project are now joining forces to evaluate these types of relationships (3). And this is important because the problem calls for strong teams of creative minds.
1. Martijn P van Iersel, Thomas Kelder, Alexander R Pico, Kristina Hanspers, Susan Coort, Bruce R Conklin, Chris Evelo (2008). Presenting and exploring biological pathways with PathVisio. BMC Bioinformatics 9: 399 Sep. http://dx.doi.org/10.1186/1471-2105-9-399
2. Alexander R Pico, Thomas Kelder, Martijn P van Iersel, Kristina Hanspers, Bruce R Conklin, Chris Evelo (2008). WikiPathways: pathway editing for the people. PLoS Biol 6: 7. e184. http://dx.doi.org/10.1371/journal.pbio.0060184
3. Jim Kaput, Chris T Evelo, Giuditta Perozzi, Ben van Ommen, Richard Cotton (2010). Connecting The human variome project to nutrigenomics. Genes Nutr Online first October 15. http://dx.doi.org/10.1007/s12263-010-0186-6
* R is a programming language for statistics, often used in bioinformatics. There are many dedicated statistical packages already available as part of the Bioconductor collection (see: http://www.bioconductor.org/).
Biology is rapidly developing into a data-driven science and faces not only the challenge of coping with an ever growing amount of data but also that of interpreting its complex diversity. The requirement of systems biology to connect different levels of biological research leads directly to a need for large scale data integration. The nutritional phenotype database (dbNP) addresses this challenge for nutrigenomics. A particularly urgent objective in coping with the data avalanche is making biologically meaningful information accessible to the researcher. In this presentation we will describe how we intend to meet this objective with the nutritional phenotype database. We will outline relevant parts of the system architecture, describe the kinds of data to be managed, and show how the system can support retrieval of biologically meaningful information by means of biological profiles, pathways and ontologies in full-text and structured queries. Our presentation will point out critical points, describe several technical hurdles, and demonstrate how pathway analysis in the omics modules of the nutritional phenotype database can improve the functionality of queries and comparisons of nutrition studies. Directions for future research will be given. Through development of a ranking system for the results of free text queries we will aim to improve the user interaction with dbNP. Profiles describing the relevant changes in biological pathways and GO levels will be used in a mathematical way to calculate distances between the biological outcomes of experiments and will allow the user to ask intuitive questions like “what experiments showed an effect on apoptosis?” or “which other studies showed an effect that looked like mine?”.
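As a rough illustration of how profiles could be used to calculate distances between the outcomes of experiments, the sketch below represents each study as a mapping from pathway name to a change score and compares studies with cosine distance. All of this is assumed for the example: the abstract does not say which distance measure dbNP uses, and the study names, pathway scores and the cutoff of 1.0 are hypothetical.

```python
import math


def cosine_distance(profile_a, profile_b):
    """Cosine distance between two profiles (dicts: pathway -> change score)."""
    pathways = set(profile_a) | set(profile_b)
    dot = sum(profile_a.get(p, 0.0) * profile_b.get(p, 0.0) for p in pathways)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: an all-zero profile is treated as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def most_similar(query_name, profiles):
    """Rank the other studies from most to least similar to the query study."""
    query = profiles[query_name]
    scored = [
        (cosine_distance(query, profile), name)
        for name, profile in profiles.items()
        if name != query_name
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical pathway-level profiles for three studies
profiles = {
    "study_1": {"apoptosis": 2.0, "glycolysis": 0.0},
    "study_2": {"apoptosis": 1.0, "glycolysis": 0.1},
    "study_3": {"apoptosis": -0.2, "glycolysis": 2.0},
}

# "Which other studies showed an effect that looked like mine?"
ranking = most_similar("study_1", profiles)

# "What experiments showed an effect on apoptosis?" (cutoff of 1.0 is arbitrary)
affects_apoptosis = [
    name for name, profile in sorted(profiles.items())
    if abs(profile.get("apoptosis", 0.0)) >= 1.0
]
print(ranking, affects_apoptosis)
```

Cosine distance is chosen here only because it compares the direction of change across pathways and ignores overall magnitude; a Euclidean or correlation-based distance would be an equally plausible reading of the text.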
Until recently analysis of genomics results was mainly about transcriptomics related data. That already confronted us with overwhelming analytical problems. We learned to mathematically and statistically treat genome wide expression studies and studies directed to gene expression regulation. Genomics researchers had to become bilingual, speaking English and R*, and learned to think about co-expression, clusters and false discovery rates. The latter in fact proved to be a trap. Removing all the false positives made us lose the information we were really interested in. To understand the results of our genomics experiments we often had to confront what we were measuring with what we already knew. After all, false positives are not likely to all be related to the same meaningful biological process. That called for the development of new analytical tools like Cytoscape for network analysis and PathVisio for pathway analysis. More importantly, we had to structure what we know. Text mining and data mining helped us to do that, but what was really needed was mobilization of all the knowledge that is present in the heads of the scientific community. WikiPathways was our contribution to the rapidly emerging field of community curation. Thus we started to become able to integrate different types of technologies that span the full gene expression pipeline and to understand that in the biological context.
Today the story repeats itself. Genome wide genetics is becoming real. We can do Genome Wide Association Studies and soon we will be able to sequence individual genomes in relation to phenotypic responses. And then what? How can we deal with that new avalanche of data? The oversampling problems will be a few orders of magnitude larger; after all, there can be hundreds of SNPs in every gene. There will just be too many to understand which SNPs are important from the data alone. We will again have to relate them to the biological processes. But is that enough? I think not. We will only understand the outcome of those large scale genetics studies if we not only attribute the SNPs to genes and thereby to pathways, but also consider the actual sequences and see what functional effect each SNP causes. Is it likely to influence transcription factor binding, miRNA effects, or protein-protein interactions? This calls for new types of data integration, for which we already have the tools. And it calls for new creative ways to do that. What we really need is teams of creative minds. Some new initiatives seem to show that these are already being formed.
* R is a programming language for statistics, often used in bioinformatics. There are many dedicated statistical packages already available as part of the Bioconductor collection (see: http://www.bioconductor.org/).