Roger Day - Academia.edu
Papers by Roger Day
Figure S1. Study schema. (TIF 18129 kb)
Figure S5. Linear model AUC for "leave one out" validation of training set. (TIFF 16875 kb)
Figure S4. Genomic mutational landscape of melanoma cohort. (TIFF 16875 kb)
There is a sense of promise that the accelerating growth of knowledge about the molecular basis for the behavior of cancer cells, especially about the cancer genome, will lead, if not to the magic bullet, at least to much better treatments for patients. But there is a tension between pure empiricism, which pins its hopes on the sheer quantity of data, and the use of biological reasoning to draw insights and improve predictions. Our premise is that at least some of the latter approach is essential, and that a general cancer modeling system could be key to pulling the information together in a truly useful way. A comprehensive software-based facility to synthesize information, build models, simulate, and validate is in development. We have built a comprehensive modeling system for the cancer process, integrating the constituent multiple interacting processes at different scales, including the cancer cell, the patient, the oncologist, and the clinical trial. This is called the Oncology Thinking Cap (OncoTCAP) ...
Retrieval from the DAVID bioinformatics data resource into R
The Enfin-Encore (EnCore) is the integration platform for the ENFIN European Network of Excellence [1], which provides a portal to various database resources with a special focus on systems biology (http://code.google.com/p/enfin-core/). EnCore is appealing to developers of bioinformatics applications because of the scope and variety of its annotation resources, and because its collection of web services uses a common standard format (EnXML: http://code.google.com/p/enfincore/wiki/wp1 encore enxml). Web services can communicate with client applications written in a variety of programming languages. Many bioinformaticians work in R, so an R-based solution is desirable. The ENVISIONQuery package provides programmatic access to the EnCore web services in R. EnCore's capabilities evolve rapidly, so the architecture of ENVISIONQuery enables rapid integration of new services as they appear.
Journal for ImmunoTherapy of Cancer, 2019
Journal for ImmunoTherapy of Cancer, Jan 9, 2018
Immune checkpoint inhibitors (ICIs) have changed the clinical management of melanoma. However, not all patients respond, and current biomarkers, including PD-L1 and mutational burden, show incomplete predictive performance. The clinical validity and utility of complex biomarkers have not been studied in melanoma. Cutaneous metastatic melanoma patients at eight institutions were evaluated for PD-L1 expression, CD8 T-cell infiltration pattern, mutational burden, and expression of 394 immune transcripts. PD-L1 IHC and mutational burden were assessed for association with overall survival (OS) in 94 patients treated prior to ICI approval by the FDA (historical controls) and in 137 patients treated with ICIs. Unsupervised analysis revealed distinct immune clusters with separate response rates. These comprehensive immune profiling data were then integrated to generate a continuous Response Score (RS) based upon response criteria (RECIST v1.1). RS was developed using a single-institution training ...
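The supplementary figure entries above mention leave-one-out validation of a linear model by AUC. As a rough illustration of that kind of validation, here is a minimal scikit-learn sketch; the feature matrix, labels, and model choice are placeholders for illustration, not the study's actual data or pipeline.

```python
# Hypothetical sketch of leave-one-out AUC validation of a linear model,
# in the spirit of the Response Score training described above.
# X (e.g., immune transcript expression, mutational burden) and the binary
# response labels y are randomly generated placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))          # 40 patients x 5 assumed features
y = rng.integers(0, 2, size=40)       # 1 = responder (placeholder labels)

loo = LeaveOneOut()
scores = np.empty(len(y))
for train_idx, test_idx in loo.split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    # The held-out patient receives a continuous score from the linear model.
    scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]

print("Leave-one-out AUC:", roc_auc_score(y, scores))
```

The point of the sketch is only the validation scheme: every patient is scored by a model that never saw that patient, and the AUC is computed once over all held-out scores.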
British Journal of Cancer, 2005
Journal of Pathology Informatics, 2017
The University of Pittsburgh's Department of Biomedical Informatics and Division of Pathology Informatics created a Science, Technology, Engineering, and Mathematics (STEM) pipeline in 2011 dedicated to providing cutting-edge informatics research and career-preparatory experiences to a diverse group of highly motivated high-school students. In this third editorial installment describing the program, we provide a brief overview of the pipeline, report on achievements of past scholars, and present results from self-reported assessments by the 2015 cohort of scholars. The pipeline continues to expand, with the 2015 addition of the innovation internship and the introduction of a program in 2016 aimed at offering first-time research experiences to undergraduates who are underrepresented in pathology and biomedical informatics. Achievements of program scholars include authorship of journal articles, symposium and summit presentations, and attendance at top-25 universities. All of ...
AMIA Annual Symposium Proceedings, 2010
This study explored the possibility that semantic distance metrics can be used to develop methods for auditing biomedical ontologies. We developed and tested an approach using the Foundational Model of Anatomy (FMA) and the body-structure taxonomy of SNOMED CT. We evaluated 190 class pairs in human anatomical structures using three semantic distance metrics: simple edge count, normalized path length, and information content. We applied principal component analysis (PCA) to study relationships between the semantic distance measurements so produced in FMA and SNOMED CT. We found that our application of PCA could detect significant discrepancies, but not necessarily outright mistakes, in the two ontologies. A review of discrepancies revealed that they often relate to multiple design perspectives employed in ontological definitions.
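To make the three metrics concrete, here is a toy sketch of computing them on a small is-a hierarchy and running PCA over the results. The graph, the corpus counts, and the particular normalization and information-content formulas are assumptions for illustration only, not the FMA or SNOMED CT data or the paper's exact definitions.

```python
# Hypothetical sketch of three semantic distance measures on a toy is-a hierarchy,
# followed by PCA over the measurements. Not FMA/SNOMED CT data.
import math
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA

# Toy taxonomy: edges point from parent to child.
G = nx.DiGraph([("body", "limb"), ("limb", "arm"), ("limb", "leg"),
                ("body", "organ"), ("organ", "heart"), ("organ", "lung")])
U = G.to_undirected()
depth = dict(nx.shortest_path_length(G, "body"))     # depth of each class from the root
max_depth = max(depth.values())

def edge_count(a, b):
    """Simple edge count: shortest undirected path length between two classes."""
    return nx.shortest_path_length(U, a, b)

def normalized_path(a, b):
    # One assumed normalization: path length scaled by twice the maximum depth.
    return edge_count(a, b) / (2 * max_depth)

def information_content(term, counts, total):
    # IC = -log p(term); counts would come from a corpus or the ontology itself.
    return -math.log(counts[term] / total)

counts = {"limb": 30, "organ": 25, "arm": 10, "leg": 12, "heart": 8, "lung": 7}
total = sum(counts.values())

pairs = [("arm", "leg"), ("heart", "lung"), ("arm", "heart"), ("leg", "lung")]
# Third column is a simplistic IC difference, a stand-in for IC-based distance
# (in practice IC measures usually involve the most informative common ancestor).
rows = [[edge_count(a, b),
         normalized_path(a, b),
         abs(information_content(a, counts, total) - information_content(b, counts, total))]
        for a, b in pairs]

# PCA over the per-pair measurements, as in the auditing approach described above.
pca = PCA(n_components=2)
print(pca.fit_transform(np.array(rows)))
```

In the auditing setting, the interesting output is not the metrics themselves but pairs whose measurements sit far from the main PCA trend in one ontology relative to the other.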
This is the first part of a two-part report on the development of a statistical learning algorithm for a latent variable model referred to as the cooperative vector quantizer model. This part presents the theory and mathematical derivation of a variational Bayesian learning algorithm for the model. The model has general applications in machine learning and signal processing; for example, it can be used to solve the problem of blind source separation or image separation. Our special interest is in its potential biological application: we can use the model to simulate signal transduction components regulating gene expression as latent variables. The algorithm automatically and efficiently determines the number of latent variables of the model and estimates the distributions of the parameters and latent variables. Thus, we can use the model to address the following biological questions regarding gene expression regulation: (1) What are the key signal transduction components regulating gene expression in a given kind of cell? (2) How many key components are needed to efficiently encode information for gene expression regulation? (3) What are the states of the key components for a given gene expression data point? Such information will provide insight into the mechanisms of information organization in cells, mechanisms of disease, and drug effects/toxicity.
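For orientation, the cooperative vector quantizer is usually written as a linear superposition of binary latent sources; the notation below is an assumed sketch of that generative model, not copied from the report itself.

```latex
% Assumed sketch of the cooperative vector quantizer (CVQ) generative model.
% x : observed expression vector (dimension D); s : binary latent vector of
% K hidden sources (e.g., signal transduction components); W : D x K weights.
\begin{aligned}
  s_k &\sim \mathrm{Bernoulli}(\pi_k), \qquad k = 1, \dots, K,\\
  x \mid s &\sim \mathcal{N}\!\bigl(W s,\ \sigma^2 I\bigr).
\end{aligned}
```

Variational Bayesian learning in this setting places priors on W, the pi_k, and sigma^2 and maximizes a lower bound on the marginal likelihood; the automatic determination of the number of latent variables described above would then correspond, in an ARD-style treatment, to components whose posterior usage shrinks away.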
Cancer Informatics, 2015
Data quality is a recognized problem for high-throughput genomics platforms, as evinced by the proliferation of methods attempting to filter out lower quality data points. Different filtering methods lead to discordant results, raising the question: which methods are best? Astonishingly, little computational support is offered to analysts to decide which filtering methods are optimal for the research question at hand. To evaluate them, we begin with a pair of expression data sets, transcriptomic and proteomic, on the same samples. This pair of data sets forms a test-bed for the evaluation. Identifier mapping between the data sets creates a collection of feature pairs, with correlations calculated for each pair. To evaluate a filtering strategy, we estimate posterior probabilities for the correctness of probesets accepted by the method. An analyst can set expected utilities that represent the trade-off between the quality and quantity of accepted features. We tested nine published prob...
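As a toy illustration of the quality-versus-quantity trade-off described above, the sketch below scores two made-up filters by expected utility. The posterior probabilities, utility values, and filter thresholds are invented for illustration and are not the paper's estimates.

```python
# Hypothetical sketch of comparing filtering methods by expected utility:
# each accepted feature contributes a gain weighted by its posterior probability
# of being correct (well mapped and well measured), minus a loss weighted by the
# probability that it is not. All numbers are invented.
import numpy as np

def expected_utility(posterior_correct, accepted, u_tp=1.0, l_fp=2.0):
    p = posterior_correct[accepted]
    return float(np.sum(p * u_tp - (1.0 - p) * l_fp))

rng = np.random.default_rng(1)
posterior_correct = rng.uniform(0.2, 0.99, size=1000)   # one value per probeset

# Two toy "filters", expressed as boolean acceptance masks.
permissive = posterior_correct > 0.3
strict = posterior_correct > 0.8

for name, mask in [("permissive", permissive), ("strict", strict)]:
    print(name, "accepts", int(mask.sum()), "features,",
          "expected utility =", round(expected_utility(posterior_correct, mask), 1))
```

Raising the false-positive loss l_fp relative to u_tp favors stricter filters that accept fewer, higher-quality features; lowering it favors permissive filters, which is exactly the trade-off the analyst's utilities are meant to encode.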
Journal of Clinical Oncology, Sep 1, 2000
This paper describes a comprehensive biomathematical cancer modeling facility and its use in meeting the challenge of developing better cancer treatment strategies by thoroughly exploiting the explosion of cancer biology knowledge. Using information about the biology of cancer cells in the treatment of patients requires synthesizing this information to draw conclusions about the whole tumor/patient relationship, hopefully pointing to better treatments. Oncology Thinking Cap (OncoTCAP) does the required synthesis by providing both a Monte Carlo simulation engine and an analytic engine that yields the joint probability generating function and, from it, the probability of cure.
1. INTRODUCTION. In this paper we describe a cancer modeling facility, OncoTCAP, and how it can be used to develop better cancer treatment strategies.
1.1. Rapidly developing knowledge in cancer biology and treatment. The ongoing enormous explosion of research providing detailed information about the ge...
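To give a flavor of the two engines mentioned, here is a minimal sketch for a single-type birth-death model of tumor cells, where "probability of cure" is the probability that the cell population goes extinct. The rates and initial cell count are invented, the analytic formula is the textbook special case for this simple model, and real OncoTCAP models are far richer.

```python
# Hypothetical sketch: probability of cure (extinction of all tumor cells) for a
# simple single-type birth-death branching process, computed both analytically and
# by Monte Carlo, echoing OncoTCAP's pairing of a simulation engine with an
# analytic generating-function engine. Rates and cell count are invented.
import numpy as np

birth, death, n0 = 1.1, 0.9, 5        # per-cell event rates and initial cell count

# Analytic special case: each founding cell's lineage goes extinct with probability
# min(1, death/birth); lineages are independent, so cure = that value to the n0.
p_cure_analytic = min(1.0, death / birth) ** n0

def simulate_cured(rng, escape=200):
    """Follow the embedded jump chain until extinction or escape to a large size."""
    n, p_up = n0, birth / (birth + death)
    while 0 < n < escape:
        n += 1 if rng.random() < p_up else -1
    return n == 0                      # escape is treated as "not cured"

rng = np.random.default_rng(2)
trials = 5_000
p_cure_mc = sum(simulate_cured(rng) for _ in range(trials)) / trials

print(f"analytic: {p_cure_analytic:.3f}   Monte Carlo: {p_cure_mc:.3f}")
```

The two estimates should agree within Monte Carlo error, which is the basic cross-check such a paired simulation/analytic facility makes possible.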
Proceedings of the 1999 conference on Computer support for collaborative learning - CSCL '99, 1999