Nkechinyere Agu | Rensselaer Polytechnic Institute
Papers by Nkechinyere Agu
arXiv (Cornell University), Jul 31, 2021
Despite the progress in automatic detection of radiologic findings from chest X-ray (CXR) images in recent years, a quantitative evaluation of the explainability of these models is hampered by the lack of locally labeled datasets for different findings. With the exception of a few expert-labeled small-scale datasets for specific findings, such as pneumonia and pneumothorax, most of the CXR deep learning models to date are trained on global "weak" labels extracted from text reports, or trained via a joint image and unstructured text learning strategy. Inspired by the Visual Genome effort in the computer vision community, we constructed the first Chest ImaGenome dataset with a scene graph data structure to describe 242,072 images. Local annotations are automatically produced using a joint rule-based natural language processing (NLP) and atlas-based bounding box detection pipeline. Through a radiologist-constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, we provide: i) 1,256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph per image, ii) over 670,000 localized comparison relations (for improved, worsened, or no change) between the anatomical locations across sequential exams, and iii) a manually annotated gold standard scene graph dataset from 500 unique patients.
Journal of Biomedical Semantics
Background: Clinical decision support systems have been widely deployed to guide healthcare decisions on patient diagnosis, treatment choices, and patient management through evidence-based recommendations. These recommendations are typically derived from clinical practice guidelines created by clinical specialties or healthcare organizations. Although there have been many different technical approaches to encoding guideline recommendations into decision support systems, much of the previous work has not focused on enabling system-generated recommendations through the formalization of changes in a guideline, the provenance of a recommendation, and applicability of the evidence. Prior work indicates that healthcare providers may not find that guideline-derived recommendations always meet their needs for reasons such as lack of relevance, transparency, time pressure, and applicability to their clinical practice. Results: We introduce several semantic techniques that model diseases based ...
We address the problem of modeling study populations in research studies in a declarative manner. Research studies often have a great degree of variability in the reporting of population descriptions. To make study populations easily accessible for decision making related to study applicability, we will show the usage of our ontology-enabled prototype system in different applications. Our system leverages our Study Cohort Ontology and the related cohort Knowledge Graph (as described in our accepted resource track paper). We aim to address three retrospective population analysis scenarios, designed specifically to determine the study match, determine the study limitations, and evaluate the study quality. We also provide visualizations of a patient (or patient population) to a treatment arm. In addition, for each guideline recommendation that depends upon a study, we provide a summary of the relevant study's cohort description. We describe some of our applications and their potential impacts. Resou...
In an ideal world, the evidence presented in a clinical guideline would cover all aspects of patient care and would apply to all types of patients; however, in practice, this is rarely the case. Existing medical decision support systems are often simplistic, rule-based, and not easily adaptable to changing literature or medical guidelines. We are exploring ways to enable clinical decision support systems with Semantic Web technologies that have the potential to support representation of, and linking to, details in related items in the scientific literature, and that can quickly adapt to changing information from the guidelines. In this paper, we present the ontologies and our Semantic Web-based tools aimed at trustworthy clinical decision support in three distinct areas: guideline representation and reasoning, guideline provenance, and study cohort modeling.
ArXiv, 2021
Despite the progress in automatic detection of radiologic findings from Chest X-Ray (CXR) images in recent years, a quantitative evaluation of the explainability of these models is hampered by the lack of locally labeled datasets for different findings. With the exception of a few expert-labeled small-scale datasets for specific findings, such as pneumonia and pneumothorax, most of the CXR deep learning models to date are trained on global "weak" labels extracted from text reports, or trained via a joint image and unstructured text learning strategy. Inspired by the Visual Genome effort in the computer vision community, we constructed the first Chest ImaGenome dataset with a scene graph data structure to describe 242,072 images. Local annotations are automatically produced using a joint rule-based natural language processing (NLP) and atlas-based bounding box detection pipeline. Through a radiologist-constructed CXR ontology, the annotations for each CXR are connected as an anatomy-cen...
We will demonstrate a reusable framework for developing knowledge graphs that supports general, open-ended development of knowledge curation, interaction, and inference. Knowledge graphs need to be easily maintainable and usable in sometimes complex application settings. Often, scaling knowledge graph updates can require developing a knowledge curation pipeline that either replaces the graph wholesale whenever updates are made, or requires detailed tracking of knowledge provenance across multiple data sources. Fig. 1 shows how Whyis provides a semantic analysis ecosystem: an environment that supports research and development of semantic analytics for which we previously had to build custom applications [3,4]. Users interact through a suite of knowledge graph views driven by the node type and view requested in the URL. Knowledge curation methods include Semantic ETL, external linked data mapping, and Natural Language Processing (NLP). Autonomous inference agents expand the available k...
Lecture Notes in Computer Science, 2019
We present Whyis, the first framework for creating custom provenance-driven knowledge graphs. Whyis knowledge graphs are based on nanopublications, which simplifies and standardizes the production of structured, provenance-supported knowledge in knowledge graphs. To demonstrate Whyis, we created BioKG, a probabilistic biology knowledge graph, and populated it with well-used drug and protein content from DrugBank, UniProt, and OBO Foundry ontologies. As shown with BioKG, knowledge graph developers can use Whyis to configure custom knowledge curation pipelines using data importers and semantic extract, transform, and load scripts. Whyis also contains a knowledge meta-analysis capability for use in customizable graph exploration. The flexible, nanopublication-based architecture of Whyis lets knowledge graph developers integrate, extend, and publish knowledge from heterogeneous sources on the web.
Providing provenance of treatment suggestions made by clinical decision support systems can enhance transparency and trust in these systems by healthcare practitioners. Provenance can aid in resolving ambiguity and conflicts between various guideline sources. We have developed a guideline provenance ontology, G-Prov, by extending existing provenance ontologies, to enable accurate encoding of the source of the reasoning rules that decision support systems rely on to generate diagnosis and treatment suggestions. Our ontology enables provenance representation at different granularity levels within guidelines. For instance, G-Prov can be used to annotate rules with citations found in evidence sentences as well as other sources of knowledge, such as figures and tables. Additionally, we have developed an application to show a range of use cases for our ontology. We demonstrate our work annotating recommendations in a CPG for Type-2 Diabetes and discuss how our approach could be used in va...
Treatment recommendations in clinical practice guidelines (CPG) are supported by evidence from research studies that utilize populations with highly selective sociodemographic and comorbid characteristics. When physicians are treating complicated patients who do not wholly align with guideline recommendations, they need to determine the applicability of a study to their clinical population. We have designed the Study Cohort Ontology (SCO) and used it to build a knowledge graph (KG) exposing study populations in the CPGs published by the American Diabetes Association (ADA).
I. FROM TABLE TO KNOWLEDGE GRAPH
CPGs exhibit an implicit evidence model since guideline recommendations are based on evidence from clinical trials and observational case studies, referred to here as research studies. Our knowledge representation approach exposes descriptions of study populations, which are often reported in the first table of research studies, hence referred to as Table 1s. We analyzed research s...
Medical Image Computing and Computer Assisted Intervention – MICCAI 2021