M Musen | Stanford University

Papers by M Musen

Optimize First, Buy Later: Analyzing Metrics to Ramp-Up Very Large Knowledge Bases

Lecture Notes in Computer Science, 2010

As knowledge bases move into the landscape of larger ontologies and have terabytes of related data, we must work on optimizing the performance of our tools. We are easily tempted to buy bigger machines or to fill rooms with armies of little ones to address the scalability problem. Yet careful analysis and evaluation of the characteristics of our data, using metrics, often leads to dramatic improvements in performance. Firstly, are current scalable systems scalable enough? We found that for large or deep ontologies (some as large as 500,000 classes) it is hard to say, because benchmarks obscure the load-time costs for materialization. Therefore, to expose those costs, we have synthesized a set of more representative ontologies. Secondly, in designing for scalability, how do we manage knowledge over time? By optimizing for data distribution and ontology evolution, we have reduced the population time, including materialization, for the NCBO Resource Index, a knowledge base of 16.4 billion annotations linking 2.4 million terms from 200 ontologies to 3.5 million data elements, from one week to less than one hour for one of the large datasets on the same machine.

Discussion of "biomedical ontologies: toward scientific debate"

Methods of information in medicine, 2011

The ontology life cycle: Integrated tools for editing, publishing, peer review, and evolution of ontologies

AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, 2010

Ontologies have become a critical component of many applications in biomedical informatics. However, the landscape of ontology tools today is largely fragmented, with independent tools for ontology editing, publishing, and peer review: users develop an ontology in an ontology editor, such as Protégé, and publish it on a Web server or in an ontology library, such as BioPortal, in order to share it with the community; they use the tools provided by the library, or mailing lists and bug trackers, to collect feedback from users. In this paper, we present a set of tools that bring ontology editing and publishing closer together, in an integrated platform for the entire ontology lifecycle. This integration streamlines the workflow for collaborative development and increases integration between the ontologies themselves through the reuse of terms.

Comparison of ontology-based semantic-similarity measures

AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, 2008

Semantic-similarity measures quantify concept similarities in a given ontology. Potential applications for these measures include search, data mining, and knowledge discovery in database or decision-support systems that utilize ontologies. To date, there have not been comparisons of the different semantic-similarity approaches on a single ontology. Such a comparison can offer insight into the validity of different approaches. We compared 3 approaches to semantic-similarity metrics (which rely on expert opinion, ontologies only, and information content) with 4 metrics applied to SNOMED-CT. We found that there was poor agreement between the metrics based on information content and the ontology-only metric. The metric based only on the ontology structure correlated most with expert opinion. Our results suggest that metrics based on the ontology only may be preferable to information-content-based metrics, and point to the need for more research on validating the different approaches.
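To make the idea of an ontology-only metric concrete, the following is a minimal sketch of a path-based similarity over a toy is-a hierarchy. It is an illustration under stated assumptions, not one of the 4 metrics evaluated in the paper: the concept names, the `PARENT` table, and the `path_similarity` function are all hypothetical, and the formula (inverse of one plus the path length) is just one common choice.

```python
# Toy is-a hierarchy (child -> parent). Concept names are illustrative only
# and are not taken from SNOMED-CT.
PARENT = {
    "viral pneumonia": "pneumonia",
    "bacterial pneumonia": "pneumonia",
    "pneumonia": "lung disease",
    "asthma": "lung disease",
    "lung disease": None,
}


def ancestors(concept):
    """Return the path from a concept up to the root, starting with the concept."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def path_similarity(a, b):
    """Similarity = 1 / (1 + number of is-a edges between a and b
    through their nearest common ancestor)."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {concept: i for i, concept in enumerate(path_b)}
    for i, concept in enumerate(path_a):
        if concept in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[concept])
    return 0.0  # no common ancestor


sibling_score = path_similarity("viral pneumonia", "bacterial pneumonia")
cousin_score = path_similarity("viral pneumonia", "asthma")
```

In this toy hierarchy the two pneumonia subtypes score higher than a pneumonia subtype paired with asthma, because fewer is-a edges separate them. An information-content metric would additionally need concept frequencies from a corpus, which this sketch does not model.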

Knowledge Zone: a public repository of peer-reviewed biomedical ontologies

Studies in health technology and informatics, 2007

Reuse of ontologies is important for achieving better interoperability among health systems and relieving knowledge engineers from the burden of developing ontologies from scratch. Most of the work that aims to facilitate ontology reuse has focused on building ontology libraries that are simple repositories of ontologies or has led to keyword-based search tools that search among ontologies. To our knowledge, there are no operational methodologies that allow users to evaluate ontologies and to compare them in order to choose the most appropriate ontology for their task. In this paper, we present Knowledge Zone, a Web-based portal that allows users to submit their ontologies, to associate metadata with their ontologies, to search for existing ontologies, to find ontology rankings based on user reviews, to post their own reviews, and to rate reviews.

Bioterrorism preparedness and response: use of information technologies and decision support systems

Evidence report/technology assessment (Summary), 2002

This report may be used, in whole or in part, as the basis for development of clinical practice guidelines and other quality enhancement tools, or as a basis for reimbursement and coverage policies. Endorsement by the Agency for Healthcare Research and Quality (AHRQ) or the U.S. Department of Health and Human Services (DHHS) of such derivative products may not be stated or implied.

Integrating a modern knowledge-based system architecture with a legacy VA database: the ATHENA and EON projects at Stanford

Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 1999

We present a methodology and database mediator tool for integrating modern knowledge-based systems, such as the Stanford EON architecture for automated guideline-based decision support, with legacy databases, such as the Veterans Health Information Systems & Technology Architecture (VISTA) systems, which are used nationwide. Specifically, we discuss designs for database integration in ATHENA, a system for hypertension care based on EON, at the VA Palo Alto Health Care System. We describe a new database mediator that affords the EON system both physical and logical data independence from the legacy VA database. We found that to achieve our design goals, the mediator requires two separate mapping levels and must itself involve a knowledge-based component.

Of Brittleness and Bottlenecks: Challenges in the Creation of Pattern-Recognition and Expert-System Models

Machine Intelligence and Pattern Recognition, 1988

As tools for the construction of expert systems have become commonly available, workers in artificial intelligence (AI) have begun to pay increasing attention to the problems of building and maintaining large knowledge bases.

An evaluation model for syndromic surveillance: assessing the performance of a temporal algorithm

MMWR. Morbidity and mortality weekly report, Jan 26, 2005

Syndromic surveillance offers the potential to rapidly detect outbreaks resulting from terrorism. Despite considerable experience with implementing syndromic surveillance, limited evidence exists to describe the performance of syndromic surveillance systems in detecting outbreaks. Our objective was to describe a model for simulating cases that might result from exposure to inhalational anthrax and then to use the model to evaluate the ability of syndromic surveillance to detect an outbreak of inhalational anthrax after an aerosol release. Disease progression and health-care use were simulated for persons infected with anthrax. Simulated cases were then superimposed on authentic surveillance data to create test data sets. A temporal outbreak detection algorithm was applied to each test data set, and sensitivity and timeliness of outbreak detection were calculated by using syndromic surveillance. The earliest detection using a temporal algorithm was 2 days after a release. Earlier detection tended to occu...

SWRL-F

Proceedings of the International Conference on Web Intelligence, Mining and Semantics - WIMS '11, 2011

Enhancing Semantic Web technologies with the ability to express uncertainty and imprecision is a widely discussed topic. While SWRL can provide additional expressivity to OWL-based ontologies, it does not provide any way to handle uncertainty or imprecision. There is a pressing need to provide a standards-based, simple, and functioning solution. We describe an extension of SWRL called SWRL-F that we believe can

Section V Expert Systems and Algorithms-Knowledge-Based Systems-A Declarative Explanation Framework That Uses a Collection of Visualization Agents

Gestion du multilinguisme dans un portail d’ontologies: étude de cas pour le NCBO BioPortal

Terminologies and ontologies play a central role in the life sciences in structuring biomedical data and making them interoperable [2]. Using ontologies to index and integrate data resources is a way to add value to knowledge by facilitating search and data mining. However, the discoveries that could be made are often limited by the availability and processing of data in a single language, most often English, for which the largest number of ontologies and tools exist. As part of the project Indexation sémantique de ressources biomédicales francophones (SIFR, http://www.lirmm.fr/sifr), we are studying the management of multilingualism in the BioPortal platform (http://bioportal.bioontology.org) of the National Center for Biomedical Ontology (NCBO). BioPortal [7] makes it possible to access, visualize, search, and comment on more than 350 ontologies or

Ontology Versioning as an element of an ontology management framework

WebProtégé: a Web-based Development Environment for OWL 2 Ontologies

Policy brief on semantic interoperability

Sequential Usage Patterns in Collaborative Ontology-Engineering Projects

With the growing popularity of large-scale biomedical collaborative ontology-engineering projects, such as the creation of the 11th revision of the International Classification of Diseases, new methods and insights are needed to help project and community managers cope with the constantly growing complexity of such projects. In this paper we present a novel application of Markov chains to the change logs of collaborative ontology-engineering projects to extract and analyze sequential patterns. This method also allows us to investigate memory and structure in human activity patterns when collaboratively creating an ontology, by leveraging Markov chain models of varying orders. We describe all necessary steps for applying the methodology to collaborative ontology-engineering projects and provide first results for the International Classification of Diseases in its 11th revision. Furthermore, we show that the collected sequential patterns provide actionable information for community and project managers to monitor, coordinate, and dynamically adapt to the natural development processes that occur when collaboratively engineering an ontology. We hope that the adaption of the presented methodology will spur a new line of ontology-development tools and evaluation techniques, which concentrate on the interactive nature of the collaborative ontology-engineering process.
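As a rough illustration of the first-order case, the sketch below estimates Markov chain transition probabilities from sequences of change-log actions by simple counting (the maximum-likelihood estimate). The action names, the sample sessions, and the `transition_probabilities` helper are hypothetical and are not taken from the paper or from the ICD-11 change logs.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities P(next | current)
    by counting adjacent pairs of actions in each sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


# Hypothetical editing sessions, one list of actions per session.
logs = [
    ["edit_title", "edit_definition", "add_synonym"],
    ["edit_title", "edit_definition", "edit_definition"],
    ["edit_title", "add_synonym"],
]

probs = transition_probabilities(logs)
```

Here `probs["edit_title"]` gives the estimated distribution over the action that follows a title edit. A higher-order model, as mentioned in the abstract, would condition on the last k actions instead of one; a simple way to sketch that is to use tuples of k consecutive actions as the states.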

Method And System For Extraction And Normalization Of Relationships Via Ontology Induction

Frontiers in Artificial Intelligence and Applications

Collaborative RO1 with NCBO Semantics and Services Enabled Problem Solving Environment For Trypanosoma Cruzi

Derek Sleeman (University of Aberdeen)
