Logical development of the cell ontology

Terrence F Meehan et al. BMC Bioinformatics. 2011.

Abstract

Background: The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and assist with classification. Here we report on the generation of computable definitions for the hematopoietic cell types in the CL.

Results: Computable definitions for over 340 CL classes have been created using a genus-differentia approach. These define cell types according to multiple axes of classification such as the protein complexes found on the surface of a cell type, the biological processes participated in by a cell type, or the phenotypic characteristics associated with a cell type. We employed automated reasoners to verify the ontology and to reveal mistakes in manual curation. The implementation of this process exposed areas in the ontology where new cell type classes were needed to accommodate species-specific expression of cellular markers. Our use of reasoners also inferred new relationships within the CL, and between the CL and the contributing ontologies. This restructured ontology can be used to identify immune cells by flow cytometry, supports sophisticated biological queries involving cells, and helps generate new hypotheses about cell function based on similarities to other cell types.
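The genus-differentia pattern described above can be sketched as a small data structure. This is only an illustration, not the CL's actual OBO or OWL encoding: the `Definition` class and the `render` helper are hypothetical names, and the two definitions are the simplified ones quoted in the figure legends of this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """Genus-differentia definition: a parent class plus distinguishing relations."""
    genus: str
    differentia: frozenset  # set of (relationship, filler class) pairs


# Simplified definitions taken from the figure legends; the real CL axioms
# are richer. Each differentia draws on a different axis of classification
# (a GO biological process versus a cell-surface protein complex).
DEFINITIONS = {
    "helper T cell": Definition(
        genus="T cell",
        differentia=frozenset({("capable_of", "cytokine secretion")}),
    ),
    "alpha-beta T cell": Definition(
        genus="T cell",
        differentia=frozenset(
            {("has_plasma_membrane_part", "alpha-beta T cell receptor")}
        ),
    ),
}


def render(name: str) -> str:
    """Spell out a definition as a human-readable cross product string."""
    d = DEFINITIONS[name]
    parts = " and ".join(f"{rel} {filler}" for rel, filler in sorted(d.differentia))
    return f"{name} is_a {d.genus} that {parts}"
```

Calling `render("alpha-beta T cell")` spells out the genus followed by each differentia, which is the same information a reasoner consumes when it classifies other cell types against the definition.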

Conclusion: Use of computable definitions enhances the development of the CL and supports the interoperability of OBO ontologies.

Figures

Figure 1

Ontologies and relationships used in computable definitions. (a) Ovals indicate the OBO Foundry ontologies whose classes are used in computable definitions with the CL. Arrows indicate the relationships used with a given ontology. The Protein Ontology and the cellular component of the Gene Ontology both use the 6 relationships describing expression. (b) Graphical representation of how cross product classes are used to define a natural killer cell type.

Figure 2

Need for new cell types indicated during generation of computable definitions. (a) The CL class "hematopoietic stem cell" and its computable definitions with the PRO ontology. New is_a subtypes of hematopoietic stem cell are created to represent species-specific expression of protein types. Abbreviations for protein types are used for clarity. (b) General hematopoietic progenitor cell types with characteristics shared by many species appear in the middle. Sub-types of progenitor cells classified by human-specific cell surface markers appear on the left and sub-types with mouse-specific cell surface markers appear on the right. Note that a developmental lineage of species-specific cell types can be determined by following the develops_from relationships.

Figure 3

Computable definitions allow automated reasoners to infer new relationships. (a) Before reasoning, the class "helper T cell" on the right of the graph has no is_a sub-types. (b) Representation of the same entities after reasoning over the ontology. "Helper T cell" has numerous implied is_a sub-types based on its computable definition "helper T cell is_a T cell capable_of cytokine secretion".
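The inference step in this figure can be imitated with a toy classifier. The sketch below is not how an OWL or OBO reasoner is implemented; it only checks whether a class has the genus among its asserted is_a ancestors and carries every differentia. The two "example" classes and "marker X" are hypothetical placeholders, and for simplicity relationships are not inherited from parent classes.

```python
# Asserted facts per class: one asserted is_a parent and a set of relationships.
# Only "T cell" and "helper T cell" come from the figure legend; the "example"
# classes and "marker X" are hypothetical placeholders.
ASSERTED = {
    "T cell": {"is_a": None, "rels": set()},
    "helper T cell": {
        "is_a": "T cell",
        "rels": {("capable_of", "cytokine secretion")},
    },
    "example T cell A": {
        "is_a": "T cell",
        "rels": {
            ("capable_of", "cytokine secretion"),
            ("has_plasma_membrane_part", "marker X"),
        },
    },
    "example T cell B": {
        "is_a": "T cell",
        "rels": {("has_plasma_membrane_part", "marker X")},
    },
}


def ancestors(name):
    """Return the class itself followed by its asserted is_a ancestors."""
    chain = []
    while name is not None:
        chain.append(name)
        name = ASSERTED[name]["is_a"]
    return chain


def inferred_subtypes(defined_class, genus, differentia):
    """Classes that satisfy the genus and every differentia of defined_class."""
    return sorted(
        name
        for name, facts in ASSERTED.items()
        if name != defined_class
        and genus in ancestors(name)
        and differentia <= facts["rels"]
    )
```

With the definition "helper T cell is_a T cell capable_of cytokine secretion", `inferred_subtypes("helper T cell", "T cell", {("capable_of", "cytokine secretion")})` picks out only the placeholder class that is both a T cell and capable of cytokine secretion.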

Figure 4

Inferred relationships provide clarity within CL. (a) In the asserted (i.e. unreasoned) hierarchy, a large number of classes must be viewed to determine that mature basophils and mature eosinophils develop from the same progenitor cell type. D = develops_from relationship. (b) Relationships inferred by automated reasoning show directly the shared progenitor cell type from which mature basophils and mature eosinophils develop. Dashed lines indicate inferred relationships.
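The idea behind panel (b) can be sketched by following develops_from links transitively. The intermediate and progenitor class names below are placeholders rather than CL classes, and the sketch assumes each class has a single asserted develops_from parent, which a real ontology does not guarantee.

```python
# Asserted direct develops_from edges (cell type -> the type it develops from).
# All names except the two mature cell types are hypothetical placeholders.
DEVELOPS_FROM = {
    "mature basophil": "basophil intermediate 2",
    "basophil intermediate 2": "basophil intermediate 1",
    "basophil intermediate 1": "shared progenitor",
    "mature eosinophil": "eosinophil intermediate 1",
    "eosinophil intermediate 1": "shared progenitor",
}


def progenitors(cell):
    """Follow develops_from transitively; nearest progenitor comes first."""
    chain = []
    while cell in DEVELOPS_FROM:
        cell = DEVELOPS_FROM[cell]
        chain.append(cell)
    return chain


def nearest_shared_progenitor(a, b):
    """First class on a's lineage that also appears on b's lineage, if any."""
    lineage_b = set(progenitors(b))
    for candidate in progenitors(a):
        if candidate in lineage_b:
            return candidate
    return None
```

Here `nearest_shared_progenitor("mature basophil", "mature eosinophil")` returns the placeholder shared class without the reader having to walk each intermediate step by hand.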

Figure 5

Errors in manual curation are discovered with the use of automated reasoners and disjointness statements. (a) The gamma-delta T cell type is inferred to be a sub-type of alpha-beta T cell, a violation of the disjointness relationship asserted between the two cell types (not shown). The inferred is_a relationship results from the overly general cross product class "alpha-beta T cell is_a T cell that has_plasma_membrane_part T cell receptor complex". (b) Corrected version of the ontology, where alpha-beta T cell is described as a "T cell that has_plasma_membrane_part alpha-beta T cell receptor". ρ = has_plasma_membrane_part
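The error in panel (a) and its correction in panel (b) can be reproduced with a toy check. This is a sketch under stated assumptions, not the reasoner used by the authors: the receptor hierarchy and the membrane part asserted for gamma-delta T cell are simplifications assumed for illustration, and `violations` is a hypothetical helper.

```python
# Simplified is_a hierarchy of plasma membrane parts (child -> parent).
PART_IS_A = {
    "alpha-beta T cell receptor": "T cell receptor complex",
    "gamma-delta T cell receptor": "T cell receptor complex",
}

# Asserted cell types as (genus, has_plasma_membrane_part filler).
# The filler given for gamma-delta T cell is an assumption for illustration.
CELLS = {
    "alpha-beta T cell": ("T cell", "alpha-beta T cell receptor"),
    "gamma-delta T cell": ("T cell", "gamma-delta T cell receptor"),
}

# Pairs of classes asserted to share no members.
DISJOINT = {frozenset({"alpha-beta T cell", "gamma-delta T cell"})}


def part_subsumed_by(part, target):
    """True if part equals target or is an is_a descendant of it."""
    while part is not None:
        if part == target:
            return True
        part = PART_IS_A.get(part)
    return False


def violations(defined_class, genus, required_part):
    """Classes inferred under defined_class despite being disjoint with it."""
    found = []
    for name, (cell_genus, part) in CELLS.items():
        if name == defined_class:
            continue
        inferred_subtype = cell_genus == genus and part_subsumed_by(part, required_part)
        if inferred_subtype and frozenset({name, defined_class}) in DISJOINT:
            found.append(name)
    return found
```

Defining alpha-beta T cell with the general "T cell receptor complex" makes `violations("alpha-beta T cell", "T cell", "T cell receptor complex")` report gamma-delta T cell, while the narrower "alpha-beta T cell receptor" yields no violations.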

Figure 6

Unexpected inferred relationships. (a) Natural T-regulatory cell (nTreg) is inferred to be a type of induced regulatory T cell (iTreg). The red zig-zag line represents a redundant develops_from relationship, as the reasoner infers that both regulatory T cell (Treg) classes ultimately develop from the same double-positive thymocyte class. (b) By using the develops_from relationship in the computable definition, Treg classes are defined by the type of class they directly develop from. nTregs are no longer inferred to be an is_a sub-type of iTreg. (c) Mature NK T cell is inferred to be a sub-type of mucosal invariant T cell (MAIT). (d) Addition of an "innate effector T cell" class and refinement of the computable definitions for NK T cells and MAIT cells lead to a new grouping of cell types by the reasoner. C = capable_of, Green = GO-biological process, Teal = GO-molecular function.
