NCBI Taxonomy: a comprehensive update on curation, resources and tools

Review

Database (Oxford). 2020 Jan 1:2020:baaa062.

doi: 10.1093/database/baaa062.

Conrad L Schoch, Stacy Ciufo, Mikhail Domrachev, Carol L Hotton, Sivakumar Kannan, Rogneda Khovanskaya, Detlef Leipe, Richard Mcveigh, Kathleen O'Neill, Barbara Robbertse, Shobha Sharma, Vladimir Soussov, John P Sullivan, Lu Sun, Seán Turner, Ilene Karsch-Mizrachi

PMID: 32761142
PMCID: PMC7408187

Conrad L Schoch et al. Database (Oxford). 2020.

Abstract

The National Center for Biotechnology Information (NCBI) Taxonomy includes organism names and classifications for every sequence in the nucleotide and protein sequence databases of the International Nucleotide Sequence Database Collaboration. Since the last review of this resource in 2012, it has undergone several improvements. Most notable is the shift from a single SQL database to a series of linked databases tied to a framework of data called NameBank. This means that relations among data elements can be adjusted in more detail, resulting in expanded annotation of synonyms, the ability to flag names with specific nomenclatural properties, enhanced tracking of publications tied to names and improved annotation of scientific authorities and types. Additionally, practices utilized by NCBI Taxonomy curators specific to major taxonomic groups are described, terms peculiar to NCBI Taxonomy are explained, external resources are acknowledged and updates to tools and other resources are documented. Database URL: https://www.ncbi.nlm.nih.gov/taxonomy.

Published by Oxford University Press 2020.

Figures

Figure 1

Summarized flow of NCBI Taxonomy information.

Figure 2

Species names added over time to NCBI Taxonomy. The first occurrence of each species in the NCBI Taxonomy was determined by the created date of its associated TaxNode. This date represents the first addition of the species into the database irrespective of subsequent name changes.

Figure 3

Estimate of the percentage of formal species names missing from the public NCBI databases. Curves were generated by plotting the number of formal species in the NCBI Taxonomy against the running total of described species in the corresponding group by the end of the year. The IJSEM was used as the source for bacteria. The International Plant Names Index (IPNI; 27) was used as the source for the green plants. The Species 2000 Annual Checklist (46) was used as the source for invertebrates and Fungi. Vertebrate data were collected from the Catalogue of Fishes (21), Amphibian Species of the World (17), the Reptile Database (32), Avibase (19) and the American Society of Mammalogists (18). Archaea and viruses were omitted for having a small number of species and a specialized process for reporting new species, respectively.

Figure 4

Total number of names labeled as unpublished in NCBI Taxonomy, over time.

Figure 5

NCBI TaxBrowser example page.
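
The estimate described in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration under one plausible reading of the caption: for each year, the number of formal species already in NCBI Taxonomy is compared with the running total of described species in the group. The function name `percent_missing`, the input layout and all counts below are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def percent_missing(described_per_year, in_ncbi_by_year):
    """Return {year: percent of described species absent from NCBI Taxonomy}.

    described_per_year: {year: species newly described in that year}
    in_ncbi_by_year:    {year: formal species of the group present in
                         NCBI Taxonomy by the end of that year}
    Both inputs are assumed layouts chosen for this sketch.
    """
    years = sorted(described_per_year)
    # Running total of described species up to the end of each year.
    running_totals = accumulate(described_per_year[y] for y in years)
    result = {}
    for year, total in zip(years, running_totals):
        present = in_ncbi_by_year.get(year, 0)
        # Share of described species with no entry yet, as a percentage.
        result[year] = 100.0 * (total - present) / total if total else 0.0
    return result


# Hypothetical counts for illustration only; not data from the paper.
described = {2017: 1000, 2018: 200, 2019: 50}
in_ncbi = {2017: 500, 2018: 900, 2019: 1000}
estimate = percent_missing(described, in_ncbi)
```

With these made-up inputs the running totals are 1000, 1200 and 1250, so the estimated missing share falls from 50% to 25% to 20%. A real analysis would draw the described-species totals from the sources named in the caption (IJSEM, IPNI and the others).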

Cited by

Evolution of g-type lysozymes in metazoa: insights into immunity and digestive adaptations.
Mukherjee K, Moroz LL. Front Cell Dev Biol. 2024 Nov 6;12:1487920. doi: 10.3389/fcell.2024.1487920. eCollection 2024. PMID: 39568508. Free PMC article.
The development of a lateral flow immunochromatographic test strip for measurement of specific IgA and IgG antibodies level against porcine epidemic diarrhea virus in pig milk.
Jermsutjarit P, Venkateswaran D, Indrawattana N, Na Plord J, Tantituvanont A, Nilubol D. Vet Q. 2024 Dec;44(1):1-15. doi: 10.1080/01652176.2024.2429472. Epub 2024 Nov 21. PMID: 39568374. Free PMC article.
Emergency wound site infection caused by Gulosibacter massiliensis: a case report.
Li W, Zhang R, Liu L, Wang C, Sun Y, Dai Y, Yang X, Lin S. BMC Infect Dis. 2024 Nov 13;24(1):1291. doi: 10.1186/s12879-024-10187-5. PMID: 39538156. Free PMC article.
A catalogue of chromosome counts for Phylum Nematoda.
Blaxter ML, Leech C, Lunt DH. Wellcome Open Res. 2024 Feb 19;9:55. doi: 10.12688/wellcomeopenres.20550.1. eCollection 2024. PMID: 39534537. Free PMC article.
Taxonomy Identifiers (TaxId) for Biodiversity Genomics: a guide to getting TaxId for submission of data to public databases.
Blaxter M, Pauperio J, Schoch C, Howe K. Wellcome Open Res. 2024 Oct 15;9:591. doi: 10.12688/wellcomeopenres.22949.1. eCollection 2024. PMID: 39526195. Free PMC article.

References

1. Karsch-Mizrachi I., Takagi T. and Cochrane G. (2018) The international nucleotide sequence database collaboration. Nucleic Acids Res., 46, D48–D51.
2. Strasser B.J. (2008) GenBank—natural history in the 21st century? Science, 322, 537–538.
3. Wilkinson M.D., Dumontier M., Aalbersberg I.J. et al. (2016) The FAIR guiding principles for scientific data management and stewardship. Sci. Data, 3, 160018.
4. Schuler G.D., Epstein J.A., Ohkawa H. et al. (1996) Entrez: molecular biology database and retrieval system. Methods Enzymol., 266, 141–162.
5. Federhen S. (2012) The NCBI taxonomy database. Nucleic Acids Res., 40, D136–D143.
