Finding useful data across multiple biomedical data repositories using DataMed
- Commentary
- Published: 26 May 2017
- Susanna-Assunta Sansone (ORCID: orcid.org/0000-0001-5306-5690) 3 na1,
- George Alter 4 na1,
- Ian Fore (ORCID: orcid.org/0000-0002-2926-9324) 5 na1,
- Jeffrey Grethe (ORCID: orcid.org/0000-0001-5212-7052) 6 na1,
- Hua Xu7 na1,
- Alejandra Gonzalez-Beltran3,
- Philippe Rocca-Serra3,
- Anupama E Gururaj7,
- Elizabeth Bell1,
- Ergin Soysal (ORCID: orcid.org/0000-0002-2107-0580) 7,
- Nansu Zong1 &
- …
- Hyeon-eui Kim1 na1
Nature Genetics volume 49, pages 816–819 (2017)
Abstract
The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the US National Institutes of Health (NIH) Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and search engine, which are based on metadata extracted from various data sets in a range of repositories. DataMed is designed to be, for data, what PubMed has been for the scientific literature. DataMed supports the findability and accessibility of data sets. These characteristics—along with interoperability and reusability—compose the four FAIR principles to facilitate knowledge discovery in today's big data–intensive science landscape.
Acknowledgements
This project is funded by grant U24AI117966 from NIAID, NIH, as part of the BD2K program. The co-authors, who are the lead investigators and chairs/co-chairs of the core activities, thank all contributors to the bioCADDIE consortium and list them in the Supplementary Note in alphabetical order within each activity group (each name appears only once even though many people participated in different activities).
Author information
Author notes
- Lucila Ohno-Machado, Susanna-Assunta Sansone, George Alter, Ian Fore, Jeffrey Grethe, Hua Xu and Hyeon-eui Kim: These authors contributed equally to this work.
Authors and Affiliations
- Health System Department of Biomedical Informatics, University of California, San Diego, La Jolla, California, USA
  Lucila Ohno-Machado, Elizabeth Bell, Nansu Zong & Hyeon-eui Kim
- Veterans Administration San Diego Healthcare System, San Diego, California, USA
  Lucila Ohno-Machado
- e-Research Centre, University of Oxford, Oxford, UK
  Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran & Philippe Rocca-Serra
- Department of History and Inter-University Consortium for Political and Social Research (ICPSR), Institute for Social Research, University of Michigan, Ann Arbor, Michigan, USA
  George Alter
- US National Institutes of Health, Bethesda, Maryland, USA
  Ian Fore
- Department of Neurosciences, University of California, San Diego, La Jolla, California, USA
  Jeffrey Grethe
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
  Hua Xu, Anupama E Gururaj & Ergin Soysal
Authors
- Lucila Ohno-Machado
- Susanna-Assunta Sansone
- George Alter
- Ian Fore
- Jeffrey Grethe
- Hua Xu
- Alejandra Gonzalez-Beltran
- Philippe Rocca-Serra
- Anupama E Gururaj
- Elizabeth Bell
- Ergin Soysal
- Nansu Zong
- Hyeon-eui Kim
Corresponding author
Correspondence to Lucila Ohno-Machado.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Cite this article
Ohno-Machado, L., Sansone, SA., Alter, G. et al. Finding useful data across multiple biomedical data repositories using DataMed. Nat Genet 49, 816–819 (2017). https://doi.org/10.1038/ng.3864
- Issue date: June 2017
- DOI: https://doi.org/10.1038/ng.3864