Finding useful data across multiple biomedical data repositories using DataMed

Nature Genetics volume 49, pages 816–819 (2017)

Abstract

The value of broadening searches for data across multiple repositories has been identified by the biomedical research community. As part of the US National Institutes of Health (NIH) Big Data to Knowledge initiative, we work with an international community of researchers, service providers and knowledge experts to develop and test a data index and search engine, which are based on metadata extracted from various data sets in a range of repositories. DataMed is designed to be, for data, what PubMed has been for the scientific literature. DataMed supports the findability and accessibility of data sets. These characteristics—along with interoperability and reusability—compose the four FAIR principles to facilitate knowledge discovery in today's big data–intensive science landscape.

Acknowledgements

This project is funded by grant U24AI117966 from NIAID, NIH, as part of the BD2K program. The co-authors, who are the lead investigators and chairs/co-chairs of the core activities, thank all contributors to the bioCADDIE consortium and list them in the Supplementary Note in alphabetical order within each activity group (each name appears only once even though many people participated in different activities).

Author information

Author notes

  1. Lucila Ohno-Machado, Susanna-Assunta Sansone, George Alter, Ian Fore, Jeffrey Grethe, Hua Xu and Hyeon-eui Kim: These authors contributed equally to this work.

Authors and Affiliations

  1. Health System Department of Biomedical Informatics, University of California, San Diego, La Jolla, California, USA
    Lucila Ohno-Machado, Elizabeth Bell, Nansu Zong & Hyeon-eui Kim
  2. Veterans Administration San Diego Healthcare System, San Diego, California, USA
    Lucila Ohno-Machado
  3. e-Research Centre, University of Oxford, Oxford, UK
    Susanna-Assunta Sansone, Alejandra Gonzalez-Beltran & Philippe Rocca-Serra
  4. Department of History and Inter-University Consortium for Political and Social Research (ICPSR), Institute for Social Research, University of Michigan, Ann Arbor, Michigan, USA
    George Alter
  5. US National Institutes of Health, Bethesda, Maryland, USA
    Ian Fore
  6. Department of Neurosciences, University of California, San Diego, La Jolla, California, USA
    Jeffrey Grethe
  7. School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
    Hua Xu, Anupama E Gururaj & Ergin Soysal

Authors

  1. Lucila Ohno-Machado
  2. Susanna-Assunta Sansone
  3. George Alter
  4. Ian Fore
  5. Jeffrey Grethe
  6. Hua Xu
  7. Alejandra Gonzalez-Beltran
  8. Philippe Rocca-Serra
  9. Anupama E Gururaj
  10. Elizabeth Bell
  11. Ergin Soysal
  12. Nansu Zong
  13. Hyeon-eui Kim

Corresponding author

Correspondence to Lucila Ohno-Machado.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Cite this article

Ohno-Machado, L., Sansone, SA., Alter, G. et al. Finding useful data across multiple biomedical data repositories using DataMed. Nat Genet 49, 816–819 (2017). https://doi.org/10.1038/ng.3864
