Extraction of Clinical Data from Electronic Health Records using Regular Expression

Abstract

The paper presents an automated method for extracting clinical data from Electronic Health Records (EHRs) using regular expressions in Python. The method achieved high precision and sensitivity on structured data relative to traditional extraction methods. The paper also highlights the challenges of managing unstructured clinical data and advocates techniques such as regular expressions for efficient extraction in support of patient care and research.
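
As a rough illustration of what regex-based extraction in Python can look like, the sketch below pulls a few vital signs out of a free-text note. The patterns, field names, and sample note are hypothetical and chosen only for illustration; they are not the expressions used in the paper.

```python
import re

# Hypothetical patterns for illustration only; the paper's actual
# regular expressions are not reproduced on this page.
PATTERNS = {
    # e.g. "BP: 128/82" -> ("128", "82")
    "blood_pressure": re.compile(r"\bBP[:\s]*(\d{2,3})\s*/\s*(\d{2,3})\b", re.IGNORECASE),
    # e.g. "HR 76" or "heart rate: 76" -> "76"
    "heart_rate": re.compile(r"\b(?:HR|heart rate)[:\s]*(\d{2,3})\b", re.IGNORECASE),
    # e.g. "Temp 37.2 C" -> "37.2"
    "temperature_c": re.compile(r"\b(?:temp|temperature)[:\s]*(\d{2}(?:\.\d)?)\s*C\b", re.IGNORECASE),
}


def extract_fields(note: str) -> dict:
    """Return the first match of each pattern found in a free-text note."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(note)
        if match is None:
            continue  # field not mentioned in this note
        groups = match.groups()
        # Single-group patterns yield a string, multi-group ones a tuple.
        result[name] = groups[0] if len(groups) == 1 else groups
    return result


# Hypothetical clinical note.
sample_note = "Pt seen today. BP: 128/82, HR 76, Temp 37.2 C. No acute distress."
fields = extract_fields(sample_note)
```

Keeping the patterns in a dictionary makes it easy to add or adjust a field without touching the extraction loop. Real clinical notes vary far more than this sample, so patterns like these would need validation against manually reviewed records.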

Key takeaways

The automated data extraction achieved 95% accuracy and 86% precision for structured data.
Extracted data can support secondary applications like disease prediction and prognosis.
The study processed data from 38,300,000 records obtained in a prior big data extraction study.
The sensitivity for unstructured data extraction was measured at 85%.
The text mining applications provide accessible data formats for over 38,000 patients.
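
The accuracy, precision, and sensitivity figures above are standard evaluation metrics, usually computed by comparing extracted values against a manually reviewed reference set. A minimal sketch of that calculation follows; the function name and the counts are hypothetical and are not taken from the paper.

```python
def extraction_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, and sensitivity from confusion counts.

    tp: values extracted and correct
    fp: values extracted but incorrect
    fn: values present in the record but missed
    tn: absent values correctly left unextracted
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must allow each metric to be computed")
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = extraction_metrics(tp=80, fp=10, fn=20, tn=90)
```

Precision penalizes incorrect extractions, while sensitivity penalizes missed ones, which is why the two are reported separately.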
