Extraction of Clinical Data from Electronic Health Records using Regular Expression


Patients with cardiovascular diseases undergo complex medical procedures, including medical imaging, blood biochemistry analysis, and physical examination. Digitizing this information is an intensive process in e-health; this paper therefore presents some methods for extracting important data from the medical records, organized in PDF files, of patients suffering from coronary artery diseases.
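As a rough illustration of the regular-expression approach named in the title, the sketch below pulls a few values out of report text. It assumes the text has already been extracted from the PDF by some other tool. The field names, labels, units, and the helper `extract_fields` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical field patterns: labels and units are illustrative assumptions,
# not the fields used in the paper.
FIELD_PATTERNS = {
    "ldl_mg_dl": re.compile(
        r"LDL(?:\s+cholesterol)?\s*[:=]\s*(\d+(?:\.\d+)?)\s*mg/dL", re.IGNORECASE
    ),
    "ejection_fraction_pct": re.compile(
        r"ejection\s+fraction\s*[:=]\s*(\d+(?:\.\d+)?)\s*%", re.IGNORECASE
    ),
    "blood_pressure": re.compile(
        r"blood\s+pressure\s*[:=]\s*(\d{2,3})\s*/\s*(\d{2,3})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return a dict with one entry per field; None when a field is not found."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            record[name] = None
        elif name == "blood_pressure":
            # Two capture groups: systolic and diastolic pressure.
            record[name] = (int(match.group(1)), int(match.group(2)))
        else:
            record[name] = float(match.group(1))
    return record


# Made-up sample text standing in for the text layer of a PDF report.
sample = (
    "Blood pressure: 135/85 mmHg. LDL cholesterol: 142.5 mg/dL. "
    "Ejection fraction = 55 %"
)
print(extract_fields(sample))
```

Keeping one compiled pattern per field makes it easy to add or adjust fields without touching the extraction loop. Real reports would likely need more tolerant patterns for varying labels and units.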

Rapid advances in computerized data acquisition have produced an immense volume of text from which information can be extracted. Most of this data is unstructured or semi-structured text. Text mining, natural language processing (NLP) techniques, and machine learning algorithms are used to convert this unstructured data into structured form. Cancer-related texts take the form of Electronic Health Records (EHR/EMR), and there are tools to extract the text. Health care and clinical practice create a lot of content: manifestations, test results, analyses, medicines, and outcomes for patients. This clinical content, reported in health records, is a potential source of information and an underused asset for improved health care. To improve patient care, information on diagnostic, prognostic, predisposing, and drug response markers is essential. This paper explores different text mining approaches using machine learning,...

Patients share key information about their health with medical practitioners during clinic consultations. This key information may include their past medications and allergies, current situations/issues, and expectations. Healthcare professionals store this information in an Electronic Medical Record (EMR). EMRs have empowered research in healthcare; the information hidden in them, if harnessed properly through Natural Language Processing (NLP), can be used for disease registries, drug safety, epidemic surveillance, disease prediction, and treatment. This work illustrates the application of NLP techniques to design and implement a Key Information Retrieval System (KIRS framework) using the Latent Dirichlet Allocation algorithm. The cross-industry standard process for data mining methodology was applied in an experiment with an EMR dataset from PubMed to demonstrate the framework. The new system extracted the common problems (ailments) and prescriptions across the five (5) countries pr...

Information Extraction (IE) is a natural language processing (NLP) task whose aim is to analyse texts written in natural language to extract structured and useful information such as named entities and the semantic relations between them. Information extraction is an important task in a diverse set of applications such as biomedical literature mining, customer care, community websites, and personal information management. In this paper, the authors focus only on information extraction from clinical reports. The two most fundamental tasks in information extraction are discussed, namely named entity recognition and relation extraction. The authors give details about the most used rule/pattern-based and machine learning techniques for each task. They also compare these techniques and summarize the advantages and disadvantages of each one.
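The two tasks named above can be sketched in their rule/pattern-based form as follows. This is a minimal illustration, not the method of any paper listed here: the lexicons, the `DRUG` and `DISEASE` labels, the cue words in `TREATS_CUE`, and the function names are all hypothetical.

```python
import re

# Hypothetical mini-lexicons; a real system would use a curated terminology.
DRUGS = ["aspirin", "metformin", "atorvastatin"]
DISEASES = ["hypertension", "type 2 diabetes", "angina"]


def lexicon_pattern(terms):
    """Compile a case-insensitive, whole-word pattern matching any of the terms."""
    # Longest terms first so multi-word terms are preferred over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


ENTITY_PATTERNS = {
    "DRUG": lexicon_pattern(DRUGS),
    "DISEASE": lexicon_pattern(DISEASES),
}

# Assumed cue words that signal a "treats" relation between a drug and a disease.
TREATS_CUE = re.compile(r"\b(?:for|to treat)\b", re.IGNORECASE)


def find_entities(text):
    """Named entity recognition: return (start, end, label, text) tuples in order."""
    entities = []
    for label, pattern in ENTITY_PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append((match.start(), match.end(), label, match.group(0).lower()))
    return sorted(entities)


def find_treats_relations(text):
    """Relation extraction: a drug followed by a cue word and then a disease."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = find_entities(sentence)
        drugs = [e for e in entities if e[2] == "DRUG"]
        diseases = [e for e in entities if e[2] == "DISEASE"]
        for drug in drugs:
            for disease in diseases:
                between = sentence[drug[1]:disease[0]]
                if drug[1] <= disease[0] and TREATS_CUE.search(between):
                    relations.append((drug[3], "treats", disease[3]))
    return relations


# Made-up clinical note used only for illustration.
report = (
    "Patient was started on Metformin for type 2 diabetes. "
    "Hypertension is controlled. Aspirin was continued."
)
print(find_entities(report))
print(find_treats_relations(report))
```

Pattern-based rules like these are transparent and need no training data, but they miss any term or phrasing that is not covered by the lexicons and cue words. That limitation is the usual trade-off against machine learning techniques.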

This paper addresses the problem of extracting and processing relevant information from unstructured electronic documents in the biomedical domain. The documents are full scientific papers. This problem poses several challenges, such as identifying text passages that contain relevant information, collecting the relevant pieces of information, populating a database and a data warehouse, and mining these data. For this purpose, this paper proposes the IEDSS-Bio, an environment for Information Extraction and Decision Support System in Biomedical domain. In a case study, experiments with machine learning for identifying relevant text passages (disease and treatment effects, and patient-number information in Sickle Cell Anemia papers) showed that the best results (95.9% accuracy) were obtained with a statistical method and the use of preprocessing techniques to resample the examples and to eliminate noise.
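The abstract does not say which resampling technique was used. As one common possibility, the sketch below shows random oversampling, which duplicates minority-class examples until every class has as many examples as the largest one. The function name `random_oversample`, the example passages, and the labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(examples, labels, seed=0):
    """Balance classes by randomly duplicating examples of the smaller classes.

    This is an assumed technique for illustration; the source only says that
    the examples were resampled.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example, label in zip(examples, labels):
        by_label[label].append(example)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_examples, out_labels = [], []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for example in items + extra:
            out_examples.append(example)
            out_labels.append(label)
    return out_examples, out_labels


# Made-up passages: most are irrelevant, few are relevant.
passages = [
    "Hydroxyurea reduced pain episodes.",
    "The study was funded by a grant.",
    "Authors declare no conflict of interest.",
    "Data were stored on a local server.",
    "A total of 120 patients were enrolled.",
]
labels = ["relevant", "irrelevant", "irrelevant", "irrelevant", "relevant"]

balanced_passages, balanced_labels = random_oversample(passages, labels)
print(Counter(balanced_labels))
```

Resampling of this kind should be applied to the training split only, so that duplicated examples do not leak into the evaluation data.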