Classifying Encounter Notes in the Primary Care Patient Record

Automatic assignment of diagnosis codes to free-form text medical note

JUCS - Journal of Universal Computer Science

International Classification of Disease (ICD) coding plays a significant role in classifying morbidity and mortality rates. Currently, ICD codes are assigned to a patient’s medical record by hand by medical practitioners or specialist clinical coders. This practice is prone to errors, and training skilled clinical coders requires time and human resources. Automatic prediction of ICD codes can help alleviate this burden. In this paper, we propose a transformer-based architecture with label-wise attention for predicting ICD codes on a medical dataset. The transformer model is first pre-trained from scratch on a medical dataset. Once this is done, the pre-trained model is used to generate representations of the tokens in the clinical documents, which are fed into the label-wise attention layer. Finally, the outputs from the label-wise attention layer are fed into a feed-forward neural network to predict appropriate ICD codes for the input document. We evaluate our model using hospital...
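The label-wise attention step described in this abstract can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: it assumes dot-product scoring between each token vector and one learned query vector per label, and it stands in a single per-label linear unit with a sigmoid for the feed-forward network. The names `label_wise_attention`, `predict_probs`, `label_queries`, `out_weights`, and `out_biases` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_wise_attention(token_vecs, label_queries):
    """Return one attention-weighted document vector per label.

    token_vecs: token representations (assumed to come from the encoder).
    label_queries: one query vector per label (assumed learned parameters).
    """
    dim = len(token_vecs[0])
    doc_vecs = []
    for query in label_queries:
        # Each label attends to the tokens with its own weights.
        weights = softmax([dot(h, query) for h in token_vecs])
        doc_vecs.append(
            [sum(w * h[j] for w, h in zip(weights, token_vecs)) for j in range(dim)]
        )
    return doc_vecs


def predict_probs(doc_vecs, out_weights, out_biases):
    """Per-label sigmoid output; a stand-in for the feed-forward network."""
    return [
        1.0 / (1.0 + math.exp(-(dot(v, w) + b)))
        for v, w, b in zip(doc_vecs, out_weights, out_biases)
    ]


# Toy input: two tokens, two labels, two-dimensional vectors.
tokens = [[1.0, 0.0], [0.0, 1.0]]
queries = [[0.0, 0.0], [10.0, 0.0]]
doc_vectors = label_wise_attention(tokens, queries)
probs = predict_probs(doc_vectors, [[0.0, 0.0], [1.0, -1.0]], [0.0, 0.0])
```

Because every label gets its own attention distribution, a code can focus on the few tokens that support it, which is the usual motivation for this layer when documents are long and the label set is large.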

Automatic classification of medical reports, the CIREA project

Choosing a patient's reasons for staying in hospital from among the 52,000 pathology codes listed in the ICD-10 (International Classification of Diseases) requires the practitioner to spend a large amount of time typing and searching, which may be discouraging. However, these codes are mandatory in many countries when the patient leaves the hospital, for biostatistical and administrative studies. The aim of the CIREA project is to propose an automatic ICD coding approach by mining textual medical reports. For that purpose we have proposed new algorithms such as the EDA desuffixer, the CLO3 classification algorithm and the K-measure indicator.

From episodes of care to diagnosis codes: automatic text categorization for medico-economic encoding

AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, 2008

We report on the design and evaluation of an original system to help assign ICD (International Classification of Disease) codes to clinical narratives. The task is defined as a multi-class multi-document classification task. We combine a set of machine learning and data-poor methods to generate a single automatic text categorizer, which returns a ranked list of ICD codes. The combined ranking system currently obtains a precision of 75% at high ranks and a recall of about 63% for the top twenty returned codes, against a theoretical upper bound of about 79% (inter-coder agreement). The performance of the data-poor classifier is weak, whereas the use of temporally-typed contents such as anamnesis and prescription free text sections results in a statistically significant improvement.
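The two figures reported in this abstract, precision at high ranks and recall over the top twenty returned codes, are standard measures for a ranked code list. The sketch below shows one common way to compute them in plain Python. The function names and the example codes are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
def precision_at_k(ranked_codes, gold_codes, k):
    """Fraction of the returned top-k codes that are in the gold set."""
    top = ranked_codes[:k]
    if not top:
        return 0.0
    return sum(1 for code in top if code in gold_codes) / len(top)


def recall_at_k(ranked_codes, gold_codes, k):
    """Fraction of the gold codes that appear among the top-k returned codes."""
    if not gold_codes:
        return 0.0
    return len(set(ranked_codes[:k]) & set(gold_codes)) / len(gold_codes)


# Illustrative data only: a ranked system output and a coder's gold set.
ranked = ["I10", "E11.9", "J45.9", "N39.0"]
gold = {"I10", "J45.9", "K21.9"}

p_at_1 = precision_at_k(ranked, gold, 1)
p_at_2 = precision_at_k(ranked, gold, 2)
r_at_4 = recall_at_k(ranked, gold, 4)
```

Averaging these values over all test documents gives corpus-level figures comparable in kind to those quoted in the abstract.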

Automated classification of cardiology diagnoses based on textual medical reports

Anais do Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2020), 2020

Automatic diagnosis of diseases has been a long-term challenge for Computer Science and related disciplines. Textual clinical reports can be a rich source of data for such diagnoses. However, building classification models from them is not a trivial task. The problem tackled in this work is the identification of the medical diagnoses that are indicated in these reports. In the past, several methods have been proposed for addressing this problem, but a method developed for reports in the cardiology area that are written in Portuguese is still needed. In this paper we describe a method that is able to handle the peculiarities of clinical reports, including the medical terminology, and that is implemented to correctly estimate the disease based on raw clinical reports and a list of the possible diagnoses. Experimental results show that our method has a high degree of accuracy, even for infrequent classes and complex databases.

Classification of Free Text Clinical Narratives (Short Review)

2011

The paper is a limited review of publications (1995-2010) related to the problem of classifying clinical records presented in free text form. The techniques of indexing and methods of classification are considered. We also pay special attention to the description of the document sets used in the reviewed research. Finally, we conclude with prospective research directions related to the topic.

Semi-supervised Automated Clinical Coding Using International Classification of Diseases

5th International Conference on Natural Language and Speech Processing (ICNLSP 2022), 2022

Clinical Text Notes (CTNs) contain physicians' reasoning process, written in an unstructured free text format, as they examine and interview patients. In recent years, several studies have been published that provide evidence for the utility of machine learning for predicting doctors' diagnoses from CTNs, a task known as ICD coding. Data annotation is time consuming, particularly when a degree of specialization is needed, as is the case for medical data. This paper presents a method of augmenting a sparsely annotated dataset of Icelandic CTNs with a machine-learned data imputation in a semi-supervised manner. We train a neural network on a small set of annotated CTNs and use it to extract clinical features from a set of un-annotated CTNs. These clinical features consist of answers to about a thousand potential questions that a physician might find the answers to during a consultation with a patient. The features are then used to train a classifier for the diagnosis of certain types of diseases. We report the results of an evaluation of this data augmentation method over three tiers of information that are available to a physician. Our data augmentation method shows a significant positive effect, which is diminished when an increasing number of clinical features, from the examination of the patient and diagnostics, are made available. Our method may be used for augmenting scarce datasets for systems that take decisions based on clinical features that do not include examinations or tests.