Interactive Software for Generation and Visualization of Structured Findings in Radiology Reports

Natural Language Processing in Radiology Reports

TMH Press, 2021

The clinical domain produces a large amount of data, and many of the textual entries are unstructured free text. In its existing form this text cannot be used: the majority of free-text clinical data in electronic medical records remains unusable. The problem with this data is captured by the 5 V's: Volume (quantity), Variety (format), Velocity (increasing), Value (richness), and Veracity (quality and integrity).

Automatic Structuring of Radiology Free-Text Reports

RadioGraphics, 2001

A natural language processor was developed that automatically structures the important medical information (eg, the existence, properties, location, and diagnostic interpretation of findings) contained in a radiology free-text document as a formal information model that can be interpreted by a computer program. The input to the system is a free-text report from a radiologic study. The system requires no reporting style changes on the part of the radiologist. Statistical and machine learning methods are used extensively throughout the system. A graphical user interface has been developed that allows the creation of hand-tagged training examples. Various aspects of the difficult problem of implementing an automated structured reporting system have been addressed, and the relevant technology is progressing well. Extensible Markup Language is emerging as the preferred syntactic standard for representing and distributing these structured reports within a clinical environment. Early successes hold out hope that similar statistically based models of language will allow deep understanding of textual reports. The success of these statistical methods will depend on the availability of large numbers of high-quality training examples for each radiologic subdomain. The acceptability of automated structured reporting systems will ultimately depend on the results of comprehensive evaluations.

Automatic Conversion of Clinical Notes into SNOMED CT at Point of Care

2006

Current information systems in use in medicine require clinicians to enter data into structured entry fields, where the type of data needs to be known and the resulting interfaces are inflexible. Natural Language Processing (NLP) can be used in this field to allow a freer and more natural method for clinicians to record facts about patients. This paper describes a generic interface with the ability to automatically classify free text. This was achieved through two approaches: creating an algorithm utilising NLP to automatically classify input text into SNOMED CT codes, and a generic way of generating interfaces that handle input data without requiring information to be put into specific fields. A generic interface generator was written and tested with various sample interface descriptions. While the interface generation was generic, the handling of the localised interface was identical to hand-crafted interfaces in both style and capabilities. The NLP component was written and tested with samples taken from the medical ontology and worked within acceptable time and accuracy limits for simplified input. The capabilities of the algorithm for detecting medical ontology terms are described. The generic nature of the interface generation also demonstrates how strongly localised the interfaces can be for different communities without loss of processing generality.

A General Natural-language Text Processor for Clinical Radiology

Journal of the American Medical Informatics Association, 1994

Objective: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. Design: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines semantic patterns and a target form. The second phase, regularization, standardizes the terms in the initial target structure via a compositional mapping of multi-word phrases. The third phase, encoding, maps the terms to a controlled vocabulary. Radiology is the test domain for the processor and the target structure is a formal model for representing clinical information in that domain. Measurements: The impression sections of 230 radiology reports were encoded by the processor. Results of an automated query of the resultant database for the occurrences of four diseases were compared with the analysis of a panel of three physicians to determine recall and precision. Results: Without training specific to the four diseases, recall and precision of the system (combined effect of the processor and query generator) were 70% and 87%. Training of the query component increased recall to 85% without changing precision.
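The recall and precision figures above compare the system's query results with the physicians' reference standard. A minimal sketch of that calculation follows, assuming both are represented as sets of (report ID, disease) pairs. The function name, the disease labels, and the toy data are hypothetical and are not taken from the paper.

```python
def precision_recall(system, reference):
    """Return (precision, recall) for two sets of (report_id, disease) pairs."""
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data (hypothetical): pairs marked by the physician panel
# versus pairs returned by the automated query.
reference = {(1, "disease_a"), (2, "disease_b"), (3, "disease_b"), (4, "disease_c")}
system = {(1, "disease_a"), (2, "disease_b"), (5, "disease_b")}

precision, recall = precision_recall(system, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the query returns three pairs, two of which the panel also marked, and misses two of the panel's four pairs.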

Evaluation of SNOMED3.5 in representing concepts in chest radiology reports: integration of a SNOMED mapper with a radiology reporting workstation

Proceedings. AMIA Symposium, 2000

Standardized medical terminologies are gaining importance in the representation of medical data. In this paper, we present the evaluation of the SNOMED3.5 medical terminology to code concepts routinely used in chest radiology reports. Integration of this terminology mapper into a radiology reporting workstation that incorporates a speech recognition system and a natural language processor is also discussed. A total of 700 anatomical location terms (including synonyms) were tested and 72% of the terms had corresponding SNOMED terms. Of the 28% that did not result in a match, 16% were either morphological variants of SNOMED terms or could be found from a combination of terms from two or more SNOMED axes. Only 12% of the terms (primarily specialized radiology terms) were concepts not actually included in the SNOMED terminology.
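The three percentages above add up to 100% of the tested terms, which suggests that the 16% and 12% figures are shares of all 700 terms rather than of the unmatched 28%. A short sketch of the implied term counts under that reading follows. The counts are derived here, not reported in the paper, and are approximate if the published percentages were rounded.

```python
TOTAL_TERMS = 700  # anatomical location terms tested, including synonyms

# Shares of all tested terms, as read from the abstract (assumed exact).
shares_percent = {
    "direct SNOMED match": 72,
    "morphological variant or multi-axis combination": 16,
    "concept not in SNOMED": 12,
}

# Integer arithmetic avoids floating-point rounding in the derived counts.
counts = {label: TOTAL_TERMS * pct // 100 for label, pct in shares_percent.items()}

for label, n in counts.items():
    print(f"{label}: {n} terms")
```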

Automating Quality Control for Structured Standardized Radiology Reports Using Text Analysis

2020

Radiology reports describe the findings of a radiologist in an imaging examination and are produced for another clinician in order to answer a clinical indication. Sometimes the report does not fully answer the question asked, despite guidelines for the radiologist. In this article, a system that controls the quality of reports automatically is described. It notably maps the free text onto MeSH terms and checks whether the anatomy and disease terms match in the indication and conclusion of a report. The agreement between manual checks by experienced radiologists and the system is high, with automatic checks requiring only a fraction of the time. Being able to quality control all reports has the potential to improve report quality and thus limit misunderstandings, reduce the time lost requesting more information, and possibly avoid medical mistakes.
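The check described above can be sketched as follows. This is a minimal illustration, not the authors' system: the tiny `LEXICON` is a hypothetical stand-in for a real MeSH mapping, and the function names and example report are assumptions made for the sketch.

```python
import re

# Hypothetical mini-lexicon standing in for a real free-text-to-MeSH mapping.
# Each surface word maps to (category, normalized term).
LEXICON = {
    "lung": ("anatomy", "Lung"),
    "lungs": ("anatomy", "Lung"),
    "pulmonary": ("anatomy", "Lung"),
    "pneumonia": ("disease", "Pneumonia"),
    "embolism": ("disease", "Embolism"),
}


def extract_terms(text):
    """Map free text to {category: set of normalized terms}."""
    found = {"anatomy": set(), "disease": set()}
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in LEXICON:
            category, term = LEXICON[token]
            found[category].add(term)
    return found


def check_report(indication, conclusion):
    """Return, per category, indication terms that the conclusion never mentions."""
    asked = extract_terms(indication)
    answered = extract_terms(conclusion)
    return {category: asked[category] - answered[category] for category in asked}


missing = check_report(
    indication="Suspected pneumonia, evaluate lungs.",
    conclusion="Lungs are clear. No pulmonary embolism.",
)
print(missing)
```

Here the conclusion covers the requested anatomy but addresses a different disease than the one asked about, so the sketch flags the disease term from the indication as unanswered. A word-level lookup like this cannot recognize paraphrases, which is one reason a real system would rely on a full terminology mapping.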

Informatics in Radiology RADTF: A Semantic Search-enabled, Natural Language Processor-generated Radiology Teaching File

2010

Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material.

Transformer-based structuring of free-text radiology report databases

European Radiology, 2023

Objectives: To provide insights for on-site development of transformer-based structuring of free-text report databases by investigating different labeling and pre-training strategies. Methods: A total of 93,368 German chest X-ray reports from 20,912 intensive care unit (ICU) patients were included. Two labeling strategies were investigated to tag six findings of the attending radiologist. First, a system based on human-defined rules was applied for annotation of all reports (termed "silver labels"). Second, 18,000 reports were manually annotated in 197 h (termed "gold labels") of which 10% were used for testing. An on-site pre-trained model (T_mlm) using masked-language modeling (MLM) was compared to a public, medically pre-trained model (T_med). Both models were fine-tuned on silver labels only, gold labels only, and first with silver and then gold labels (hybrid training) for text classification, using varying numbers (N: 500, 1000, 2000, 3500, 7000, 14,580) of gold labels. Macro-averaged F1-scores (MAF1) in percent were calculated with 95% confidence intervals (CI). Results: T_mlm,gold (95.5 [94.5-96.3]) showed significantly higher MAF1 than T_med,silver (75.0 [73.4-76.5]) and T_mlm,silver (75.2 [73.6-76.7]), but not significantly higher MAF1 than T_med,gold (94.7 [93.6-95.6]), T_med,hybrid (94.9 [93.9-95.8]), and T_mlm,hybrid (95.2 [94.3-96.0]). When using 7000 or less gold-labeled reports, T_mlm,gold (N: 7000, 94.7 [93.5-95.7]) showed significantly higher MAF1 than T_med,gold (N: 7000, 91.5 [90.0-92.8]). With at least 2000 gold-labeled reports, utilizing silver labels did not lead to significant improvement of T_mlm,hybrid (N: 2000, 91.8 [90.4-93.2]) over T_mlm,gold (N: 2000, 91.4 [89.9-92.8]). Conclusions: Custom pre-training of transformers and fine-tuning on manual annotations promises to be an efficient strategy to unlock report databases for data-driven medicine.
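The macro-averaged F1-score used as the evaluation metric above can be sketched as follows. The sketch assumes that each finding is a binary label and that the macro average is the unweighted mean of the per-finding F1-scores; the abstract does not spell out the averaging details, and the finding names and toy labels are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(labels_true, labels_pred):
    """Unweighted mean of per-finding F1 scores; inputs map finding -> list of 0/1."""
    scores = [f1_score(labels_true[name], labels_pred[name]) for name in labels_true]
    return sum(scores) / len(scores)


# Toy labels for four reports and two findings (hypothetical data).
labels_true = {"finding_a": [1, 1, 0, 0], "finding_b": [1, 0, 0, 0]}
labels_pred = {"finding_a": [1, 0, 0, 0], "finding_b": [1, 1, 0, 0]}

maf1 = macro_f1(labels_true, labels_pred)
print(f"MAF1 = {100 * maf1:.1f}%")
```

Because every finding contributes equally to the mean, a rare finding with poor performance lowers the score as much as a common one would.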
• On-site development of natural language processing methods that retrospectively unlock free-text databases of radiology clinics for data-driven medicine is of great interest.
• For clinics seeking to develop methods on-site for retrospective structuring of a report database of a certain department, it remains unclear which of previously proposed strategies for labeling reports and pre-training models is the most appropriate in context of, e.g., available annotator time.
• Using a custom pre-trained transformer model, along with a little annotation effort, promises to be an efficient way to retrospectively structure radiological databases, even if not millions of reports are available for pre-training.

Keywords: Radiology • Deep learning • Natural language processing • Intensive care units • Thorax

Abbreviations: CI, Confidence interval; CVC, Central venous catheter; ICU, Intensive care unit; MAAUC, Macro-averaged area under the receiver operating characteristic curve; MAF1, Macro-averaged F1-score; ML, Machine learning; MLM, Masked-language modeling; NLP, Natural language processing; TFIDF, Term frequency-inverse document frequency

S. Nowak and D. Biesner contributed equally as joint first authors. U.I. Attenberger, R. Sifa, and A.M. Sprinkart contributed equally as joint last authors.

Computerized Radiology Reporting Using Coded Language

Radiology, 1974

A logical and comprehensive classification code has been built into a computerized system to permit the direct generation of radiologic reports. The report appears instantly on the television screen and, upon approval, is typed out both in the patient's area and radiology department, bypassing the traditional transcription and delivery process. No secretary or mail clerk is needed. Since precoded input is used, data retrieval will be highly efficient and accurate. Computer operating costs will be competitive with conventional manual reporting. The CLIP (Coded Language Information Processing) system is presently undergoing clinical trial.