Leveraging Large Language Models to Enhance Radiology Report Readability: A Systematic Review
Review
J Am Coll Radiol. 2026 Mar;23(3):354-361.
doi: 10.1016/j.jacr.2025.09.004. Epub 2025 Sep 11.
- PMID: 40945554
- DOI: 10.1016/j.jacr.2025.09.004
Leveraging Large Language Models to Enhance Radiology Report Readability: A Systematic Review
Vasant Patwardhan et al. J Am Coll Radiol. 2026 Mar.
Abstract
Background: Patients increasingly have direct access to their medical record. Radiology reports are complex and difficult for patients to understand and contextualize. One solution is to use large language models (LLMs) to translate reports into patient-accessible language.
Objective: This review summarizes the existing literature on using LLMs for the simplification of patient radiology reports. We also propose guidelines for best practices in future studies.
Evidence acquisition: A systematic review was performed following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Studies published and indexed in PubMed, Scopus, and Google Scholar up to February 2025 were included. Inclusion criteria comprised studies that used LLMs to simplify diagnostic or interventional radiology reports for patients and that evaluated readability. Exclusion criteria included non-English publications, abstracts, conference presentations, review articles, retracted articles, and studies that did not focus on report simplification. The Mixed Methods Appraisal Tool (MMAT) 2018 was used for bias assessment. Given the diversity of results, studies were categorized by reporting method, and qualitative and quantitative findings were presented to summarize key insights.
Evidence synthesis: A total of 2,126 citations were identified, and 17 studies were included in the qualitative analysis. Of these, 71% used a single LLM and 29% used multiple LLMs. The most prevalent LLMs were ChatGPT, Google Bard/Gemini, Bing Chat, Claude, and Microsoft Copilot. All studies that assessed quantitative readability metrics (n = 12) reported improvements. Qualitative assessment of simplified reports demonstrated varied results between physician and nonphysician raters.
Conclusion and clinical impact: LLMs demonstrate potential to enhance the accessibility of radiology reports for patients, but the literature is limited by the heterogeneity of inputs, models, and evaluation metrics across existing studies. We propose a set of best practice guidelines to standardize future LLM research.
Copyright © 2025 American College of Radiology. Published by Elsevier Inc. All rights reserved.