An automation framework for clinical codelist development validated with UK data from patients with multiple long-term conditions

Aslam, A. et al. (2025) An automation framework for clinical codelist development validated with UK data from patients with multiple long-term conditions. BMC Medical Research Methodology, 25(1), 138. (doi: 10.1186/s12874-025-02541-1) (PMID:40413381) (PMCID:PMC12102889)

Abstract

Background: Codelists play a crucial role in ensuring accurate and standardized communication within healthcare. However, preparing high-quality codelists is a rigorous and time-consuming process. The literature focuses on the transparency of clinical codelists and overlooks the utility of automation.

Methods (Automated Framework Design and Use-case: DynAIRx): Here we present a Codelist Generation Framework that can automate the generation of codelists with minimal input from clinical experts. We demonstrate the process using a specific project, DynAIRx, producing appropriate codelists and a framework that allows future projects to take advantage of automated codelist generation. Both the framework and codelist are publicly available. DynAIRx is an NIHR-funded project aiming to develop AIs to help optimise prescribing of medicines in patients with multiple long-term conditions. DynAIRx requires complex codelists to describe the trajectory of each patient and the interaction between their conditions. We promptly generated ≈214 codelists for DynAIRx using the proposed framework and validated them with a panel of experts, significantly reducing the time required by making effective use of automation.

Results: The framework reduced the clinician time required to validate codes, automatically shrank codelists using trusted sources, and added new codes for review against existing codelists. In the DynAIRx case study, a codelist of ≈14000 codes ultimately required only 7-9 hours of clinician time (whereas existing methods take months), and application of the automation framework reduced the workload by >80%.

Conclusion: This work examines current methodologies for codelist development and the challenges associated with ensuring transparency and reproducibility. A key benefit of this approach is its emphasis on automation and reliance on trusted sources, which significantly lowers the workload, minimizes human error, and saves substantial time, particularly the time needed from clinical experts.
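
The abstract does not describe the framework's internals, so the following is only an illustrative sketch of the kind of set-based triage the Results suggest: codes confirmed by a trusted source are accepted without manual review, and the remainder is compared against an existing codelist to flag new codes for clinicians. The function names, the toy code identifiers, and the rule "trusted means auto-accepted" are all assumptions made for illustration, not the published implementation.

```python
def triage_codes(candidates, trusted, existing):
    """Split candidate codes into three sets (hypothetical rule, not the paper's).

    candidates: codes returned by an automated search
    trusted:    codes confirmed by a trusted source (assumed to need no review)
    existing:   codes already present in a previously published codelist
    """
    candidates, trusted, existing = set(candidates), set(trusted), set(existing)
    auto_accepted = candidates & trusted      # no clinician time needed
    needs_review = candidates - trusted       # clinician must check these
    new_for_review = needs_review - existing  # reviewed codes not seen before
    return auto_accepted, needs_review, new_for_review


def workload_reduction(total_codes, reviewed_codes):
    """Fraction of codes the clinician did not have to review manually."""
    if total_codes <= 0:
        raise ValueError("total_codes must be positive")
    return 1 - reviewed_codes / total_codes


# Toy identifiers only; these are not real clinical codes.
candidates = ["C1", "C2", "C3", "C4", "C5"]
trusted = ["C1", "C2", "C3", "C4"]
existing = ["C1", "C2"]

auto_accepted, needs_review, new_for_review = triage_codes(candidates, trusted, existing)
reduction = workload_reduction(len(candidates), len(needs_review))
print(sorted(auto_accepted), sorted(needs_review), sorted(new_for_review), reduction)
```

Under these assumptions, the workload reduction is simply the share of candidate codes that never reach a clinician, which is one plausible way to read the ">80%" figure reported above.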

Item Type: Articles
Additional Information: This work is funded by the DynAIRx project. DynAIRx has been funded by the National Institute for Health and Care Research (NIHR) Artificial Intelligence for Multiple Long-Term Conditions (AIM) call (NIHR 203986).
Keywords: Codelist, automation, multiple long-term conditions (MLTC), SNOMEDs, DynAIRx.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Mair, Professor Frances
Authors: Aslam, A., Walker, L., Abaho, M., Cant, H., O'Connell, M., Abuzour, A.S., Hama, L., Schofield, P., Mair, F.S., Ruddle, R.A., Popoola, O., Sperrin, M., Tsang, J.Y., Shantsila, E., Gabbay, M., Clegg, A., Woodall, A.A., Buchan, I., and Relton, S.D.
College/School: College of Medical Veterinary and Life Sciences > School of Health & Wellbeing > General Practice and Primary Care
Journal Name: BMC Medical Research Methodology
Publisher: BioMed Central
ISSN: 1471-2288
ISSN (Online): 1471-2288
Copyright Holders: Copyright © The Author(s) 2025
First Published: First published in BMC Medical Research Methodology 25(1):138
Publisher Policy: Reproduced under a Creative Commons license


Deposit and Record Details

ID Code: 357100
Depositing User: Publications Router
Datestamp: 07 Nov 2025 15:42
Last Modified: 08 Nov 2025 02:31
Date of acceptance: 25 March 2025
Date of first online publication: 24 May 2025
Date Deposited: 7 November 2025
Data Availability Statement: Yes