Toward systematic review automation: a practical guide to using machine learning tools in research synthesis - PubMed (original) (raw)

Editorial

Toward systematic review automation: a practical guide to using machine learning tools in research synthesis

Iain J Marshall et al. Syst Rev. 2019.

Abstract

Technologies and methods to speed up the production of systematic reviews by reducing the manual labour involved have recently emerged. Automation has been proposed or used to expedite most steps of the systematic review process, including search, screening, and data extraction. However, how these technologies work in practice and when (and when not) to use them is often not clear to practitioners. In this practical guide, we provide an overview of current machine learning methods that have been proposed to expedite evidence synthesis. We also offer guidance on which of these are ready for use, their strengths and weaknesses, and how a systematic review team might go about using them in practice.

Keywords: Evidence synthesis; Machine learning; Natural language processing.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1

Classifying text using machine learning, in this example logistic regression with a ‘bag of words’ representation of the texts. The system is ‘trained’ by learning a coefficient (or weight) for each unique word in a manually labelled set of documents (typically thousands). In use, the learned coefficients are used to predict a probability for an unseen document.

Fig. 2

Bag of words modelling for classifying RCTs. Top left: Example of a bag of words for three articles. Each column represents a unique word in the corpus (a real example would likely contain columns for tens of thousands of words). Top right: Document labels, where 1 = relevant and 0 = irrelevant. Bottom: Coefficients (or weights) are estimated for each word (in this example using logistic regression). In this example, large positive weights increase the predicted probability that an unseen article is an RCT when it contains the words ‘random’ or ‘randomized’. The presence of the word ‘systematic’ (with a large negative weight) reduces the predicted probability that an unseen document is an RCT.
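
The idea in Figs. 1 and 2 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the four toy documents, their labels, the helper names (`bag_of_words`, `train`, `predict`) and the training settings are all assumptions made for the example, and a real system would train on thousands of labelled abstracts with an established library.

```python
import math

def tokenize(text):
    # Naive whitespace tokenizer; real systems use more careful preprocessing.
    return text.lower().split()

def build_vocabulary(documents):
    # One column per unique word in the (toy) corpus.
    return sorted({tok for doc in documents for tok in tokenize(doc)})

def bag_of_words(text, vocabulary):
    # Binary vector: 1.0 if the vocabulary word occurs in the text, else 0.0.
    present = set(tokenize(text))
    return [1.0 if word in present else 0.0 for word in vocabulary]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, epochs=500, learning_rate=0.5):
    # Logistic regression fitted by simple stochastic gradient descent.
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = p - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias

def predict(text, vocabulary, weights, bias):
    # Predicted probability that an unseen text is an RCT.
    features = bag_of_words(text, vocabulary)
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

# Hypothetical labelled documents: 1 = RCT, 0 = not an RCT.
documents = [
    "patients were randomized to drug or placebo",
    "a random sample of patients was allocated to treatment",
    "systematic review of drug trials",
    "systematic overview of placebo effects",
]
labels = [1, 1, 0, 0]

vocabulary = build_vocabulary(documents)
X = [bag_of_words(doc, vocabulary) for doc in documents]
weights, bias = train(X, labels)
weight_of = dict(zip(vocabulary, weights))

p_randomized = predict("randomized", vocabulary, weights, bias)
p_systematic = predict("systematic", vocabulary, weights, bias)
print(weight_of["randomized"], weight_of["systematic"])
print(p_randomized, p_systematic)
```

In this toy corpus, ‘randomized’ appears only in documents labelled 1 and ‘systematic’ only in documents labelled 0, so the learned weights take opposite signs, mirroring the pattern described in the caption.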

Fig. 3

Schematic of a typical data extraction process. The above illustration concerns the example task of extracting the study sample size. In general, these tasks involve labelling individual words. The word (or ‘token’) at position t is represented by a vector. This representation may encode which word is at this position and likely also communicates additional features, e.g. whether the word is capitalized or whether it is (inferred to be) a noun. Models for these kinds of tasks attempt to assign labels to all T words in a document and, for some tasks, will attempt to maximize the joint likelihood of these labels to capitalize on correlations between adjacent labels.
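
A rough sketch of the token labelling idea in Fig. 3 follows. It is a simplification under stated assumptions: the feature names, the hand-set `WEIGHTS` and `BIAS`, and the labels `N` (sample size) and `O` (other) are hypothetical, and each token is labelled independently. The sequence models the caption alludes to would additionally score adjacent labels jointly, which is not shown here.

```python
PARTICIPANT_WORDS = {"patients", "participants", "subjects"}

def token_features(tokens, t):
    # Represent the token at position t by a few simple features.
    word = tokens[t]
    next_word = tokens[t + 1].lower() if t < len(tokens) - 1 else "<end>"
    return {
        "is_digit": word.isdigit(),
        "is_capitalized": word[:1].isupper(),
        "next_is_participant_word": next_word in PARTICIPANT_WORDS,
    }

# Hypothetical hand-set weights; a real model would learn these from labelled text.
WEIGHTS = {
    "is_digit": 2.0,
    "is_capitalized": -0.5,
    "next_is_participant_word": 2.0,
}
BIAS = -3.0

def label_tokens(tokens):
    # Assign "N" (sample size) or "O" (other) to every token independently.
    labels = []
    for t in range(len(tokens)):
        features = token_features(tokens, t)
        score = BIAS + sum(WEIGHTS[name] for name, active in features.items() if active)
        labels.append("N" if score > 0 else "O")
    return labels

tokens = "A total of 120 patients were followed for 12 months".split()
labels = label_tokens(tokens)
sample_size_tokens = [tok for tok, lab in zip(tokens, labels) if lab == "N"]
print(list(zip(tokens, labels)))
print(sample_size_tokens)
```

With these assumed weights, a number is labelled as a sample size only when it is followed by a participant word, so ‘12’ in ‘12 months’ is left unlabelled.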

Fig. 4

Typical workflow for semi-automated abstract screening. The asterisk indicates that, with uncertainty sampling, the articles whose predictions are least certain are presented first. This aims to improve model accuracy more efficiently.
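
The ordering step of uncertainty sampling mentioned in the Fig. 4 caption might look like the sketch below. It assumes a classifier has already produced a predicted probability of relevance for each unscreened abstract; the article identifiers, the probabilities and the function name `rank_for_screening` are made up for illustration, and the surrounding loop (manual labelling, then retraining) is omitted.

```python
def rank_for_screening(predictions):
    # predictions maps an article id to its predicted probability of relevance.
    # Articles whose probability is closest to 0.5 are the least certain,
    # so they are placed first in the screening queue.
    return sorted(predictions, key=lambda doc_id: abs(predictions[doc_id] - 0.5))

# Hypothetical model outputs for four unscreened abstracts.
predictions = {"a1": 0.97, "a2": 0.52, "a3": 0.08, "a4": 0.40}

queue = rank_for_screening(predictions)
print(queue)
```

After the reviewer labels the articles at the front of the queue, the model would be retrained on the enlarged labelled set and the remaining articles re-ranked.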

Cited by

An exploration of available methods and tools to improve the efficiency of systematic review production: a scoping review.
Affengruber L, van der Maten MM, Spiero I, Nussbaumer-Streit B, Mahmić-Kaknjo M, Ellen ME, Goossen K, Kantorova L, Hooft L, Riva N, Poulentzas G, Lalagkas PN, Silva AG, Sassano M, Sfetcu R, Marqués ME, Friessova T, Baladia E, Pezzullo AM, Martinez P, Gartlehner G, Spijker R. BMC Med Res Methodol. 2024 Sep 18;24(1):210. doi: 10.1186/s12874-024-02320-4. PMID: 39294580 Review.
(Semi)automated approaches to data extraction for systematic reviews and meta-analyses in social sciences: A living review.
Legate A, Nimon K, Noblin A. F1000Res. 2024 Jun 20;13:664. doi: 10.12688/f1000research.151493.1. eCollection 2024. PMID: 39220382 Free PMC article. Review.
A question-answering framework for automated abstract screening using large language models.
Akinseloyin O, Jiang X, Palade V. J Am Med Inform Assoc. 2024 Sep 1;31(9):1939-1952. doi: 10.1093/jamia/ocae166. PMID: 39042516 Free PMC article.
How to optimize the systematic review process using AI tools.
Fabiano N, Gupta A, Bhambra N, Luu B, Wong S, Maaz M, Fiedorowicz JG, Smith AL, Solmi M. JCPP Adv. 2024 Apr 23;4(2):e12234. doi: 10.1002/jcv2.12234. eCollection 2024 Jun. PMID: 38827982 Free PMC article. Review.
Evaluating the OpenAI's GPT-3.5 Turbo's performance in extracting information from scientific articles on diabetic retinopathy.
Gue CCY, Rahim NDA, Rojas-Carabali W, Agrawal R, Rk P, Abisheganaden J, Yip WF. Syst Rev. 2024 May 16;13(1):135. doi: 10.1186/s13643-024-02523-2. PMID: 38755704 Free PMC article.

