Employing computational linguistics techniques to identify limited patient health literacy: Findings from the ECLIPPSE study

Dean Schillinger et al. Health Serv Res. 2021 Feb.

Abstract

Objective: To develop novel, scalable, and valid literacy profiles for identifying limited health literacy patients by harnessing natural language processing.

Data source: We analyzed the linguistic content of 283 216 secure messages sent by 6941 patients with diabetes to physicians through an integrated system's electronic portal. Sociodemographic, clinical, and utilization data were obtained via questionnaire and electronic health records.

Study design: This retrospective study used natural language processing and machine learning to generate five unique "Literacy Profiles" from different sets of linguistic indices: Flesch-Kincaid (LP_FK); basic indices of writing complexity, including lexical diversity (LP_LD) and writing quality (LP_WQ); and advanced indices related to syntactic complexity, lexical sophistication, and diversity, modeled from self-reported (LP_SR) and expert-rated (LP_Exp) health literacy. We first determined how well each literacy profile discriminated between high and low health literacy, relative to self-reported and expert-rated health literacy. We then assessed the Literacy Profiles' relationships with known correlates of health literacy, such as patient sociodemographics and a range of health-related outcomes, including ratings of physician communication, medication adherence, diabetes control, comorbidities, and utilization.
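As a rough illustration of the simplest indices named above, the sketch below computes the standard Flesch-Kincaid grade level and a type-token ratio (one basic measure of lexical diversity) for a single message. It is not the study's pipeline: the function names, the regex tokenizer, and the vowel-run syllable heuristic are all assumptions made for this example, and the abstract does not say which lexical diversity measures the authors used.

```python
import re

WORD_RE = re.compile(r"[A-Za-z']+")


def count_syllables(word):
    """Approximate syllables as runs of vowels (a crude heuristic, assumed here)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def type_token_ratio(text):
    """Unique words divided by total words (case-insensitive)."""
    tokens = [w.lower() for w in WORD_RE.findall(text)]
    if not tokens:
        raise ValueError("text needs at least one word")
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    # Invented example message, not taken from the study data.
    message = "My sugar was high today. Should I change my insulin dose?"
    print(round(flesch_kincaid_grade(message), 2))
    print(round(type_token_ratio(message), 2))
```

Both measures are sensitive to message length, so very short messages give unstable values.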

Principal findings: LP_SR and LP_Exp performed best in discriminating between high and low self-reported (C-statistics: 0.86 and 0.58, respectively) and expert-rated health literacy (C-statistics: 0.71 and 0.87, respectively) and were significantly associated with educational attainment, race/ethnicity, Consumer Assessment of Healthcare Providers and Systems (CAHPS) scores, adherence, glycemia, comorbidities, and emergency department visits.
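For readers unfamiliar with the C-statistic reported above: for a binary outcome it equals the area under the ROC curve, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name, the labels, and the scores are invented for illustration and are not the study's data or code.

```python
def c_statistic(labels, scores):
    """Concordance statistic (AUC) for binary labels (1 = positive, 0 = negative).

    Compares every positive/negative pair; a pair is concordant when the
    positive case has the higher score, and a tie counts as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels and model scores, for illustration only.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(c_statistic(labels, scores))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for large samples.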

Conclusions: Since health literacy is a potentially remediable explanatory factor in health care disparities, the development of automated health literacy indicators represents a significant accomplishment with broad clinical and population health applications. Health systems could apply literacy profiles to efficiently determine whether quality of care and outcomes vary by patient health literacy; identify at-risk populations for targeting tailored health communications and self-management support interventions; and inform clinicians to promote improvements in individual-level care.

Keywords: communication; diabetes; health literacy; machine learning; managed care; natural language processing; secure messaging.

© 2020 The Authors. Health Services Research published by Wiley Periodicals LLC on behalf of Health Research and Educational Trust.

Figures

FIGURE 1

Patient and secure messages inclusion/exclusion flowchart*. *MRN#: Patient ID; msg_date: Date of message sent; Svy: survey; SM#: number of secure messages; LP: literacy profile; PCP_ID: primary care provider ID; proxy_pct: % of proxy messages; TOFROM_PAT_C: SM sent by the patient

FIGURE 2

ROCs and performance metrics for the literacy profiles relative to self‐reported health literacy. AUC: Area Under Curve; LP_Exp: Literacy Profile Expert‐Rated Health Literacy; LP_FK: Literacy Profile Flesch‐Kincaid; LP_LD: Literacy Profile Lexical Diversity; LP_SR: Literacy Profile Self‐Reported Health Literacy; LP_WQ: Literacy Profile Writing Quality; ML: Machine Learning; SVM: Support Vector Machine [Color figure can be viewed at wileyonlinelibrary.com]

FIGURE 3

ROCs and performance metrics for the literacy profiles relative to expert‐rated literacy. AUC: Area Under Curve; LDA: Linear Discriminant Analysis; LP_Exp: Literacy Profile Expert‐Rated Health Literacy; LP_FK: Literacy Profile Flesch‐Kincaid; LP_LD: Literacy Profile Lexical Diversity; LP_SR: Literacy Profile Self‐Reported Health Literacy; LP_WQ: Literacy Profile Writing Quality; ML: Machine Learning; SVM: Support Vector Machine [Color figure can be viewed at wileyonlinelibrary.com]
