Biases Arising from Using Linked Administrative Data for Research: A Conceptual Framework from Registration to Analysis.

Shaw, Richard (ORCID: https://orcid.org/0000-0002-7906-6066), Harron, Katie, Pescarini, Julia, Júnior, Elzo, Siroky, Andressa, Campbell, Desmond (ORCID: https://orcid.org/0000-0003-1085-714X), Dundas, Ruth (ORCID: https://orcid.org/0000-0002-3836-4286), Ichihara, Maria Yury, Barreto, Mauricio and Katikireddi, Vittal (ORCID: https://orcid.org/0000-0001-6593-9092) (2022) Biases Arising from Using Linked Administrative Data for Research: A Conceptual Framework from Registration to Analysis. In: 2022 International Population Data Linkage, Edinburgh, UK, 7-9 Sept 2022, (doi: 10.23889/ijpds.v7i3.1800)

Abstract

Objectives: Administrative data are primarily collected for operational processes and these processes can lead to sources of bias that may not be adequately considered by researchers. We provide a framework to help understand how biases might arise from using linked administrative data, and hopefully aid future study designs.

Approach: We developed the conceptual framework based on the team’s experiences with the 100 Million Brazilian Cohort (100MCohort) which contains records of more than 131 million people whose families applied for social assistance between 2001 and 2018, linked to other administrative data sources. We provide examples from the 100MCohort of where and how in the linkage process different forms of bias could arise. We make recommendations on how biases might be addressed using commonly available external data.

Results: The conceptual framework covers the whole data generating process from people and events occurring in the population through to deriving variables for analysis. The framework comprises three distinct stages: 1) Recording and registration of events in administrative systems such as Brazil’s Mortality Information System (SIM) and the Hospital Information System (SIH); 2) Linkage of different data sources, for example using exact matching via the Social Identification Number (NIS) in Brazil’s CadÚnico database or linkage algorithms; 3) Cleaning and coding data used both for analysis and linkage. The biases arising from linkage can be better understood by applying theory and making additional metadata available.

Conclusion: Maximising the potential of administrative data for research requires a better understanding of how biases arise. This is best achieved by considering the entire data generating process, and better communication among all those involved in the data collection and linkage processes.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Katikireddi, Professor Vittal and Campbell, Dr Desmond and Dundas, Professor Ruth and Shaw, Dr Richard
Authors: Shaw, R., Harron, K., Pescarini, J., Júnior, E., Siroky, A., Campbell, D., Dundas, R., Ichihara, M. Y., Barreto, M., and Katikireddi, V.
College/School: College of Medical Veterinary and Life Sciences > School of Health & Wellbeing > MRC/CSO SPHSU
Journal Name: International Journal of Population Data Science
ISSN: 2399-4908
Copyright Holders: Copyright © 2022 The Authors
First Published: First published in International Journal of Population Data Science 7(3):29
Publisher Policy: Reproduced under a Creative Commons License

Funder and Project Information

Strengthening data linkage to reduce health inequalities in low and middle income countries: building on the Brazilian 100 million cohort

Alastair Leyland

16/137/99

HW - MRC/CSO Social and Public Health Sciences Unit

Deposit and Record Details

ID Code: 278726
Depositing User: Dr Richard Shaw
Datestamp: 06 Sep 2022 07:37
Last Modified: 02 May 2025 07:27
Date of acceptance: 16 May 2022
Date of first online publication: 25 August 2022
Date Deposited: 6 September 2022