A Proposed Method to Adjust for Selection Bias in Cohort Studies

Adjusting for selection bias in retrospective, case–control studies

2009

Retrospective case-control studies are more susceptible to selection bias than other epidemiologic studies because, by design, they require that both cases and controls be representative of the same population. However, because case and control recruitment processes often differ, it is not always obvious that the necessary exchangeability conditions hold. Selection bias typically arises when the selection criteria are associated with the risk factor under investigation. We develop a method which produces bias-adjusted estimates for the odds ratio. Our method hinges on two conditions. The first is that a variable that separates the risk factor from the selection criteria can be identified. This is termed the bias breaking variable. The second condition is that data can be found such that a bias-corrected estimate of the distribution of the bias breaking variable can be obtained. We show by means of a set of examples that such bias breaking variables are not uncommon in epidemiologic settings. We demonstrate using simulations that the estimates of the odds ratios produced by our method are consistently closer to the true odds ratio than standard odds ratio estimates using logistic regression. Further, by applying it to a case-control study, we show that our method can help to determine whether selection bias is present and thus confirm the validity of study conclusions when no evidence of selection bias can be found.

Keywords: selection bias, directed acyclic graphs, conditional independence, confounding, retrospective case-control studies, post-stratification, weighting

Exploring the impact of selection bias in observational studies of COVID-19: a simulation study

International Journal of Epidemiology

Background: Non-random selection of analytic subsamples could introduce selection bias in observational studies. We explored the potential presence and impact of selection in studies of SARS-CoV-2 infection and COVID-19 prognosis. Methods: We tested the association of a broad range of characteristics with selection into COVID-19 analytic subsamples in the Avon Longitudinal Study of Parents and Children (ALSPAC) and UK Biobank (UKB). We then conducted empirical analyses and simulations to explore the potential presence, direction and magnitude of bias due to this selection (relative to our defined UK-based adult target populations) when estimating the association of body mass index (BMI) with SARS-CoV-2 infection and death-with-COVID-19. Results: In both cohorts, a broad range of characteristics was related to selection, sometimes in opposite directions (e.g. more-educated people were more likely to have data on SARS-CoV-2 infection in ALSPAC, but less likely in UKB). Higher BMI was ass...

Exploring selection bias in COVID-19 research: Simulations and prospective analyses of two UK cohort studies

2021

Background: Non-random selection into analytic subsamples could introduce selection bias in observational studies of SARS-CoV-2 infection and COVID-19 severity (e.g. including only those who have had a COVID-19 PCR test). We explored the potential presence and impact of selection in such studies using data from self-report questionnaires and national registries. Methods: Using pre-pandemic data from the Avon Longitudinal Study of Parents and Children (ALSPAC) (mean age=27.6 (standard deviation [SD]=0.5); 49% female) and UK Biobank (UKB) (mean age=56 (SD=8.1); 55% female) with data on SARS-CoV-2 infection and death-with-COVID-19 (UKB only), we investigated predictors of selection into COVID-19 analytic subsamples. We then conducted empirical analyses and simulations to explore the potential presence, direction, and magnitude of bias due to selection when estimating the association of body mass index (BMI) with SARS-CoV-2 infection and death-with-COVID-19. Results: In both ALSPAC and UKB ...
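The kind of simulation described in the two abstracts above can be illustrated with a toy example. The sketch below is not the authors' simulation: it uses a generic binary exposure and outcome rather than BMI and SARS-CoV-2 infection, and every probability in it is a hypothetical value chosen for illustration. Exposure and outcome are generated independently (true odds ratio of 1), but the chance of entering the analytic subsample rises with both, so the subsample shows a spurious inverse association. With these selection probabilities the expected odds ratio among the selected is (0.9 × 0.1) / (0.5 × 0.5) = 0.36.

```python
import random


def odds_ratio(pairs):
    """Odds ratio from a list of (exposed, outcome) pairs coded 0/1."""
    a = sum(1 for x, y in pairs if x and y)          # exposed, with outcome
    b = sum(1 for x, y in pairs if x and not y)      # exposed, no outcome
    c = sum(1 for x, y in pairs if not x and y)      # unexposed, with outcome
    d = sum(1 for x, y in pairs if not x and not y)  # unexposed, no outcome
    return (a * d) / (b * c)


def simulate(n=100_000, seed=1):
    """Return (full-population OR, selected-subsample OR)."""
    rng = random.Random(seed)
    population, selected = [], []
    for _ in range(n):
        # Hypothetical values: exposure and outcome are independent.
        x = int(rng.random() < 0.5)
        y = int(rng.random() < 0.2)
        # Hypothetical selection model: both exposure and outcome
        # make selection into the subsample more likely.
        p_select = 0.1 + 0.4 * x + 0.4 * y
        population.append((x, y))
        if rng.random() < p_select:
            selected.append((x, y))
    return odds_ratio(population), odds_ratio(selected)


full_or, selected_or = simulate()
print(f"full population OR:    {full_or:.2f}")
print(f"selected subsample OR: {selected_or:.2f}")
```

Changing the selection model so that it depends on only one of the two variables should bring the subsample odds ratio back toward 1, which is one way to see that the bias comes from selection depending on both exposure and outcome.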

Statistical methods for long-term follow-up of infectious diseases

2014

The overall aim of this work has been to investigate methodological issues connected to long-term follow-up of infectious diseases. The work extended to prevalent cohorts in general. The common denominator of the main methodological efforts in these four papers is selection bias. In the first three papers, methods for visualizing selection bias in prevalent cohorts were explored and different approaches to adjust for this bias were discussed. In the fourth paper, capture-recapture modeling was used to examine the ascertainment level for liver cancer in the Swedish Cancer Register. Study 1: In this study we investigated a novel approach to visualize and adjust for selection bias in prevalent cohorts. The method is an extension of the standard interval-based approach, where a risk estimate is calculated for disjoint time periods after inclusion in the cohort of interest. In the proposed method, observation time and events are cumulated, giving more power and more precise ...
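The abstract above does not say which capture-recapture model was used in the fourth paper. As a rough illustration of the general idea only, the sketch below applies the simplest two-source (Lincoln-Petersen) estimator, which assumes the two sources are independent and that records are linked without error. The function name and all counts are hypothetical and are not taken from the Swedish Cancer Register.

```python
def lincoln_petersen(n_source1, n_source2, n_both):
    """Estimate the total number of cases from two overlapping sources.

    n_source1: cases recorded in source 1 (e.g. a register)
    n_source2: cases recorded in source 2 (e.g. a second data source)
    n_both:    cases recorded in both sources
    """
    if n_both <= 0:
        raise ValueError("the estimator needs at least one case in both sources")
    return n_source1 * n_source2 / n_both


# Hypothetical counts, for illustration only.
estimated_total = lincoln_petersen(n_source1=80, n_source2=60, n_both=48)
completeness = 80 / estimated_total  # ascertainment level of source 1

print(estimated_total)  # 100.0
print(completeness)     # 0.8
```

When the sources are positively dependent (cases found by one are more likely to be found by the other), this estimator tends to understate the total and so overstate completeness, which is one reason more elaborate models are often preferred.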

Cohort restriction based on prior enrollment: Examining potential biases in estimating cancer and mortality risk

Observational Studies

Electronic health records and administrative databases provide rich, longitudinal data for health-related research. These data cover large, diverse populations, creating excellent research opportunities, but have limitations. In particular, information is available only for individuals who are enrolled in a particular health system; thus, studies often exclude individuals with short enrollment history. Such cohort restriction may cause selection bias in absolute risk estimates for the full enrollee population. We use hazard ratios (HRs) to estimate the association between length of prior enrollment and cancer and all-cause mortality risk. HRs different from one indicate restricted cohorts would produce biased risk estimates for the full enrollee population. Our study sample included 170,708 enrollees of a Western Washington healthcare delivery system. Unadjusted models found individuals with 10 or more years of prior enrollment had higher risk of cancer and death compared to those with less than 5 years of prior enrollment (HRs ranged from 1.29 to 3.01). Age- and sex-adjusted models accounted for much of this difference (HRs: 0.93 to 1.24). Models adjusting for additional covariates had similar results (HRs: 0.91 to 1.14). After evaluating potential selection bias, we conclude that, in this setting, age- and sex-standardizing risk estimates can remove most of the bias due to lengthy prior-enrollment cohort restrictions. Before generalizing estimates based on a selected sample of patients meeting prior enrollment criteria, researchers should assess the potential for selection bias.

Uncovering selection bias in case-control studies using Bayesian post-stratification

2013

Selection bias can affect odds ratio estimation, particularly in case-control studies. Approaches to discovering and adjusting for selection bias have been proposed in the literature using graphical and heuristic tools as well as more complex statistical methods. The approach we propose is based on a survey weighting method termed Bayesian post-stratification and follows from the conditional independences that characterise selection bias. We use our approach to perform a selection bias sensitivity analysis of odds ratios by using ancillary data sources that describe the target case-control population to re-weight the parameter estimates obtained from the study. The method is tested on two case-control studies, the first investigating the association between exposure to electromagnetic fields and acute lymphoblastic leukaemia and the second investigating the association between occupational exposure to hairspray and a minor congenital malformation called hypospadias. In both case-control studies, the odds ratios were only moderately sensitive to selection bias.
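The re-weighting idea behind post-stratification can be shown with a small worked example. This is a plain, non-Bayesian sketch and not the authors' method: it assumes a single stratifying variable whose distribution in the target population is known from an ancillary source, and it re-weights stratum-specific exposure prevalences to that distribution. All counts, stratum names, and target shares are hypothetical.

```python
def crude_prevalence(strata):
    """Exposure prevalence ignoring strata; strata maps name -> (exposed, unexposed)."""
    exposed = sum(e for e, _ in strata.values())
    total = sum(e + u for e, u in strata.values())
    return exposed / total


def poststratified_prevalence(strata, target_share):
    """Exposure prevalence after re-weighting each stratum to its target share."""
    return sum(
        target_share[name] * e / (e + u)
        for name, (e, u) in strata.items()
    )


def odds(p):
    return p / (1 - p)


# Hypothetical study counts: (exposed, unexposed) per stratum.
cases = {"low": (20, 60), "high": (14, 6)}
controls = {"low": (10, 40), "high": (30, 20)}  # "high" stratum over-recruited

# Hypothetical stratum shares in the target population (ancillary data).
target = {"low": 0.8, "high": 0.2}

crude_or = odds(crude_prevalence(cases)) / odds(crude_prevalence(controls))
adjusted_or = (
    odds(poststratified_prevalence(cases, target))
    / odds(poststratified_prevalence(controls, target))
)

print(f"crude OR:    {crude_or:.2f}")
print(f"adjusted OR: {adjusted_or:.2f}")
```

In this constructed example the controls over-represent the high-exposure stratum, so the crude odds ratio falls below 1 (about 0.77) while the re-weighted one is above 1 (about 1.32). Repeating the calculation over a range of plausible target shares gives a simple sensitivity analysis in the spirit of the abstract.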

Future Cases as Present Controls to Adjust for Exposure Trend Bias in Case-only Studies

Epidemiology, 2011

Self-matched case-only studies (such as the case-crossover or self-controlled case-series method) control by design for time-invariant confounders (measured or unmeasured), but they do not control for confounders that vary with time. A bidirectional case-crossover design can be used to adjust for exposure-time trends. In pharmacoepidemiology, however, illness often influences future use of medications, making a bidirectional design problematic. Suissa's case-time-control design combines a case-crossover and case-control design, and adjusts for exposure-trend bias in the cases' self-controlled odds ratio by dividing that ratio by the corresponding self-controlled odds ratio in a concurrent matched control group. However, if not well matched, the control group may re-introduce selection bias. We propose a "case-case-time-control" design that involves crossover analyses in cases and future-case controls. This person-time sampling strategy improves matching by restricting controls to future cases. We evaluate the proposed study design through simulations and analysis of a theoretically null relationship using Veterans Administration (VA) data. Simulation studies show that the case-case-time-control design can adjust for exposure trends while controlling for time-invariant confounders. Use of an inappropriate control group left case-time-control analyses biased by exposure-time trends. When analyzing the relationship between vitamin exposure and stroke, using data on 3192 patients in the VA system, a case-crossover odds ratio of 1.5 (95% confidence interval = 1.3-1.7) was reduced to 1.1 (0.9-1.3) when divided by the concurrent exposure trend odds ratio (1.4) in matched future cases. This applied example demonstrates how our approach can adjust for exposure trends observed across time axes.
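The adjustment in the applied example is a ratio of two odds ratios, which can be checked directly from the figures reported in the abstract. The function name below is hypothetical and the snippet reproduces only the point estimate. The reported confidence interval (0.9-1.3) cannot be obtained by dividing interval limits, since it also depends on the uncertainty of the trend odds ratio.

```python
def trend_adjusted_odds_ratio(case_crossover_or, exposure_trend_or):
    """Divide the cases' self-controlled OR by the exposure-trend OR
    estimated in the control group (here, matched future cases)."""
    if case_crossover_or <= 0 or exposure_trend_or <= 0:
        raise ValueError("odds ratios must be positive")
    return case_crossover_or / exposure_trend_or


# Values reported in the abstract: case-crossover OR 1.5, trend OR 1.4.
adjusted = trend_adjusted_odds_ratio(1.5, 1.4)
print(round(adjusted, 1))  # 1.1
```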