Flexible design for following up positive findings

Kai Yu et al. Am J Hum Genet. 2007 Sep.

Abstract

As more population-based studies suggest associations between genetic variants and disease risk, there is a need to improve the design of follow-up studies (stage II) in independent samples to confirm evidence of association observed at the initial stage (stage I). We propose to use flexible designs developed for randomized clinical trials in the calculation of sample size for follow-up studies. We apply a bootstrap procedure to correct the effect of regression to the mean, also called "winner's curse," resulting from choosing to follow up the markers with the strongest associations. We show how the results from stage I can improve sample size calculations for stage II adaptively. Despite the adaptive use of stage I data, the proposed method maintains the nominal global type I error for final analyses on the basis of either pure replication with the stage II data only or a joint analysis using information from both stages. Simulation studies show that sample-size calculations accounting for the impact of regression to the mean with the bootstrap procedure are more appropriate than is the conventional method. We also find that, in the context of flexible design, the joint analysis is generally more powerful than the replication analysis.
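The regression-to-the-mean effect described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the authors' bootstrap procedure: the function name `selected_effect_mean`, the true standardized effect of 1.0, and the one-sided selection rule are all hypothetical choices made for illustration. It shows that when only markers passing a stage I threshold are followed up, their observed effects are on average larger than the true effect, which is why an uncorrected stage II sample size tends to be too small.

```python
import random
import statistics


def selected_effect_mean(true_z, z_cut, n_sims=20000, seed=0):
    """Simulate stage I z-scores and summarize those that pass selection.

    true_z : assumed true standardized effect (hypothetical value)
    z_cut  : selection cutoff on the observed z-score
    Returns (mean observed z among selected, fraction selected).
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_sims):
        z = rng.gauss(true_z, 1.0)  # observed stage I z-score
        if z >= z_cut:              # marker is chosen for follow-up
            selected.append(z)
    return statistics.fmean(selected), len(selected) / n_sims


# One-sided cutoff matching a selection criterion of alpha1 = 0.05 (assumption).
z_cut = statistics.NormalDist().inv_cdf(1 - 0.05)
mean_selected, frac_selected = selected_effect_mean(true_z=1.0, z_cut=z_cut)

print(f"true effect (z units):        1.00")
print(f"mean effect among selected:   {mean_selected:.2f}")
print(f"fraction of studies selected: {frac_selected:.2f}")
```

The gap between the true effect and the mean among selected markers is the bias that a correction such as the bootstrap procedure in the abstract is meant to reduce before stage II sample size is calculated.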

Figures

Figure 1.

Threshold for the stage II P value for the rejection of final analysis as a function of stage I P value. The familywise type I error rate (α) is 0.01 with 40 independent hypotheses. The targeted conditional power (1-β) is 0.9. The marker selection criterion (α1) is 0.05.
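The general shape of such a threshold curve can be sketched with a generic inverse-normal combination of the two stages. This is an assumption made for illustration and is not taken from the paper: the function `stage2_p_threshold`, the equal stage weights, the one-sided P values, and the Bonferroni split of the familywise rate (0.01 / 40) are all hypothetical choices, and the paper's own rejection rule may differ.

```python
from statistics import NormalDist


def stage2_p_threshold(p1, alpha_per_test, w1=0.5 ** 0.5):
    """Largest one-sided stage II P value that still rejects in a joint test.

    Assumes the joint statistic Z = w1*Z1 + w2*Z2 with w1**2 + w2**2 = 1
    and rejection when Z >= the critical value for alpha_per_test.
    """
    nd = NormalDist()
    w2 = (1 - w1 ** 2) ** 0.5
    z1 = nd.inv_cdf(1 - p1)                 # stage I z-score
    z_crit = nd.inv_cdf(1 - alpha_per_test)  # joint critical value
    z2_needed = (z_crit - w1 * z1) / w2      # stage II z-score required
    return 1 - nd.cdf(z2_needed)


# Bonferroni-adjusted per-hypothesis level (assumption, not from the paper).
alpha_per_test = 0.01 / 40

for p1 in (0.05, 0.01, 0.001):
    threshold = stage2_p_threshold(p1, alpha_per_test)
    print(f"stage I P = {p1:<6} -> stage II P threshold = {threshold:.5f}")
```

Under these assumptions, a smaller stage I P value leaves more room for the stage II P value, so the threshold rises as stage I evidence strengthens.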

Figure 2.

“Unconditional” power of the adaptive two-stage procedure by use of the joint statistic under various ORs and stage I marker selection criterion (α1). The stage I sample size is 500 cases and 500 controls. The familywise error rate is controlled at 0.01 with a total of 41 independent hypotheses. For each simulated stage I data set, the marker with the lowest stage I P value is used for stage II sample-size calculation. Its effect size is estimated by the bootstrap method. Stage II sample size is calculated using the joint statistic for the corresponding target conditional power. The “unconditional” power is estimated according to formula (9) on the basis of 2,000 simulated stage I data sets.

Figure 3.

“Unconditional” power comparison between the two-stage procedure using the joint statistic and that using the replication statistic (Repl.) under various ORs and stage I marker-selection criterion (α1). The stage I sample size is 500 cases and 500 controls. The familywise error rate is controlled at 0.01, with a total of 41 independent hypotheses. For each simulated stage I data set, the marker with the lowest stage I P value is used for stage II sample-size calculation. Its effect size is estimated by the bootstrap method. The stage II sample size is calculated using the replication-based test statistic for the corresponding target conditional power. The same sample-size decision rule is applied to both procedures, to ensure a fair comparison. The “unconditional” power is estimated according to formula (9) on the basis of 2,000 simulated stage I data sets.


Web Resources

    1. K.Y.’s Web site, http://dceg.cancer.gov/about/staff-bios/Yu-Kai (for software)
    2. Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/ (for NHL, TNF, and LTA)

