Stephen Senn - Academia.edu
Papers by Stephen Senn
Statistical Methods for Evaluating Safety in Medical Product Development
Statistical Issues in Drug Development, 2021
Biometrical Journal, 2019
If the number of treatments in a network meta-analysis is large, it may be possible and useful to model the main effect of treatment as random, that is to say as random realizations from a normal distribution of possible treatment effects. This then constitutes a third sort of random effect that may be considered in connection with such analyses. The first and most common models treatment-by-trial interaction as being random and the second, rather rarer, models the main effects of trial as being random and thus permits the recovery of intertrial information. Taking the example of a network meta-analysis of 44 similar treatments in 10 trials, we illustrate how a hierarchical approach to modeling a random main effect of treatment can be used to produce shrunk (toward the overall mean) estimates of effects for individual treatments. As a related problem, we also consider the issue of using a random-effect model for the within-trial variances from trial to trial. We provide a number of possible graphical representations of the results and discuss the advantages and disadvantages of such an approach.
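The shrinkage toward the overall mean that the abstract describes can be illustrated with a toy empirical-Bayes calculation. This is a sketch only: all numbers are invented, and the sampling and between-treatment variances are assumed known, whereas the paper's full hierarchical model would estimate them.

```python
import random
import statistics

random.seed(1)

# Hypothetical raw effect estimates for 44 treatments (invented numbers).
estimates = [random.gauss(0.5, 0.4) for _ in range(44)]
se2 = 0.09   # assumed common sampling variance of each raw estimate
tau2 = 0.04  # assumed between-treatment variance

grand_mean = statistics.fmean(estimates)
# Empirical-Bayes shrinkage factor in (0, 1): the noisier the raw estimate
# relative to the between-treatment spread, the stronger the pull to the mean.
shrink = tau2 / (tau2 + se2)
shrunk = [grand_mean + shrink * (e - grand_mean) for e in estimates]

# The shrunk estimates are less dispersed than the raw ones, by exactly
# the shrinkage factor (a linear transform scales the standard deviation).
print(round(statistics.stdev(shrunk) / statistics.stdev(estimates), 3))
```

With these assumed variances the factor is 0.04/0.13 ≈ 0.308, so each estimate keeps about a third of its deviation from the overall mean.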
Statistics in Medicine, 2014
The terminological confusion that exists as regards the use of the qualifier 'random effect' in connection with the analysis of multi-centre trials or in conducting meta-analyses is well illustrated by the otherwise instructive paper by Kahan and Morris [1] in this journal. They refer to 'models using either fixed centre effects or random centre effects (RCE)' (p1137), yet there are two very different types of random effects commonly considered in statistical models. (See chapter 14 of Statistical Issues in Drug Development for a discussion [2].) In fact, Kahan and Morris use RCE to designate each of the two sorts on different occasions within the paper, without drawing attention to the difference. The simulations carried out by the authors illustrate that where there is some imbalance within centres in the numbers of patients on the two arms of a parallel group trial, treating the main effect of centre as random will allow some modest gain in efficiency [3, 4]. This is simply a particular reflection of a fact known at least since Frank Yates's work of three quarters of a century ago [5-7]: where randomised blocks are imbalanced, the totals from blocks reflect (to some extent) the differences between treatments. This permits what in that context is referred to as recovering inter-block information, and this in turn requires block effects to be treated as random, at least implicitly. In the context of a multi-centre trial, it is the centres that become the blocks, and it is treating the main effect of centre as random that allows such recovery. Except in cases of perfect balance, treating such main effects as random leads to treatment estimates with lower standard errors [8].
In practice, the gain in efficiency is very modest [8], and for any given clinical trial for which the distribution of patients by treatment and centre is known, it is a simple matter to calculate bounds on the gains using the patient numbers only. This follows from the fact that if the between-centre variance is zero, the model with random main effects of centre is equivalent to not fitting a centre effect at all, whereas if it is infinite, it is equivalent to the fixed main effects model [4]. If the number of patients in centre i, i = 1, ..., k, on treatment j, j = 1, 2, is n_ij, the variance is thus proportional to
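The bounding argument can be sketched numerically. The weights below are the standard within-centre harmonic-mean weights for a two-arm comparison, which is an assumption consistent with the argument rather than a formula quoted from the letter, and the patient counts are invented:

```python
# Bounds on the variance of the treatment estimate from patient numbers alone.
# A random-centre-effect model sits between the "no centre effect" model
# (between-centre variance zero) and the fixed-centre-effects model
# (between-centre variance infinite).  Counts below are hypothetical.
n = [(10, 6), (8, 8), (12, 4)]  # (n_i1, n_i2) per centre

n1 = sum(a for a, _ in n)
n2 = sum(b for _, b in n)
# No centre effect: an ordinary two-sample comparison on the pooled totals.
var_pooled = 1 / n1 + 1 / n2
# Fixed centre effects: within-centre information only, harmonic-mean weights.
var_fixed = 1 / sum(a * b / (a + b) for a, b in n)

# The random-effects variance (up to sigma^2) must lie between these bounds,
# and the bounds depend only on the patient numbers.
print(var_pooled <= var_fixed)
```

With perfect balance in every centre the two bounds coincide and there is nothing to gain, which is the "except in cases of perfect balance" proviso above.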
Clinical Trials in Neurology, 2001
It would be impossible in the space of one brief chapter to provide the reader with even a basic education in medical statistics. It will be assumed, therefore, that the reader already has considerable familiarity with descriptive and inferential statistics and, in addition to knowing about means, medians, variances and standard deviations, has encountered the general framework of hypothesis tests, confidence intervals and so forth and particular applications of them for both continuous and binary outcomes. Suitable texts, in order of increasing difficulty, are those of Campbell and Machin [1], Altman [2] and Fisher and Van Belle [3]. For general advice on clinical trials Pocock [4] or, at a more advanced level, Piantadosi [5] are extremely useful. However, the science of medical statistics is developing rapidly and it may be useful for the trialist to have some overview of the current status and developing trends. This is all that will be attempted in this chapter. More extensive coverage of various statistical issues affecting drug development will be found in Senn [6]. The European Statistical Guideline [7] and International Conference on Harmonisation E9 Guideline are also extremely useful as reminders regarding points which should be covered in any analysis plan.
Research Synthesis Methods, 2011
The late Joe Fleiss was a highly original thinker. To take but one example, a brief letter of his in Biometrics (Fleiss, 1986), expressing a devastating criticism of the unrealistic models for cross-over trials favored by certain statisticians who prefer ingenuity to judgment, is more important than dozens of papers describing designs to 'deal' with carry-over that have appeared before and since. The third edition (Fleiss et al., 2003) of his classic and important text, Statistical Methods for Rates and Proportions, bears many marks of Fleiss's originality; but before discussing the text, the reader is owed some explanation as to why a review is justified of a book that appeared in 2003. In fact, the elapse of time since the third edition is now as great as that between the first (which appeared in 1972) and the second (which appeared in 1980). Indeed, Fleiss died as the book went to press, and it is clear that while keeping much of the spirit of his earlier work, it has benefitted greatly from the energy and sagacity of coauthors Bruce Levin and Myunghee Cho Paik. The reason that a review is justified, despite the progress that continues to be made in the analysis of binary data, is that the text is a classic that remains invaluable as a first port of call when seeking advice on methods for analyzing rates and proportions. Binary data in particular, despite their apparent simplicity, are notorious as a methodological minefield. Not all statisticians will agree that Fleiss is always right, but all will surely agree that to consult Fleiss is always valuable. A feature of the book is the way that it emphasizes design as well as analysis and also distinguishes between various designs as regards analyses deemed appropriate.
Hence, after introductory chapters covering the basics of probability, inference for a single proportion and inference for a fourfold table, Chapter 4 covers sample size determination and Chapter 5 how to randomize, followed by more detailed examinations in Chapters 6-8 of various designs, which, while they may all produce fourfold tables, are somewhat different and may require different approaches. (Although the extent to which this is considered important continues to be a lively area of statistical dispute.) Designs in which no margins are fixed, as well as clinical trials and cohort studies (in which the 'exposure' margins are fixed) and case-control studies (in which the outcome margins are fixed), are covered, as are the distinctions between matched and unmatched studies. Chapter 9 covers the comparison of proportions in several independent samples, Chapter 11 logistic regression and Chapter 12 Poisson regression. The issue of matching and alternatives to it is then picked up in much greater detail in Chapters 13 and 14, which lead on to the more general issue of considering correlated samples, which is covered in Chapter 15. Two chapters on problems of data quality then follow: Chapter 16 is on missing data and Chapter 17 deals with misclassification. Two final chapters pick up more specialist concerns: Chapter 18 is on inter-rater agreement and Chapter 19 on the standardization of rates. The attentive reader will have noticed that I seem to have skipped over Chapter 10 in the above account. This is not because it is missing from the book (although, due to a misprint, it is missing from the table of contents in my copy!) but because it is of most relevance to the reader of this journal: it is entitled 'Combining Evidence from Fourfold Tables' and therefore deserves to be considered in more detail. Many matters of importance are covered within this chapter, for example, estimating a common odds ratio using the empirical log-odds transform or the Mantel-Haenszel approach.
Also discussed are 'exact' procedures as well as various approaches to testing homogeneity. These methods and other fixed effects methods are not just presented, explained and illustrated but critically discussed. The chapter also has a nice discussion of the
BMJ (Clinical research ed.), Jan 17, 1998
The Lancet, 1998
In the blood: proposed new requirements for registering generic drugs. "You can't really say 'similar' if it's the same again you want. 'Similar' means something different." (Enderby Outside, Anthony Burgess)
Encyclopedia of Statistical Sciences, 2004
Statistics in Medicine, 2002
Statistics in Medicine, 2013
A key paper in modelling patient recruitment in multi-centre clinical trials is that of Anisimov and Fedorov. They assume that the distribution of the number of patients in a given centre in a completed trial follows a Poisson distribution. In a second stage, the unknown parameter is assumed to come from a Gamma distribution. As is well known, the overall Gamma-Poisson mixture is a negative binomial. For forecasting time to completion, however, it is not the frequency domain but the time domain that is important, and Anisimov and Fedorov have illustrated clearly the links between the two and the way in which a negative binomial in one corresponds to a type VI Pearson distribution in the other. They have also shown how one may use this to forecast time to completion in a trial in progress. However, it is necessary to forecast time to completion not just for trials in progress but also for trials that have yet to start. This suggests that it would be useful to add a higher level to the hierarchy: over all trials. We present one possible approach to doing this using an orthogonal parameterization of the Gamma distribution with parameters on the real line. The two parameters are modelled separately. This is illustrated using data from 18 trials. We make suggestions as to how this method could be applied in practice.
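The first two stages of the model can be sketched in simulation (the Gamma parameters are invented, and the paper's real contribution, the trial-level third stage and the time-domain forecasting, is not reproduced here): centre-level rates drawn from a Gamma distribution, counts Poisson given the rate, with the resulting marginal counts overdispersed in the negative binomial way.

```python
import numpy as np

rng = np.random.default_rng(2)

# Gamma-Poisson recruitment sketch.  Each centre's recruitment rate over a
# fixed window is Gamma-distributed; the patient count in that window is
# Poisson given the rate.  The marginal mixture is negative binomial.
shape, scale = 2.0, 5.0                    # assumed Gamma parameters (mean rate 10)
rates = rng.gamma(shape, scale, size=20_000)
counts = rng.poisson(rates)

mean, var = counts.mean(), counts.var()
# Negative binomial theory: variance = mean + mean**2 / shape  (here 10 + 50),
# so the variance-to-mean ratio should be near 6 rather than the Poisson 1.
print(round(mean, 1), round(var / mean, 1))
```

The overdispersion is the frequency-domain signature; the time-domain counterpart (waiting times following a type VI Pearson distribution) is what the forecasting in the paper rests on.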
Statistics in Medicine, 1997
Statistics in Medicine, 1991
The problem of carryover in crossover trials has received a great deal of attention in the statistical literature. Carryover is just one form of period by treatment interaction; yet a parallel problem of patient by treatment interaction, which may be regarded as dual to that of carryover, has received little attention. We suggest that the phenomenon of patient by treatment interaction requires a repeated measures approach to the analysis of crossover trials. A simple solution using predefined contrasts is presented and illustrated by example.
Statistics in Medicine, 1996
Various aspects of portfolio management and project prioritization within the pharmaceutical industry are examined. It is shown that the cost and probability architecture of a project is a crucial aspect of its value. An appropriate simple tool for ranking projects is the Pearson index. Various difficulties are considered.
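The abstract does not define the Pearson index, so the following sketch uses one common formulation, expected net gain per unit of expected cost, with later-stage costs incurred only if earlier stages succeed; treat both the formulation and the numbers as illustrative assumptions rather than the paper's own.

```python
# Pearson-index sketch (assumed formulation): expected net gain divided by
# expected cost for a staged project, where a stage's cost is only paid if
# all earlier stages succeed.  All numbers are invented.

def pearson_index(stage_costs, stage_probs, payoff):
    """Expected net gain per unit of expected cost for a staged project."""
    p_reach = 1.0        # probability of reaching (and paying for) this stage
    expected_cost = 0.0
    p_success = 1.0      # probability the whole project succeeds
    for cost, p in zip(stage_costs, stage_probs):
        expected_cost += p_reach * cost
        p_reach *= p
        p_success *= p
    expected_gain = p_success * payoff - expected_cost
    return expected_gain / expected_cost

# Two hypothetical projects competing for the same budget:
a = pearson_index([1.0, 2.0], [0.3, 0.5], payoff=60.0)    # cheap but risky
b = pearson_index([5.0, 10.0], [0.8, 0.9], payoff=60.0)   # costly but safe
print(a > b)
```

The point of the "cost and probability architecture" remark is visible here: the ranking depends not just on total cost and overall success probability but on how both are spread across stages, since a cheap early kill limits the expected spend.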
Statistics in Medicine, 1983
Statistical regression to the mean predicts that patients selected for abnormalcy will, on the average, tend to improve. We argue that most improvements attributed to the placebo effect are actually instances of statistical regression. First, whereas older clinical trials susceptible to regression resulted in a marked improvement in placebo-treated patients, in a modern series of clinical trials whose design tended to protect against regression, we found no significant improvement (median change 0.3 per cent, p > 0.05) in placebo-treated patients. Secondly, regression can yield sizeable improvements, even among biochemical tests. Among a series of 15 biochemical tests, theoretical estimates of the improvement due to regression by selection of patients as high abnormals (i.e. 3 standard deviations above the mean) ranged from 2.5 per cent for serum sodium to 26 per cent for serum lactate dehydrogenase (median 10 per cent); empirical estimates ranged from 3.8 per cent for serum chloride to 37.3 per cent for serum phosphorus (median 9.5 per cent). Thus, we urge caution in interpreting patient improvements as causal effects of our actions and should avoid the conceit of assuming that our personal presence has strong healing powers. KEY WORDS: Drug treatment; Placebo; Statistical regression; Computerized medical record; Clinical trial. Investigators have variously claimed that the placebo is powerful, that it meters its curative effects in proportion to the severity of the illness and that it influences both objective and subjective outcomes. Some authors have advocated a legitimate place for placebo therapy in patient care. Patients do tend to improve in association with placebo treatment. This association, however, does not by itself prove that the placebo treatment causes the improvement.
This paper considers the degree to which statistical regression toward the mean could account for the improvements associated with placebo therapy. We exclude from the scope of our discussion placebo therapy associated with intense conditioning or body invasion, i.e. needle sticks or surgical incisions. In the first case, the improvements can be attributed to Pavlovian mechanisms and in the second, to neuroendocrine mechanisms. Most medical prescribing is not associated with either of these circumstances. At the outset, we emphasize that our question regarding the strength of the placebo effect does
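The selection mechanism the paper describes can be sketched in a small simulation (all parameters invented): patients chosen because a noisy first reading is high show an apparent "improvement" on a second reading even though nothing at all was done to them.

```python
import random
import statistics

random.seed(3)

# Regression-to-the-mean sketch: two noisy measurements of the same stable
# underlying quantity.  True-value and noise SDs and the selection cutoff
# are arbitrary choices for illustration.
true_vals = [random.gauss(0, 1) for _ in range(50_000)]
first = [t + random.gauss(0, 1) for t in true_vals]
second = [t + random.gauss(0, 1) for t in true_vals]

# Select "abnormal" patients on the first reading only.
selected = [(f, s) for f, s in zip(first, second) if f > 2.0]
mean_first = statistics.fmean(f for f, _ in selected)
mean_second = statistics.fmean(s for _, s in selected)

# The retest mean is lower although no treatment intervened: the extreme
# first readings were partly noise, and the noise does not repeat.
print(mean_second < mean_first)
```

This is exactly why the "modern" trial designs mentioned above, which protect against regression (for example by confirming abnormality on repeated baseline readings), show little apparent placebo improvement.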
Statistics in Biopharmaceutical Research, 2011
... 1998) is based on the idea of ROC curves and has been published in the same journal (Brumback, Pepe, and Alonzo 2006). ... [See Stine and Heyse (2001) in the same journal for some further theoretical development of this idea and also Al-Saleh (2007) who attributed the ...