Conditionality principle
From Wikipedia, the free encyclopedia
The conditionality principle is a Fisherian principle of statistical inference that Allan Birnbaum formally defined and studied in an article in the Journal of the American Statistical Association, Birnbaum (1962).
Informally, the conditionality principle can be taken as the claim that
Experiments which were not actually performed are not relevant to any statistical analysis
and the implicit admonition that unrealized experiments should be ignored: Not included as part of any calculation or discussion of results.
Together with the sufficiency principle, Birnbaum's version of the principle implies the famous likelihood principle. Although the relevance of the proof to data analysis remains controversial among statisticians, many Bayesians and likelihoodists consider the likelihood principle foundational for statistical inference.
Historical background
Some statisticians in the mid 20th century had proposed that a valid statistical analysis must include all of the possible experiments which might have been conducted: for example, a series of desired experiments, each of which requires some uncertain opportunity in order to be carried out. The uncertain factor could be something such as good weather for a timely astronomical observation (the "experiment" being the search of the telescopic image for traces of some type of object), or the availability of more data resources, such as the chance discovery of some new fossil that would provide more evidence to answer a question covered by another paleontological study. Another resource issue might be the need for special access to private data (patients' medical records, for example) from one of several possible institutions, most of which would be expected to refuse permission; the nature of the data that could be provided, and the correct statistical model for its analysis, would depend on which institution granted access and how it had collected and curated the private data that might become available for a study (technically, in this case the "experiment" has already been conducted by the medical facility, and some other party is analyzing the collected data to answer its own research question).
All these examples illustrate normal issues of how uncontrolled chance determines the nature of the experiment that can actually be conducted. Some analyses of the statistical significance of the outcomes of particular experiments incorporated the consequences such chance events had on the data that was obtained. Many statisticians were uncomfortable with the idea, and tended to tacitly skip seemingly extraneous random effects in their analyses; many scientists and researchers were baffled by a few statisticians' elaborate efforts to consider circumstantial effects in the statistical analysis of their experiments which the researchers considered irrelevant.[_citation needed_]
A few statisticians in the 1960s and 1970s took the idea even further, and proposed that an experiment could deliberately design-in a random factor, usually by introducing the use of some ancillary statistic, $h$, like the roll of a die, or the flip of a coin, and that the contrived random event could later be included in the data analysis, and somehow improve the inferred significance of the observed outcome. Most statisticians were uncomfortable with the idea, and the overwhelming majority of scientists and researchers considered it preposterous, and continue to the present to refute the idea and to reject any analyses based on it.[_citation needed_]
The conditionality principle is a formal rejection of the idea that "the road not taken" can possibly be relevant: In effect, it banishes from statistical analysis any consideration of effects from details of designs for experiments that were not conducted, even if they might have been planned or prepared for. The conditionality principle throws out all speculative considerations about what might have happened, and only allows the statistical analysis of the data obtained to include the procedures, circumstances, and details of the particular experiment actually conducted that produced the data actually collected. Experiments merely contemplated and not conducted, or missed opportunities for plans to obtain data, are all irrelevant and statistical calculations that include them are presumptively wrong.
The conditionality principle makes an assertion about a composite experiment, $E$, that can be described as a suite or assemblage of several constituent experiments $E_h$; the index $h$ is some ancillary statistic, i.e. a statistic whose probability distribution does not depend on any unknown parameter values. This means that obtaining an observation of some specific outcome $x$ of the whole experiment $E$ requires first observing a value for $h$, and then taking an observation $x_h$ from the indicated component experiment $E_h$.
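The two-stage structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unknown parameter `theta`, the use of a fair die for h, and the choice of binomial component experiments with the sample sizes in `SAMPLE_SIZES` are all hypothetical and do not come from Birnbaum's paper.

```python
import random

# Hypothetical sample sizes for the six component experiments E_1..E_6.
SAMPLE_SIZES = [5, 10, 20, 40, 80, 160]

def run_mixture_experiment(theta, sample_sizes, rng):
    """Run the composite experiment E: observe h, then run only E_h."""
    # Stage 1: the ancillary statistic h is a fair die roll.
    # Its distribution does not involve the unknown parameter theta.
    h = rng.randint(1, len(sample_sizes))
    # Stage 2: only the selected component experiment E_h is performed.
    # Assumption: E_h counts successes in n_h Bernoulli(theta) trials.
    n_h = sample_sizes[h - 1]
    x_h = sum(rng.random() < theta for _ in range(n_h))
    return h, x_h

rng = random.Random(0)  # fixed seed so the sketch is reproducible
h, x_h = run_mixture_experiment(0.3, SAMPLE_SIZES, rng)
```

The outcome of E is the pair (h, x_h); the other five component experiments are never run and contribute no data.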
The conditionality principle can be formally stated thus:
Conditionality Principle:
If $E$ is any experiment having the form of a mixture of component experiments $E_h$, then for each outcome $(E_h, x_h)$ of $E$, the evidential meaning of any outcome $x$ of any mixture experiment $E$ is the same as that of the corresponding outcome $x_h$ of the corresponding component experiment $E_h$ actually conducted, ignoring the overall structure of the mixed experiment; see Birnbaum (1962).
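In symbols, the statement above is often abbreviated as a single equation. The notation $\operatorname{Ev}(\cdot,\cdot)$ for "evidential meaning" follows Birnbaum (1962); the typesetting here is a sketch and not a quotation from the paper.

```latex
\operatorname{Ev}\bigl(E,\,(E_h, x_h)\bigr) \;=\; \operatorname{Ev}\bigl(E_h,\, x_h\bigr)
```

The left-hand side refers to the outcome regarded as a result of the whole mixture experiment $E$; the right-hand side refers to the same data regarded as a result of the component experiment $E_h$ alone.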
An illustration of the conditionality principle, in a bioinformatics context, is given by Barker (2014).
Example scenario
The ancillary statistic $h$ could be the roll of a die, whose value will be one of $h = 1, \ldots, 6$. This random selection of an experiment is actually a wise precaution to curb the influence of a researcher's biases, if there is reason to suspect that the researcher might consciously or unconsciously select an experiment that seems likely to produce data that supports a favored hypothesis. The result of the die roll then determines which of six possible experiments $E_1, \ldots, E_6$ is the one actually conducted to obtain the study's data.
Say that the die rolls a '3'. In that case, the result observed for $x$ is actually $x_3$, the outcome of experiment $E_3$. None of the other five experiments $E_1, E_2, E_4, E_5,$ or $E_6$ is ever conducted, and none of the other possible results $x_1, x_2, x_4, x_5,$ or $x_6$, which might have been observed if some number other than '3' had come up, is ever seen. The actual observed outcome, $x_3$, is unaffected by any aspect of the other five sub-experiments that were not carried out. Only the procedures and experimental design of $E_3$, the sub-experiment that was conducted to collect the data $x_3$, have any bearing on the statistical analysis of the outcome, regardless of the fact that the designs for the other experiments had been prepared at the time the actual experiment $E_3$ was conducted, and might just as likely have been performed.
The conditionality principle says that all of the details of $E_1, E_2, E_4, E_5,$ and $E_6$ must be excluded from the statistical analysis of the actual observation $x_3$, as must the fact that experiment 3 was chosen by the roll of a die. Further, none of the possible randomness brought into the outcome by the statistic $h$ (the die roll) can be included in the analysis either. The only thing that determines the correct statistics to be used for the data analysis is experiment $E_3$, and the only data to consider is $x_3$, not $h = 3$.
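One way to see that the die-roll example is at least consistent with likelihood-based reasoning is to compare the likelihood of the full mixture outcome (h = 3, x_3) with the likelihood of x_3 under E_3 alone. Because the probability of h = 3 is 1/6 whatever the parameter is, the two differ only by a constant factor. The sketch below checks this numerically; the binomial form of each component experiment, the sample sizes, and the observed count of 7 are hypothetical choices made only for illustration.

```python
from math import comb

# Hypothetical design: E_h records successes in SAMPLE_SIZES[h] Bernoulli trials.
SAMPLE_SIZES = {1: 5, 2: 10, 3: 20, 4: 40, 5: 80, 6: 160}
P_H = 1 / 6  # fair die: probability of each h, free of theta

def component_likelihood(theta, h, x_h):
    """Likelihood of theta from x_h, treating E_h as the whole experiment."""
    n = SAMPLE_SIZES[h]
    return comb(n, x_h) * theta**x_h * (1 - theta) ** (n - x_h)

def mixture_likelihood(theta, h, x_h):
    """Likelihood of theta from (h, x_h), treating the mixture E as the experiment."""
    return P_H * component_likelihood(theta, h, x_h)

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

h, x_3 = 3, 7  # the die shows 3; E_3 yields a hypothetical 7 successes out of 20
grid = [i / 100 for i in range(1, 100)]
conditional = normalize([component_likelihood(t, h, x_3) for t in grid])
full = normalize([mixture_likelihood(t, h, x_3) for t in grid])

# After normalization the two likelihood curves coincide up to rounding error.
max_gap = max(abs(a - b) for a, b in zip(conditional, full))
best_theta = grid[conditional.index(max(conditional))]
```

The sample sizes of the five experiments that were not run never enter either calculation. This shows only that the likelihood function is unchanged; the conditionality principle itself is the stronger claim about evidential meaning.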
- Barker, D. (2014). "Seeing the wood for the trees: Philosophical aspects of classical, Bayesian and likelihood approaches in statistical inference and some implications for phylogenetic analysis". Biology and Philosophy. 30 (4): 505–525. doi:10.1007/s10539-014-9455-x. hdl:10023/6999. S2CID 54867268.
- Berger, J.O.; Wolpert, R.L. (1988). The Likelihood Principle (2nd ed.). Hayward, CA: The Institute of Mathematical Statistics. ISBN 978-0-940600-13-3.
- Birnbaum, A. (1962). "On the foundations of statistical inference". Journal of the American Statistical Association. 57 (298): 269–326. doi:10.2307/2281640. JSTOR 2281640. MR 0138176. With discussion.
- Kalbfleisch, J.D. (1975). "Sufficiency and conditionality". Biometrika. 62 (2): 251–259. doi:10.1093/biomet/62.2.251.