On the Misuse of Statistics: A Reply to Hirschfeld et al
Related papers
The harm done by tests of significance
Accident Analysis & Prevention, 2004
Three historical episodes in which the application of null hypothesis significance testing (NHST) led to the misinterpretation of data are described. It is argued that the pervasive use of this statistical ritual impedes the accumulation of knowledge and is unfit for use.
Editorial: Epistemological and Ethical Aspects of Research in the Social Sciences
Frontiers in Psychology
Editorial on the Research Topic Epistemological and Ethical Aspects of Research in the Social Sciences. This Research Topic focuses on the questions "behind" empirical research in the social sciences, especially in psychology, sociology, and education, and presents various ideas about the nature of empirical knowledge and the values knowledge is or should be based on. The questions raised in the contributions are central to empirical research, especially with respect to disciplinary and epistemological diversity among researchers. This diversity is also mirrored by the variety of article types collected in this issue: "Hypotheses & Theory," "Methods," "Conceptual Analyses," "Review," "Opinion," "Commentary," and "Book Review." Krueger and Heck explore in their "Hypotheses & Theory" article "The Heuristic Value of p in Inductive Statistical Inference." Taking up a very lively debate on the significance of null hypothesis testing, they explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. They furthermore investigate the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, they find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. Krueger and Heck conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics to communicate what they think the data say.
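The kind of question Krueger and Heck ask can be illustrated with a small Monte Carlo sketch (a hypothetical illustration written for this summary, not their actual code, design, or parameter choices): simulate many studies with varying true effects, test each twice, and compare replication rates across bands of the initial p-value.

```python
import random
import math

# Hypothetical simulation in the spirit of the Krueger and Heck question
# (not their code): does a smaller initial p-value predict a higher chance
# that an exact replication is also significant? All parameters below
# (effect-size prior, sample size, bands) are arbitrary choices.

def z_test_p(xbar, n):
    """Two-sided p-value for a z-test of mean 0 with known sd 1."""
    z = abs(xbar) * math.sqrt(n)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def replication_rate(p_low, p_high, n=20, studies=100_000, seed=7):
    """Among studies whose first p-value falls in [p_low, p_high),
    return the fraction whose exact replication has p < 0.05."""
    random.seed(seed)
    hits = reps = 0
    for _ in range(studies):
        effect = random.gauss(0.0, 0.5)                 # true effect varies by study
        first = random.gauss(effect, 1 / math.sqrt(n))  # first sample mean
        if p_low <= z_test_p(first, n) < p_high:        # condition on initial p
            hits += 1
            second = random.gauss(effect, 1 / math.sqrt(n))
            if z_test_p(second, n) < 0.05:
                reps += 1
    return reps / hits

print("p in [0.001, 0.01):", round(replication_rate(0.001, 0.01), 2))
print("p in [0.01, 0.05):", round(replication_rate(0.01, 0.05), 2))
```

Under these assumptions the lower p-value band replicates more often, which is the sense in which p can act as a heuristic cue; the gap between the bands, and hence the cue's usefulness, depends on the assumed distribution of true effects.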
The argumentation of this article is flanked by a "Comment" on the article "The Need for Bayesian Hypothesis Testing in Psychological Science" (Wagenmakers et al., 2017) by Perezgonzalez. He argues that Wagenmakers et al. fail to demonstrate the illogical nature of p-values, while, secondarily, they succeed in defending the philosophical consistency of the Bayesian alternative. He comments on their interpretation of the logic underlying p-values without necessarily invalidating their Bayesian arguments. A second contribution by Perezgonzalez et al. comments on epistemological, ethical, and didactical contributions to the debate on null hypothesis significance testing, chief among them ideas about falsificationism, statistical power, dubious statistical practices, and publication bias presented by Heene and Ferguson (2017). The authors of this commentary conclude that frequentist approaches only deal with the probability of the data under H0 [p(D|H0)]. If anything about the (posterior) probability of the hypotheses is at question, then a Bayesian approach is needed in order to confirm which hypothesis is most likely given both the likelihood of the data and the prior probabilities of the hypotheses themselves.
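The distinction the commentary turns on, between p(D|H0) and the posterior probability p(H0|D), can be made concrete with a minimal Bayes'-theorem sketch (all numbers below are hypothetical and not drawn from any of the papers):

```python
# Minimal sketch of the p(D|H0) vs p(H0|D) distinction. The likelihoods
# and priors here are invented for illustration only.

def posterior_h0(p_data_given_h0, p_data_given_h1, prior_h0):
    """p(H0|D) = p(D|H0)p(H0) / [p(D|H0)p(H0) + p(D|H1)p(H1)]."""
    prior_h1 = 1.0 - prior_h0
    numerator = p_data_given_h0 * prior_h0
    return numerator / (numerator + p_data_given_h1 * prior_h1)

# The same likelihood under H0 yields very different posteriors
# depending on the prior probability of H0:
for prior in (0.5, 0.9):
    post = posterior_h0(p_data_given_h0=0.05, p_data_given_h1=0.60, prior_h0=prior)
    print(f"prior p(H0) = {prior:.1f}  ->  posterior p(H0|D) = {post:.3f}")
```

This is the commentary's point in miniature: the frequentist quantity p(D|H0) is fixed at 0.05 in both rows, yet the posterior probability of H0 changes substantially with the prior, so a statement about which hypothesis is most likely cannot come from p(D|H0) alone.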
All evidence is equal: the flaw in statistical reasoning
Oxford Review of Education, 2010
In the context of existing 'quantitative'/'qualitative' schisms, this paper briefly reminds readers of the current practice of testing for statistical significance in social science research. This practice is based on a widespread confusion between two conditional probabilities. A worked example and other elements of logical argument demonstrate the flaw in statistical testing as currently conducted, even when strict protocols are met. Assessment of significance cannot be standardised and requires knowledge of an underlying figure that the analyst does not generally have and cannot usually know. Therefore, even if all assumptions are met, the practice of statistical testing in isolation is futile. The question many people then ask in consequence is: what should we do instead? This is, perhaps, the wrong question. Rather, the question could be: why should we expect to treat randomly sampled figures differently from any other kinds of numbers, or any other forms of evidence? What we could do 'instead' is use figures in the same way as we would most other data, with care and judgement. If all such evidence is equal, the implications for research synthesis and the way we generate new knowledge are considerable.
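One common reading of the "underlying figure that the analyst does not generally have" is the share of tested hypotheses for which the null is actually true. A hypothetical simulation (not the paper's worked example; sample size, effect size, and base rates are all assumptions) shows how P(H0 true | significant) moves with that unknown share even while alpha stays fixed at 0.05:

```python
import random
import math

# Hypothetical illustration of the two confused conditional probabilities:
# P(significant | H0 true) is fixed by alpha, but P(H0 true | significant)
# depends on the unknown proportion of true nulls among tested hypotheses.

def false_discovery_share(prop_true_nulls, n_tests=100_000, effect=1.0, n=25, seed=1):
    """Fraction of 'significant' z-test results for which H0 was true."""
    random.seed(seed)
    crit = 1.96  # two-sided z critical value for alpha = 0.05
    sig_null = sig_alt = 0
    for _ in range(n_tests):
        null_is_true = random.random() < prop_true_nulls
        mean = 0.0 if null_is_true else effect
        xbar = random.gauss(mean, 1.0 / math.sqrt(n))  # sample mean, sd 1, n obs
        if abs(xbar) * math.sqrt(n) > crit:            # "statistically significant"
            if null_is_true:
                sig_null += 1
            else:
                sig_alt += 1
    return sig_null / (sig_null + sig_alt)

for share in (0.2, 0.8):
    print(f"true nulls = {share:.0%}  ->  P(H0 true | significant) = "
          f"{false_discovery_share(share):.3f}")
```

Because the second probability depends on a base rate the analyst rarely knows, a significant result alone does not license a standardised conclusion about H0, which is the sense in which the paper calls testing in isolation futile.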
Warfare, Infanticide, and Statistical Inference: A Comment on Divale and Harris
American Anthropologist, 1978
depicting the prejudices and inequality which African women have endured in urban areas, and the overwhelming influence colonialism has had on urban Africans. Reference Cited: Little, Kenneth. 1976. Women in African Towns South of the Sahara: The Urbanization Dilemma. In Women and World Development. Irene Tinker and Michele Bo Bramsen, eds. Pp. 78-87. Washington: American Association for the Advancement of Science.
The Quest for Statistical Significance: Ignorance, Bias and Malpractice of Research Practitioners
https://www.ijrrjournal.com/IJRR_Vol.5_Issue.3_March2018/Abstract_IJRR0014.html, 2018
There is a growing body of evidence on the prevalence of ignorance, biases and malpractice among researchers which questions the authenticity, validity and integrity of the knowledge being propagated in professional circles. The push for academic relevance and career advancement has driven some research practitioners into committing gross misconduct in the form of innocent ignorance, sloppiness, malicious intent and outright fraud. These, among other concerns around research data handling and reporting, form the basis for this in-depth review. This discourse also draws attention to the recent official statement on the correct use of the p-value and the need for professional intervention in ensuring that the outcomes of research are neither erroneous nor misleading. The expositions in this review express cogent implications for institutions, supervisors, mentors, and editors to promote high ethical standards and rigor in scientific investigations.
Sociological Research Online, 2016
This paper is a response to Gorard's article, ‘Damaging real lives through obstinacy: re-emphasising why significance testing is wrong’ in Sociological Research Online 21(1). For many years Gorard has criticised the way hypothesis tests are used in social science, but recently he has gone much further and argued that the logical basis for hypothesis testing is flawed: that hypothesis testing does not work, even when used properly. We have sympathy with the view that hypothesis testing is often carried out in social science contexts when it should not be, and that outcomes are often described in inappropriate terms, but this does not mean the theory of hypothesis testing, or its use, is flawed per se. There needs to be evidence to support such a contention. Gorard claims that: ‘Anyone knowing the problems, as described over one hundred years, who continues to teach, use or publish significance tests is acting unethically, and knowingly risking the damage that ensues.’ This is a v...