Statistical Significance Testing

The Role of Statistical Significance Testing In Educational Research

1998

The research methodology literature in recent years has included a full frontal assault on statistical significance testing. The purpose of this paper is to promote the position that, while significance testing as the sole basis for result interpretation is a fundamentally flawed practice, significance tests can be useful as one of several elements in a comprehensive interpretation of data. Specifically, statistical significance is but one of three criteria that must be demonstrated to establish a position empirically. Statistical significance merely provides evidence that an event did not happen by chance. However, it provides no information about the meaningfulness (practical significance) of an event or whether the result is replicable. Thus, we support other researchers who recommend that statistical significance testing must be accompanied by judgments of the event's practical significance and replicability.
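
The paper's central distinction, that a p value speaks only to chance and not to meaningfulness, is easy to demonstrate numerically. The following sketch is illustrative only (it is not from the paper; the synthetic data, group means, and standard deviations are assumptions chosen for the demonstration), using Python with numpy and scipy:

    # Illustration: with a very large sample, a trivial group difference is
    # "statistically significant" while the effect size stays negligible.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 100_000
    control = rng.normal(loc=100.0, scale=15.0, size=n)    # assumed scores
    treatment = rng.normal(loc=100.5, scale=15.0, size=n)  # half-point shift

    t, p = stats.ttest_ind(treatment, control)
    pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
    d = (treatment.mean() - control.mean()) / pooled_sd    # Cohen's d

    print(f"p = {p:.1e}")   # far below .05: the test "succeeds"
    print(f"d = {d:.3f}")   # about 0.03: practically meaningless

A reader applying only the first criterion would call this result important; the negligible effect size shows why the paper treats practical significance as a separate judgment.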

Statistical Significance Testing: A Historical Overview of Misuse and Misinterpretation with Implications for the Editorial Policies of Educational Journals

Statistical significance tests (SSTs) have been the object of much controversy among social scientists. Proponents have hailed SSTs as an objective means for minimizing the likelihood that chance factors have contributed to research results; critics have both questioned the logic underlying SSTs and bemoaned the widespread misapplication and misinterpretation of the results of these tests. The present paper offers a framework for remedying some of the common problems associated with SSTs via modification of journal editorial policies. The controversy surrounding SSTs is overviewed, with attention given to both historical and more contemporary criticisms of bad practices associated with misuse of SSTs. Examples from the editorial policies of Educational and Psychological Measurement and several other journals that have established guidelines for reporting results of SSTs are reviewed, and suggestions are provided regarding additional ways that educational journals may address the problem.

Lack of statistical significance

Psychology in the Schools, 2007

Criticism has been leveled against the use of statistical significance testing (SST) in many disciplines. However, the field of school psychology has been largely devoid of critiques of SST. Inspection of the primary journals in school psychology indicated numerous examples of SST with nonrandom samples and/or samples of convenience. In this article we present an argument against SST and its consequent p values in favor of the use of confidence intervals and effect sizes. Further, we present instances of common errors that impede cumulative knowledge in the literature related to school psychology.
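
As a concrete version of the alternative the authors advocate, the sketch below reports a confidence interval and an effect size instead of a bare p value. It is a minimal illustration under assumed data, not an analysis from the article:

    # Report an interval estimate and a standardized effect size rather
    # than a dichotomous "significant / not significant" verdict.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(50, 10, size=40)   # hypothetical comparison group
    y = rng.normal(55, 10, size=40)   # hypothetical intervention group

    diff = y.mean() - x.mean()
    se = np.sqrt(x.var(ddof=1)/len(x) + y.var(ddof=1)/len(y))
    df = len(x) + len(y) - 2
    half_width = stats.t.ppf(0.975, df) * se   # 95% two-sided interval

    pooled_sd = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
    print(f"difference = {diff:.2f}, "
          f"95% CI [{diff - half_width:.2f}, {diff + half_width:.2f}], "
          f"d = {diff / pooled_sd:.2f}")

The interval conveys both the direction and the precision of the estimate, which is the cumulative-knowledge advantage the authors argue p values lack.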

Use of Tests of Statistical Significance and Other Analytic Choices in a School Psychology Journal: Review of Practices and Suggested Alternatives

School Psychology Quarterly, 1998

The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association (APA) to consider recommendations leading to improved practice. Related practices were reviewed in seven volumes of the "School Psychology Quarterly," an APA journal. This review found that some contemporary authors continue to use and interpret statistical significance tests inappropriately. The 35 articles reviewed reported a total of 321 statistical tests for which sufficient information was provided for effect sizes to be computed, but the authors of only 19 articles reported magnitude-of-effect indices. Suggestions for improved practice are explored, beginning with the need to interpret statistical significance tests correctly, using more accurate language, and the need to report and interpret magnitude-of-effect indices. Editorial policies must continue to evolve to require authors to meet these expectations.
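
The review's count of tests "for which sufficient information was provided for effect sizes to be computed" rests on standard conversions from reported test statistics. The helpers below show two such conversions; the formulas are the usual textbook ones, not taken from the reviewed articles, and the example numbers are invented:

    import math

    def d_from_t(t, n1, n2):
        """Cohen's d recovered from an independent-samples t statistic."""
        return t * math.sqrt(1/n1 + 1/n2)

    def eta_squared_from_t(t, df):
        """Proportion of variance explained, recovered from t and df."""
        return t**2 / (t**2 + df)

    # A hypothetical reported result, t(58) = 2.10 with 30 per group:
    print(round(d_from_t(2.10, 30, 30), 2))        # ~0.54
    print(round(eta_squared_from_t(2.10, 58), 3))  # ~0.071

This is why reporting t (or F) together with sample sizes suffices: a reader can always recover an effect size, even when the authors omit one.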

The Magical Influence of Statistical Significance

This paper examined 1122 statistical tests found in 55 master's theses accredited during 1995-2000 at Mu'tah University. It tried to answer two main questions: First, do researchers still rely on the level of significance (α) as the only criterion to judge the importance of relations and differences? Second, to what extent can practical significance be found alongside statistical significance? Results showed that researchers do treat statistical significance as the only criterion for judging the importance of their findings: 74.33% of the statistically significant tests had only small practical significance, and just 10.27% had large practical significance.
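
Judging whether a significant test also has practical significance requires an explicit benchmark. One common operationalization (an assumption here, since the paper's own cutoffs are not given in the abstract) is Cohen's conventional thresholds:

    def classify_d(d):
        """Label an effect size using Cohen's (1988) benchmarks for |d|."""
        d = abs(d)
        if d < 0.2:
            return "negligible"
        if d < 0.5:
            return "small"
        if d < 0.8:
            return "medium"
        return "large"

    print(classify_d(0.15))  # "negligible", even when p < .05

Under a scheme like this, the paper's finding reads as: roughly three quarters of the "significant" tests in the surveyed theses fell below the medium band.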

Why We Don’t Really Know What “Statistical Significance” Means: A Major Educational Failure

The Neyman–Pearson theory of hypothesis testing, with the Type I error rate, α, as the significance level, is widely regarded as statistical testing orthodoxy. Fisher’s model of significance testing, where the evidential p value denotes the level of significance, nevertheless dominates statistical testing practice. This paradox has occurred because these two incompatible theories of classical statistical testing have been anonymously mixed together, creating the false impression of a single, coherent model of statistical inference. We show that this hybrid approach to testing, with its misleading p < α statistical significance criterion, is common in marketing research textbooks, as well as in a large random sample of papers from twelve marketing journals. That is, researchers attempt the impossible by simultaneously interpreting the p value as a Type I error rate and as a measure of evidence against the null hypothesis. The upshot is that many investigators do not know what our most cherished, and ubiquitous, research desideratum, “statistical significance,” really means. This, in turn, signals an educational failure of the first order. We suggest that tests of statistical significance, whether p’s or α’s, be downplayed in statistics and marketing research courses. Classroom instruction should focus instead on teaching students to emphasize the use of confidence intervals around point estimates in individual studies, and the criterion of overlapping confidence intervals when one has estimates from similar studies.
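
The closing recommendation, overlapping confidence intervals across similar studies, can be stated in a few lines. This is a minimal sketch of that criterion with invented study estimates (note that non-overlap is a conservative signal of a difference, not an exact α-level test):

    def ci95(estimate, se):
        """Approximate 95% confidence interval around a point estimate."""
        return (estimate - 1.96 * se, estimate + 1.96 * se)

    def intervals_overlap(a, b):
        """True if two intervals share at least one value."""
        return a[0] <= b[1] and b[0] <= a[1]

    study_1 = ci95(0.40, 0.10)   # hypothetical effect estimate and SE
    study_2 = ci95(0.55, 0.12)
    print(study_1, study_2, intervals_overlap(study_1, study_2))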

Literature Review: Basic Statistics in Educational Research

Cognizance Journal of Multidisciplinary Studies (CJMS), 2024

This paper provides a comprehensive exploration of the role and significance of basic statistics in educational research. Drawing upon historical foundations and theoretical frameworks, the study elucidates the evolution of statistical methodologies and their application in understanding complex educational phenomena. Key pioneers such as Francis Galton and Karl Pearson are highlighted for their foundational contributions, laying the groundwork for statistical analysis in education. The paper examines various statistical techniques, including descriptive and inferential statistics, correlation analysis, regression analysis, and analysis of variance (ANOVA), showcasing their relevance in analyzing diverse aspects of teaching, learning, and educational outcomes. Moreover, the integration of statistical methodologies into educational research is examined across domains such as educational management, policy-making, and teaching practice, emphasizing their role in informed decision-making, resource allocation, and policy formulation. Despite the benefits, challenges such as sample bias and misapplication of statistical tests are acknowledged, underscoring the importance of methodological rigor and statistical literacy among researchers and practitioners. Looking ahead, the paper discusses future directions in educational research, including the integration of advanced statistical techniques and efforts to promote statistical literacy. Overall, this paper offers a comprehensive framework for understanding the complexities of basic statistics in educational research, highlighting its transformative potential in driving positive change and innovation within the education sector.
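
The inferential techniques the review enumerates (correlation analysis, regression analysis, ANOVA) can all be run from one small script. The sketch below uses synthetic "study hours vs. exam score" data as a stand-in for the educational datasets the paper discusses; the variable names and numbers are assumptions:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    hours = rng.uniform(0, 10, size=60)                # hypothetical study hours
    score = 50 + 3*hours + rng.normal(0, 8, size=60)   # hypothetical exam scores

    r, p_r = stats.pearsonr(hours, score)       # correlation analysis
    fit = stats.linregress(hours, score)        # simple linear regression
    g1, g2, g3 = np.split(score, 3)             # three arbitrary class groups
    f, p_f = stats.f_oneway(g1, g2, g3)         # one-way ANOVA

    print(f"r = {r:.2f} (p = {p_r:.3f}), slope = {fit.slope:.2f}, "
          f"F = {f:.2f} (p = {p_f:.3f})")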

Caveats for using statistical significance tests in research assessments

This paper raises concerns about the advantages of using statistical significance tests in research assessments, as has recently been suggested in the debate about proper normalization procedures for citation indicators. Statistical significance tests are highly controversial and numerous criticisms have been leveled against their use. Based on examples from articles by proponents of the use of statistical significance tests in research assessments, we address some of the numerous problems with such tests. The issues specifically discussed are the ritual practice of such tests, their dichotomous application in decision making, the difference between statistical and substantive significance, the implausibility of most null hypotheses, the crucial assumption of randomness, as well as the utility of standard errors and confidence intervals for inferential purposes. We argue that applying statistical significance tests and mechanically adhering to their results is highly problematic and detrimental to critical thinking. We claim that the use of such tests does not provide any advantages in relation to citation indicators, interpretations of them, or the decision-making processes based upon them. On the contrary, their use may be harmful. Like many other critics, we generally believe that statistical significance tests are over- and misused in the social sciences, including scientometrics, and we encourage reform on these matters.
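
The "ritual practice" and "dichotomous application" the authors criticize can be made concrete with a simulation: when the null hypothesis is true by construction, the p < .05 ritual still flags about one comparison in twenty. This is an illustrative sketch, not an analysis from the paper:

    # Simulate many comparisons where both groups come from the SAME
    # population, so every "significant" result is a false alarm.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    trials = 10_000
    flags = 0
    for _ in range(trials):
        a = rng.normal(size=30)
        b = rng.normal(size=30)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            flags += 1

    print(flags / trials)   # close to 0.05, by design of the ritual itself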