Statistics in publishing: the (mis)use of the p-value (part 1)
PloS one, 2018
P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often contain a paper's main results and are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and 1997, respectively. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles rely...
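The multiplicity corrections that this survey tracks control the family-wise error rate when many hypotheses are tested at once. A minimal sketch of the two most common procedures, Bonferroni and Holm (function names here are illustrative, not from the paper):

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni: reject H0_i only if p_i <= alpha / m, where m is the
    number of tests. Simple but conservative."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm step-down: sort p-values ascending and compare the k-th
    smallest to alpha / (m - k). Controls the same family-wise error
    rate as Bonferroni but rejects at least as often."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one comparison fails, all larger p-values fail too
    return reject

ps = [0.001, 0.012, 0.03, 0.04]
print(bonferroni(ps))  # each p compared to 0.05 / 4 = 0.0125
print(holm(ps))
```

Note that 0.03 and 0.04 are "significant" at the nominal 0.05 level but not after either correction, which is why the prevalence of such adjustments matters for the proportion of significant results reported.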
2018
For almost a century after its introduction, the p-value has remained the most frequently used inferential tool of statistical science for steering research across scientific domains. This ubiquitous and powerful statistic is now under fire, surrounded by numerous controversies. We here review some of the important papers that highlight the prevailing myths, misunderstandings and controversies about this statistic. We also discuss recent developments by the American Statistical Association (ASA) in interpreting the p-value and guiding researchers to avoid confusion. Our paper is based on a search of selected databases, and we do not claim it to be an exhaustive review. It specifically aims to help medical researchers and professionals who have little background in this contentious statistic and have been chasing it indiscriminately in pursuit of publishing significant findings.
The arbitrary magic of p<0.05: Beyond statistics
Journal of B.U.ON. : official journal of the Balkan Union of Oncology, 2020
Modern research and scientific conclusions are widely regarded as valid when the study design and analysis are interpreted correctly. The p-value is considered the most commonly used method to provide a dichotomy between true and false data in evidence-based medicine. However, many authors, reviewers and editors may be unfamiliar with the true definition and correct interpretation of this number. This article points out how misunderstanding or misuse of this value can have an impact on both the scientific community and the society we live in. The foundation of the medical education system rewards the abundance of scientific papers rather than the careful search for truth. Appropriate research ethics should be practised at all stages of the publication process.
Misbeliefs and Preconceptions Regarding P-Value
Medical journals abound with "P values" and "tests of hypotheses." It is common practice among medical researchers to quote whether the "test of hypothesis" they carried out is significant or non-significant, and many researchers get very enthusiastic when they discover a "statistically significant" finding without really understanding what it means. Additionally, while medical journals are florid with statements such as "statistically significant," "unlikely due to chance," "not significant," "due to chance," or notations such as "P > 0.05" and "P < 0.05," the decision on whether a "test of hypothesis" is significant or not based on the P value has generated an intense debate among statisticians. The P value is probably the most ubiquitous and, at the same time, misunderstood, misinterpreted, and occasionally miscalculated index.
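The sharp "P > 0.05" versus "P < 0.05" dichotomy criticized above can be made concrete with an exact binomial test, written here in plain Python as an illustrative sketch (not taken from the article): a p-value is the probability, under the null hypothesis, of data at least as extreme as what was observed, and two nearly identical datasets can land on opposite sides of the 0.05 line.

```python
from math import comb

def binom_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial test: sum the probabilities of all
    outcomes that are no more likely than the observed count k
    under H0: success probability = p0."""
    pmf = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    obs = pmf[k]
    return sum(p for p in pmf if p <= obs + 1e-12)

# 60 heads versus 61 heads in 100 tosses of a putatively fair coin:
# almost the same evidence, yet one result is "non-significant" and
# the other "significant" at the conventional 0.05 threshold.
print(round(binom_two_sided_p(60, 100), 4))
print(round(binom_two_sided_p(61, 100), 4))
```

The first p-value falls just above 0.05 and the second just below it, even though the underlying data differ by a single coin toss, which is precisely the kind of artificial true/false split the debate is about.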
p-value based statistical significance tests: Concepts, misuses, critiques, solutions and beyond
Computational Ecology and Software, 2022
Free download at: http://www.iaees.org/publications/journals/ces/articles/2022-12(3)/p-value-based-statistical-significance-tests.pdf
The p-value is at the heart of statistical significance tests, an important issue concerning the role of statistical inference in advancing scientific discovery. Over the past few decades, p-value based statistical significance tests have been widely used in most statistics-related research papers, textbooks, and statistical software around the world. Numerous scientists across disciplines hold the p-value to be the gold standard for statistical significance. In recent years, however, p-value based significance tests have been questioned as never before, mainly on the grounds that the paradigm of significance testing is wrong, the p-value is too sensitive, the p-value is a dichotomous and subjective index, and statistical significance depends on sample size. Scientific research can only be falsified, not confirmed, and p-value based significance tests are one source of false conclusions and of the research reproducibility crisis. For this reason, many statisticians advocate abandoning p-value based significance tests and replacing them with effect sizes, Bayesian methods, meta-analysis, and the like. Scientific inference that combines statistical testing with multiple types of evidence is the basis for producing reliable conclusions. Reliable scientific inference requires appropriate experimental design, sampling design, and sample size; it also requires full control of the research process. For complex and time-varying problems, network or systems methods should be used instead of reductionist methods to obtain and analyze data. To change the scientific research paradigm, multiple repeated experiments and multi-sample testing should be adopted, with independent parties verifying each other to improve the authenticity and reproducibility of results.
In addition to writing, publishing and adopting new statistical monographs and textbooks, the most urgent task is to revise and distribute new versions of statistical software based on the new statistics. Until the new statistics are widely adopted, what we can do is improve data quality, apply stricter p-value thresholds in significance tests, use more reasonable analysis methods and testing standards, and combine statistical analysis with mechanistic analysis.
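The abstract's claim that "statistical significance is related to sample size" while effect size is not can be demonstrated with a small sketch using a normal-approximation z-test (the helper names and the 1.96 critical value are illustrative assumptions, not from the paper): the same standardized effect becomes "significant" simply by collecting more data.

```python
from math import sqrt, erf

def z_to_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def summarize(mean_diff, sd, n):
    """Report Cohen's d, a 95% CI for the mean difference, and the
    p-value of a one-sample z-test of that difference against zero."""
    d = mean_diff / sd                      # effect size: independent of n
    se = sd / sqrt(n)                       # standard error: shrinks with n
    ci = (mean_diff - 1.96 * se, mean_diff + 1.96 * se)
    p = z_to_p(mean_diff / se)
    return d, ci, p

# Identical effect (d = 0.25) at two sample sizes: only p changes.
for n in (20, 200):
    d, ci, p = summarize(mean_diff=0.5, sd=2.0, n=n)
    print(f"n={n}: d={d:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f}), p={p:.4f}")
```

At n = 20 the p-value exceeds 0.05 and the confidence interval crosses zero; at n = 200 the same effect is comfortably "significant." This is why the reviewed literature urges reporting effect sizes and intervals alongside, or instead of, bare p-values.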
F1000 - Post-publication peer review of the biomedical literature, 2016
Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail that seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so, and yet these misinterpretations dominate much of the scientific literature. Editor's note: This article has been published online as supplementary material with the article by Wasserstein RL and Lazar NA, "The ASA's statement on p-values: context, process and purpose," The American Statistician, 2016.
What’s in a p? Reassessing best practices for conducting and reporting hypothesis-testing research
Journal of International Business Studies
Social science research has recently been subject to considerable criticism regarding the validity and power of empirical tests published in leading journals, and business scholarship is no exception. Transparency and replicability of empirical findings are essential to build a cumulative body of scholarly knowledge. Yet current practices are under increased scrutiny to achieve these objectives. JIBS is therefore discussing and revising its editorial practices to enhance the validity of empirical research. In this editorial, we reflect on best practices with respect to conducting, reporting, and discussing the results of quantitative hypothesis-testing research, and we develop guidelines for authors to enhance the rigor of their empirical work. This will not only help readers to assess empirical evidence comprehensively, but also enable subsequent research to build a cumulative body of empirical knowledge.