LETTER TO EDITOR
Year: 2022 | Volume | Issue: 1 | Page: 147-148
P values need to be correctly understood and read along with 95% confidence intervals
Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India
Date of Submission: 27-Jan-2022
Date of Decision: 28-Jan-2022
Date of Acceptance: 01-Feb-2022
Date of Web Publication: 31-Mar-2022
Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru - 560 029, Karnataka
Source of Support: None, Conflict of Interest: None
How to cite this article:
Andrade C. P values need to be correctly understood and read along with 95% confidence intervals. Cancer Res Stat Treat 2022;5:147-8
In the previous issue of the journal, Darling provided a scholarly discussion on the P value in medical research. The article examined the origins of the concept of P as a measure of statistical significance, P in the context of difficulties in the replication of research findings, challenges to the use of P in research, interpretation of the strength of evidence based on P, and statistical alternatives to P. In this context, I believe that readers should also understand how to correctly interpret P and how to view P in the context of a 95% confidence interval (CI).
Most students, if asked, will explain P < 0.05 as, “We are 95% certain that the finding is true,” or “The probability is 5% or less that the finding is due to chance.” These explanations are incorrect. There are other misconceptions, too, about the P value. The correct explanation is that P < 0.05 means that, if the null hypothesis is true and the study is repeated in an identical fashion a large number of times, a result as extreme as or more extreme than that obtained will be observed on fewer than 5% of occasions.
Here is the explanation in simple English. In a hypothetical study, we find that a drug outperforms placebo by, say, 3 points on a visual analog pain scale. The t-test for drug versus placebo yields a P value of 0.01. This means that if the drug actually has no analgesic action (the null hypothesis is true), and if we repeat the study a large number of times, then the drug will outperform placebo by 3 or more points on only 1% of occasions. Stated more simply, if the drug is ineffective, the probability that we will observe an “efficacy” of 3 or more points is only 1%. In other words, our finding is “rare”.
If the finding is rare or unlikely, and we obtained it the very first time that we conducted the study, the finding may not really be rare, after all. We concluded that the finding was rare only because we assumed that the null hypothesis was true. Hence, the null hypothesis may not be true. That is, we infer that the drug is effective as an analgesic, as measured on the visual analog pain scale.
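The repeated-study reasoning above can be sketched as a small simulation. This is only an illustration under stated assumptions: the sample size per arm (50), the standard deviation of pain scores (5.8), and the number of simulated studies are hypothetical values chosen so that a 3-point difference corresponds to a P value of roughly 0.01; none of them come from the letter. Differences are counted as extreme in either direction, as a two-sided t-test would do.

```python
import random
import statistics


def simulate_null_studies(n_per_arm=50, sd=5.8, observed_diff=3.0,
                          n_studies=4000, seed=0):
    """Proportion of simulated studies, run with the null hypothesis true,
    whose drug-placebo difference is as or more extreme than observed_diff.

    All default values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_studies):
        # Under the null hypothesis both arms share the same true mean (0 here).
        drug = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        placebo = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
        diff = statistics.fmean(drug) - statistics.fmean(placebo)
        # Count differences at least as large as the observed one, in either direction.
        if abs(diff) >= observed_diff:
            extreme += 1
    return extreme / n_studies


if __name__ == "__main__":
    print(simulate_null_studies())
```

Under these assumptions the returned proportion should be close to 0.01: when the drug truly does nothing, a difference of 3 or more points turns up in only about 1 in 100 repetitions of the study.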
Most students will need to read this explanation more than once and return to it later to reinforce the learning. However, this needs to be done for a correct understanding of the meaning of the P value. As an additional note, 5% is set as the usual threshold for statistical significance only by convention; other values may also be set, depending on the context.
The P value, when dichotomized into “statistically significant” (P < 0.05) versus “statistically non-significant” (P ≥ 0.05), is useful when policy decisions about the study findings need to be taken; otherwise, it is illogical that a value of P = 0.049 is deemed to be significant, whereas the almost identical value of P = 0.051 is deemed non-significant. In this context, viewing study findings as a 95% CI is far more useful than interpreting study findings based on a P value.
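To see how little separates P = 0.049 from P = 0.051, one can convert each two-sided P value into the test statistic that would produce it. The sketch below assumes a normal (z) approximation rather than the t distribution, and the helper name `z_for_two_sided_p` is introduced here only for illustration.

```python
from statistics import NormalDist


def z_for_two_sided_p(p):
    """z statistic whose two-sided P value equals p (normal approximation)."""
    return NormalDist().inv_cdf(1 - p / 2)


if __name__ == "__main__":
    for p in (0.049, 0.051):
        print(p, round(z_for_two_sided_p(p), 3))
```

The two z values differ by less than 0.02, yet they fall on opposite sides of the conventional cutoff.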
Consider the situation where “Drug outperforms placebo by 2.0 points (P = 0.051).” We would, by convention, consider the drug ineffective because P was > 0.05. However, if we restated our findings as “Drug outperforms placebo by 2.0 (95% CI, -0.1 to 4.1) points; P = 0.051,” our understanding is:
- In our sample, drug outperformed placebo by 2.0 points; the finding was not statistically significant.
- We are 95% confident that the population mean difference between drug and placebo lies somewhere between −0.1 and +4.1 points.
In other words, although the finding missed statistical significance, the values of the true drug-placebo difference that are reasonably compatible with our data range from drug being superior to placebo by as much as 4.1 points to drug being inferior to placebo by a paltry 0.1 points. This puts the likely advantage of the drug in clinical perspective even though the advantage did not reach the arbitrary threshold for statistical significance. Since statistical significance is an artificial construct and because the 95% CI tells us what is likely to be true in the population, use of the 95% CI is clearly advantageous. There is a great deal more that the 95% CI tells us; in fact, we can even deduce statistical significance or non-significance from the 95% CI (here, the interval includes 0, the value for no difference, so the finding is not significant at the 5% level). However, a detailed discussion on the 95% CI is beyond the scope of this letter.
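As a rough illustration of how significance can be read off a 95% CI, the sketch below recovers an approximate standard error and two-sided P value from the estimate and its interval. It assumes a normal approximation and a symmetric interval, and the function name `summarize_from_ci` is hypothetical. Because the numbers in the letter are rounded and illustrative, the recovered P value (approximately 0.06) lands near, but not exactly on, the quoted P = 0.051.

```python
from statistics import NormalDist


def summarize_from_ci(estimate, lower, upper, level=0.95):
    """Approximate standard error, two-sided P value, and significance flag
    from a point estimate and its confidence interval.

    Assumes a symmetric interval built with a normal approximation.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (upper - lower) / (2 * z_crit)             # half-width divided by z_crit
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # The finding is significant at the matching level only if the CI excludes 0.
    significant = not (lower <= 0 <= upper)
    return se, p, significant


if __name__ == "__main__":
    se, p, significant = summarize_from_ci(2.0, -0.1, 4.1)
    print(f"SE = {se:.2f}, P = {p:.3f}, significant = {significant}")
```

The interval from -0.1 to 4.1 contains 0, so the function reports the finding as non-significant, in agreement with a P value above 0.05.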
In summary, the P value conveys information that is easy to base decisions on, but the 95% CI conveys information that researchers and readers will truly find useful. Readers are referred elsewhere for a more detailed discussion on the P value, and on the 95% CI and the information that can be extracted from it.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Darling HS. To “P” or not to “P”, that is the question: A narrative review on P value. Cancer Res Stat Treat 2021;4:756-62.
Goodman S. A dirty dozen: Twelve P value misconceptions. Semin Hematol 2008;45:135-40.
Andrade C. The P value and statistical significance: Misunderstandings, explanations, challenges, and alternatives. Indian J Psychol Med 2019;41:210-5.
Andrade C. A primer on confidence intervals in psychopharmacology. J Clin Psychiatry 2015;76:e228-31.