



LETTER TO EDITOR 

Year : 2022  Volume : 5  Issue : 1  Page : 147-148

P values need to be correctly understood and read along with 95% confidence intervals
Chittaranjan Andrade
Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India
Date of Submission: 27-Jan-2022
Date of Decision: 28-Jan-2022
Date of Acceptance: 01-Feb-2022
Date of Web Publication: 31-Mar-2022
Correspondence Address: Chittaranjan Andrade, Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru - 560 029, Karnataka, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/crst.crst_54_22
How to cite this article: Andrade C. P values need to be correctly understood and read along with 95% confidence intervals. Cancer Res Stat Treat 2022;5:147-8.
In the previous issue of the journal, Darling provided a scholarly discussion on the P value in medical research.^{[1]} The article examined the origins of the concept of P as a measure of statistical significance, P in the context of difficulties in the replication of research findings, challenges to the use of P in research, interpretation of the strength of evidence based on P, and statistical alternatives to P. In this context, I believe that readers should also understand how to correctly interpret P and how to view P in the context of a 95% confidence interval (CI).
Most students, if asked, will explain P < 0.05 as, “We are 95% certain that the finding is true,” or “The probability is 5% or less that the finding is due to chance.” These explanations are incorrect. There are other misconceptions, too, about the P value.^{[2]} The correct explanation is that when the null hypothesis is true, P < 0.05 means that if the study is repeated in an identical fashion a large number of times, a result as extreme as, or more extreme than, that obtained will be observed on fewer than 5% of occasions.
Here is the explanation in simple English. In a hypothetical study, we find that a drug outperforms placebo by, say, 3 points on a visual analog pain scale. The t-test for drug versus placebo yields a P value of 0.01. This means that if the drug actually has no analgesic action (the null hypothesis is true), and if we repeat the study a large number of times, then the drug will outperform placebo by 3 or more points on only 1% of occasions. Stated more simply, if the drug is ineffective, the probability that we will observe an “efficacy” of 3 or more points is only 1%. In other words, our finding is “rare”.
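The repeated-study reading of the P value can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the group size (20 per arm), normally distributed pain scores, and a standard deviation chosen so that a 3-point difference corresponds to a two-sided P of 0.01 are all hypothetical and do not come from any real study. It counts how often a study in which the drug truly does nothing shows a difference of 3 or more points in either direction (the two-sided sense of "as or more extreme").

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_null_studies(n_per_group, sd, observed_diff, n_studies, seed=0):
    """Return the fraction of simulated null studies whose drug-minus-placebo
    difference in means is at least as extreme as observed_diff."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_studies):
        # Under the null hypothesis, both arms are drawn from the same distribution.
        drug = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        placebo = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = fmean(drug) - fmean(placebo)
        if abs(diff) >= observed_diff:
            extreme += 1
    return extreme / n_studies


# Hypothetical design values (assumptions, not taken from the letter).
N_PER_GROUP = 20
OBSERVED_DIFF = 3.0
TARGET_P = 0.01

# Choose the score SD so that a 3-point difference has a two-sided P of 0.01
# when the difference in means is treated as normally distributed.
z = NormalDist().inv_cdf(1 - TARGET_P / 2)
SD = OBSERVED_DIFF / (z * sqrt(2 / N_PER_GROUP))

fraction = simulate_null_studies(N_PER_GROUP, SD, OBSERVED_DIFF, n_studies=20000)
print(f"Fraction of null studies with |difference| >= 3 points: {fraction:.4f}")
```

Under these assumptions the printed fraction should come out close to 0.01; that is, only about 1% of studies of an ineffective drug would show a difference this large.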
If the finding is rare or unlikely, and yet we obtained it the very first time that we conducted the study, the finding may not really be rare, after all. We concluded that the finding was rare only because we assumed that the null hypothesis was true. Hence, the null hypothesis may not be true. That is, we reject the null hypothesis and conclude that the drug is effective as an analgesic, as measured on the visual analog pain scale.
Most students will need to read this explanation more than once and return to it later to reinforce the learning. However, this needs to be done for a correct understanding of the meaning of the P value. As an additional note, 5% is set as the usual threshold for statistical significance only by convention; other values may also be set, depending on the context.^{[3]}
The P value, when dichotomized into “statistically significant” (P < 0.05) versus “statistically nonsignificant” (P ≥ 0.05), is useful when policy decisions about the study findings need to be taken; otherwise, it is illogical that a value of P = 0.049 is deemed to be significant, whereas the almost identical value of P = 0.051 is deemed nonsignificant. In this context, viewing study findings as a 95% CI is far more useful than interpreting study findings based on a P value.
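How close P = 0.049 and P = 0.051 really are can be seen by converting each to the test statistic that would produce it. The following sketch assumes a two-sided test with a normally distributed test statistic; this is an assumption made here for simplicity, and the helper name `z_for_two_sided_p` is hypothetical.

```python
from statistics import NormalDist


def z_for_two_sided_p(p):
    """Absolute z statistic that yields the given two-sided P value."""
    return NormalDist().inv_cdf(1 - p / 2)


z_significant = z_for_two_sided_p(0.049)
z_nonsignificant = z_for_two_sided_p(0.051)

print(f"z for P = 0.049: {z_significant:.3f}")
print(f"z for P = 0.051: {z_nonsignificant:.3f}")
print(f"difference:      {z_significant - z_nonsignificant:.3f}")
```

The two statistics differ only in the second decimal place, yet the dichotomy places them on opposite sides of the significance threshold.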
Consider the situation where “Drug outperforms placebo by 2.0 points (P = 0.051).” We would, by convention, consider the drug ineffective because P was > 0.05. However, if we restated our findings as “Drug outperforms placebo by 2.0 (95% CI, −0.1 to 4.1) points; P = 0.051,” our understanding is:
 In our sample, drug outperformed placebo by 2.0 points; the finding was not statistically significant.
 We are 95% confident that the population mean difference between drug and placebo lies somewhere between −0.1 and +4.1 points.
In other words, although the finding missed statistical significance, if we repeat our study many times, on most of the occasions, we will find that drug is superior to placebo by up to as much as 4.1 points, and on very, very few occasions, drug will be inferior to placebo by up to a paltry 0.1 points. This clearly puts the solid advantage of the drug in clinical perspective although the advantage did not reach the arbitrary threshold for statistical significance. Since statistical significance is an artificial construct and because the 95% CI tells us what is likely to be true in the population, use of the 95% CI is clearly advantageous. There is a great deal more that the 95% CI tells us; in fact, we can even deduce statistical significance or nonsignificance from the 95% CI.^{[4]} However, a discussion on 95% CI is beyond the scope of this letter.
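The link between a 95% CI and statistical significance can be sketched in a few lines. The example below assumes a normal approximation (symmetric interval, z-based P value), which the letter does not specify, and the function names are hypothetical. It backs out the standard error from the interval −0.1 to 4.1, recomputes the interval, and checks whether zero lies inside it. Under this approximation the rounded figures give a P value of roughly 0.06, close to but not exactly the 0.051 quoted, which is presumably an illustrative value.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ci_from_estimate(estimate, se, level=0.95):
    """Symmetric confidence interval under a normal approximation."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


def two_sided_p(estimate, se):
    """Two-sided P value for the null hypothesis of zero difference."""
    z = abs(estimate) / se
    return 2 * (1 - STD_NORMAL.cdf(z))


def excludes_zero(lower, upper):
    """A 95% CI that excludes zero corresponds to P < 0.05 (two-sided)."""
    return lower > 0 or upper < 0


# Figures from the example in the text: difference of 2.0, 95% CI -0.1 to 4.1.
estimate, lower, upper = 2.0, -0.1, 4.1

# Standard error implied by the width of the interval (normal approximation).
se = (upper - lower) / (2 * STD_NORMAL.inv_cdf(0.975))
p_value = two_sided_p(estimate, se)
recomputed = ci_from_estimate(estimate, se)

print(f"implied SE: {se:.3f}")
print(f"approximate P: {p_value:.3f}")
print(f"recomputed 95% CI: {recomputed[0]:.1f} to {recomputed[1]:.1f}")
print(f"statistically significant at 0.05: {excludes_zero(lower, upper)}")
```

Because the interval includes zero, the finding is nonsignificant at the 5% level, which is the same verdict the P value gives; the interval additionally shows the range of differences compatible with the data.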
In summary, the P value conveys information that is easy to base decisions on, but the 95% CI conveys information that researchers and readers will truly find useful. Readers are referred elsewhere for a more detailed discussion on the P value,^{[3]} and on the 95% CI and the information that can be extracted from it.^{[4]}
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Darling HS. To “P” or not to “P”, that is the question: A narrative review on P value. Cancer Res Stat Treat 2021;4:756-62.
2.  Goodman S. A dirty dozen: Twelve P value misconceptions. Semin Hematol 2008;45:135-40.
3.  Andrade C. The P value and statistical significance: Misunderstandings, explanations, challenges, and alternatives. Indian J Psychol Med 2019;41:210-5.
4.  Andrade C. A primer on confidence intervals in psychopharmacology. J Clin Psychiatry 2015;76:e228-31.
