Yet questions about the P value have never stopped. Why must the cutoff be 0.05? If P is 0.06, does the result really lack statistical significance? Is 0.05 too lenient a threshold? The most prominent recent developments are the American Statistical Association's 2016 statement on P values and statistical significance, and the 2019 special issue of The American Statistician, "Moving to a world beyond 'p < 0.05'", which urged researchers to move beyond reliance on P values. In addition, Daniel Benjamin and colleagues have published an argument for lowering the threshold to P < 0.005, while Rothman KJ, the prominent contemporary epidemiologist and author of Modern Epidemiology, recommends replacing P values with confidence intervals.
“The new guidelines discuss many aspects of the reporting of studies in the Journal, including a requirement to replace P values with estimates of effects or association and 95% confidence intervals when neither the protocol nor the statistical analysis plan has specified methods used to adjust for multiplicity”
The Methods section of all manuscripts should contain a brief description of sample size and power considerations for the study, as well as a brief description of the methods for primary and secondary analyses.
The Methods section of all manuscripts should include a description of how missing data have been handled. Unless missingness is rare, a complete case analysis is generally not acceptable as the primary analysis and should be replaced by methods that are appropriate, given the missingness mechanism. Multiple imputation or inverse probability case weights can be used when data are missing at random; model-based methods may be more appropriate when missingness may be informative. For the Journal’s general approach to the handling of missing data in clinical trials please see Ware et al (N Engl J Med 2012;367:1353–1354).
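The pooling step behind multiple imputation can be sketched in a few lines. This is a toy illustration only, assuming data missing at random; the function name and data are hypothetical, and a real analysis would use a dedicated implementation such as MICE in R or scikit-learn's `IterativeImputer`.

```python
import random
import statistics

def pooled_mean_estimate(values, m=20, seed=0):
    # Hot-deck multiple imputation for a univariate mean (toy sketch,
    # not a production implementation; assumes data missing at random).
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n = len(values)
    estimates, variances = [], []
    for _ in range(m):
        # Fill each missing entry with a random draw from the observed values.
        completed = [v if v is not None else rng.choice(observed) for v in values]
        estimates.append(statistics.mean(completed))
        variances.append(statistics.variance(completed) / n)  # variance of the mean
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)         # within-imputation variance
    between = statistics.variance(estimates)    # between-imputation variance
    total_var = within + (1 + 1 / m) * between  # Rubin's rules
    return pooled, total_var

data = [4.2, 5.1, None, 3.8, 6.0, None, 4.9, 5.5]  # hypothetical measurements
est, var = pooled_mean_estimate(data)
```

The key point the sketch makes concrete is that the pooled variance includes a between-imputation component, so uncertainty from the missing values is not understated.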
Significance tests should be accompanied by confidence intervals for estimated effect sizes, measures of association, or other parameters of interest. The confidence intervals should be adjusted to match any adjustment made to significance levels in the corresponding test.
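As one concrete case, if a Bonferroni adjustment splits alpha across m comparisons, the matching interval is computed at level 1 - alpha/m. A minimal sketch using a normal approximation (the function name is hypothetical):

```python
from statistics import NormalDist

def matched_ci(estimate, se, alpha=0.05, m=1):
    # Confidence interval at level 1 - alpha/m, so the interval matches
    # a Bonferroni-adjusted significance test (normal approximation).
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))
    return estimate - z * se, estimate + z * se

lo1, hi1 = matched_ci(1.0, 0.5)       # unadjusted 95% CI
lo5, hi5 = matched_ci(1.0, 0.5, m=5)  # adjusted for 5 comparisons: wider
```

The adjusted interval is wider, which is exactly the correspondence the guideline asks for: a test at level alpha/m pairs with an interval at confidence level 1 - alpha/m.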
Unless one-sided tests are required by study design, such as in noninferiority clinical trials, all reported P values should be two-sided. In general, P values larger than 0.01 should be reported to two decimal places, and those between 0.01 and 0.001 to three decimal places; P values smaller than 0.001 should be reported as P<0.001. Notable exceptions to this policy include P values arising from tests associated with stopping rules in clinical trials or from genome-wide association studies.
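The rounding rule above can be expressed as a small helper (the function name is hypothetical):

```python
def format_p(p):
    # Formats a two-sided P value following the reporting rule above.
    if p < 0.001:
        return "P<0.001"
    if p < 0.01:
        return f"P={p:.3f}"  # between 0.001 and 0.01: three decimal places
    return f"P={p:.2f}"      # larger than 0.01: two decimal places

print(format_p(0.0004), format_p(0.0042), format_p(0.034))
```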
Results should be presented with no more precision than is of scientific value and is meaningful given the available sample size. For example, measures of association, such as odds ratios, should ordinarily be reported to two significant digits. Results derived from models should be limited to the appropriate number of significant digits.
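Rounding to a fixed number of significant digits (as opposed to decimal places) can be done with a small helper, shown here as a hypothetical sketch:

```python
import math

def round_sig(x, sig=2):
    # Round x to the given number of significant digits.
    if x == 0:
        return 0.0
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))

# Two significant digits regardless of magnitude:
print(round_sig(1.2345), round_sig(0.04567), round_sig(17.8))
```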
Original and final protocols and statistical analysis plans (SAPs) should be submitted along with the manuscript, as well as a table of amendments made to the protocol and SAP indicating the date of the change and its content.
The analyses of the primary outcome in manuscripts reporting results of clinical trials should match the analyses prespecified in the original protocol, except in unusual circumstances. Analyses that do not conform to the protocol should be justified in the Methods section of the manuscript. The editors may ask for additional analyses that are not specified in the protocol.
When comparing outcomes in two or more groups in confirmatory analyses, investigators should use the testing procedures specified in the protocol and SAP to control overall type I error — for example, Bonferroni adjustments or prespecified hierarchical procedures. P values adjusted for multiplicity should be reported when appropriate and labeled as such in the manuscript. In hierarchical testing procedures, P values should be reported only until the last comparison for which the P value was statistically significant. P values for the first nonsignificant comparison and for all comparisons thereafter should not be reported. For prespecified exploratory analyses, investigators should use methods for controlling false discovery rate described in the SAP — for example, Benjamini–Hochberg procedures.
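For illustration, here is a minimal sketch of the two approaches named above: the Benjamini–Hochberg step-up procedure for false-discovery-rate control, alongside a plain Bonferroni comparison (function and variable names, and the P values, are hypothetical):

```python
def benjamini_hochberg(pvalues, q=0.05):
    # Step-up BH procedure: returns indices of hypotheses rejected
    # while controlling the false discovery rate at level q.
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # largest rank whose P value clears its threshold
    return sorted(order[:k])

pvals = [0.010, 0.020, 0.030, 0.040, 0.200]
bh_rejected = benjamini_hochberg(pvals, q=0.05)
# Bonferroni controls family-wise error by testing each P against q/m:
bonferroni_rejected = [i for i, p in enumerate(pvals) if p <= 0.05 / len(pvals)]
```

On these toy P values BH rejects four hypotheses while Bonferroni rejects only one, which is why BH is the usual choice for exploratory analyses where FDR control suffices.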
When no method to adjust for multiplicity of inferences or controlling false discovery rate was specified in the protocol or SAP of a clinical trial, the report of all secondary and exploratory endpoints should be limited to point estimates of treatment effects with 95% confidence intervals. In such cases, the Methods section should note that the widths of the intervals have not been adjusted for multiplicity and that the inferences drawn may not be reproducible. No P values should be reported for these analyses.
Please see Wang et al (N Engl J Med 2007;357:2189–2194) on recommended methods for analyzing subgroups. When the SAP prespecifies an analysis of certain subgroups, that analysis should conform to the method described in the SAP. If the study team believes a post hoc analysis of subgroups is important, the rationale for conducting that analysis should be stated. Post hoc analyses should be clearly labeled as post hoc in the manuscript.
Forest plots are often used to present results from an analysis of the consistency of a treatment effect across subgroups of factors of interest. Such plots can be a useful display of estimated treatment effects across subgroups, and the editors recommend that they be included for important subgroups. If subgroups are small, however, formal inferences about the homogeneity of treatment effects may not be feasible. A list of P values for treatment by subgroup interactions is subject to the problems of multiplicity and has limited value for inference. Therefore, in most cases, no P values for interaction should be provided in the forest plots.
If significance tests of safety outcomes (when not primary outcomes) are reported along with the treatment-specific estimates, no adjustment for multiplicity is necessary. Because information contained in the safety endpoints may signal problems within specific organ classes, the editors believe that type I error rates larger than 0.05 are acceptable. Editors may request that P values be reported for comparisons of the frequency of adverse events among treatment groups, regardless of whether such comparisons were prespecified in the SAP.
When possible, the editors prefer that absolute event counts or rates be reported before relative risks or hazard ratios. The goal is to provide the reader with both the actual event frequency and the relative frequency. Odds ratios should be avoided, as they may overestimate the relative risks in many settings and be misinterpreted.
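A hypothetical numeric example makes the point concrete: when the outcome is common, the odds ratio can be far larger than the risk ratio computed from the same table, and a reader who interprets it as a relative risk will overstate the effect.

```python
# Hypothetical 2x2 table with a common outcome (event rates 60% vs 30%):
#                 events  non-events
a, b = 60, 40   # exposed group,   n = 100
c, d = 30, 70   # unexposed group, n = 100

risk_ratio = (a / (a + b)) / (c / (c + d))  # 0.60 / 0.30 = 2.0
odds_ratio = (a * d) / (b * c)              # (60*70) / (40*30) = 3.5
```

Here the true relative risk is 2.0, but the odds ratio is 3.5; the two converge only when the outcome is rare.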
Authors should provide a flow diagram in CONSORT format. The editors also encourage authors to submit all the relevant information included in the CONSORT checklist. Although all of this information may not be published with the manuscript, it should be provided in either the manuscript or a supplementary appendix at the time of submission. The CONSORT statement, checklist, and flow diagram are available on the CONSORT website.
The validity of findings from observational studies depends on several important assumptions, including those relating to sample selection, measured and unmeasured confounding, and the adequacy of methods used to control for confounding. The Methods section of observational studies should describe how these and other relevant issues were managed in the design and analysis.
If an observational study included a prespecified SAP with a description of hypotheses to be tested, a signed and dated version of that plan should be included with the manuscript submission. The Journal encourages authors to deposit SAPs for observational studies in one of the online repositories designed for this purpose.
When appropriate, observational studies should use prespecified accepted methods for controlling the family-wise error rate or false discovery rate when multiple tests are conducted. In manuscripts reporting observational studies without a prespecified method for error control, summary statistics should be limited to point estimates and 95% confidence intervals. In such cases, the Methods section should note that the widths of the intervals have not been adjusted for multiplicity and that the inferences drawn may not be reproducible. No P values should be reported for these analyses.
If no prespecified analysis plan exists, the Methods section should provide an outline for the planned method of analysis, including
- Eligibility criteria for the selection of cases and the method of sampling from the data, with a diagram as appropriate.
- A description of the association or causal effect to be estimated and the rationale for this choice.
- The prespecified method of analysis to draw inference about treatment or exposure effect or association.
Studies reporting the effect of a treatment or exposure should show the distribution of potential confounders and other variables, stratified by exposure or intervention group. When the analysis depends on the confounders being balanced by exposure group, differences between groups should be summarized with point estimates and 95% confidence intervals when appropriate.
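For a binary outcome, such a between-group difference with a 95% confidence interval can be computed as follows. This is a sketch using a simple Wald interval; the function name and the event counts are hypothetical, and other interval methods may be preferable for small samples.

```python
from math import sqrt

def risk_difference_ci(e1, n1, e0, n0, z=1.96):
    # Point estimate and Wald 95% CI for a difference in proportions.
    p1, p0 = e1 / n1, e0 / n0
    diff = p1 - p0
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return diff, (diff - z * se, diff + z * se)

# Hypothetical counts: 30/100 events in one group, 20/100 in the other.
diff, (lo, hi) = risk_difference_ci(30, 100, 20, 100)
```

Reporting the interval alongside the point estimate shows the reader how precisely the between-group difference is estimated, rather than reducing it to a P value.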
Complex models and their diagnostics can often be best described in a supplementary appendix. Authors are encouraged to conduct an analysis that quantifies potential sensitivity to bias from unmeasured confounding; absent that, authors must provide a discussion of potential biases induced by unmeasured confounders.