To the Editor:—
The efficacy of a treatment (its measurable effect) and its effectiveness (its utility in routine clinical practice) cannot be addressed simultaneously in a single trial. In the former case, the trial aims to establish a causal relation between the delivery of a treatment and a measurable effect. This type of trial has been called explanatory. In the latter, the goal is to compare the impact of distinct strategies in the context of routine clinical practice, one of the strategies including the treatment to be evaluated. This kind of trial has been called pragmatic.1
The importance of clearly identifying the explanatory versus pragmatic nature of a trial goes far beyond a semantic debate. Indeed, the formulation of the question (comparison of treatments or of global healthcare strategies), the experimental approach, the statistical risks allowed, the calculation of the number of patients to be included, and the analysis of the results all differ markedly between the two types of trials. Interestingly, we are much more familiar with explanatory trials, although the principles and methods of pragmatic ones were reported more than 40 yr ago.2
In a recent issue of Anesthesiology, Myles et al.3 reported a large, multicenter clinical trial on the impact of various intraoperative inspired gas concentrations on a wide range of postoperative complications. The question raised definitely addresses the impact of two different strategies, i.e., the use of low and high inspired oxygen concentrations combined with either nitrous oxide or nitrogen, including all related changes (i.e., differences in the inspired oxygen concentrations used, differences in the inspired concentrations of volatile anesthetics, and so on). The experimental approach is also that of a pragmatic trial, as shown by the routine surgical context, the broad inclusion criteria, the randomization, and the detailed therapeutic schemes reported in the Materials and Methods. Surprisingly, however, the authors seem to have considerably minimized the consequences of the pragmatic nature of the trial. For example, this essential feature of the work is not mentioned in the title, and only the introduction contains the word pragmatic. This is still more striking when looking at the statistical risks allowed for the calculation of sample size. In the Materials and Methods, the authors explain in detail the choice of a statistical analysis adapted to an explanatory trial. In a pragmatic trial, reduction of the α risk (type I error) is inappropriate, because no preference is given to one of the two strategies if they turn out to be equivalent. The consequence is that the value for the α risk is 1, and therefore that no statistical tests are necessary! In a pragmatic trial, a choice between the two strategies must always be made. The β risk (type II error) is therefore 0. Under these conditions, the risk to be considered is the γ risk (type III error), which corresponds to the risk of erroneously concluding that one strategy is superior to the other when the reverse is true (sign error).
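As an illustration only, the γ risk can be sketched at the design stage under a simple normal approximation. This is not necessarily the calculation used in the trial or in this letter: the function name, the equal arm sizes, the common standard deviation, and every numeric value below are hypothetical assumptions chosen to show how the risk of picking the inferior strategy shrinks as the number of patients grows.

```python
from math import sqrt
from statistics import NormalDist


def sign_error_risk(true_diff: float, sd: float, n_per_arm: int) -> float:
    """Approximate probability that the observed difference between two
    arms has the wrong sign (gamma risk, type III error).

    Assumptions (hypothetical, for illustration): two arms of equal size,
    a normally distributed outcome with common standard deviation `sd`,
    and a true mean difference `true_diff` between the strategies.
    """
    # Standard error of the difference between two independent means.
    se = sd * sqrt(2.0 / n_per_arm)
    # Chance that sampling noise reverses the sign of the true difference.
    return NormalDist().cdf(-abs(true_diff) / se)


# Hypothetical values: a 0.5-day true difference in hospital stay, SD of 5 days.
for n in (50, 200, 1000):
    print(n, round(sign_error_risk(true_diff=0.5, sd=5.0, n_per_arm=n), 3))
```

With no true difference the function returns 0.5, which matches the pragmatic view that either choice is then equally acceptable; the risk falls toward 0 as the true difference or the sample size increases.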
The probability of a sign error can also be quantified on the basis of the results, especially if the observed differences are small. For example, stating that the duration of hospital stay (the so-called privilege criterion) is “significantly different” between the two strategies is questionable from a pragmatic standpoint. Conversely, the fact that the duration of hospital stay is longer in the 70% nitrous oxide–30% oxygen arm is enough to support the choice of the 80% oxygen–20% nitrogen strategy for this criterion. However, because the magnitude of the difference is small, the probability of a sign error is close to 0.25 in this case. The same reasoning applied to the secondary criteria (postoperative nausea and vomiting, wound infection, respiratory complications, and so on) leads, however, to much stronger results with a minimal risk of sign errors. Finally, the impact of a pragmatic trial is theoretically limited to the context of the recruiting centers, for which cointerventions are comparable and should be explicitly and exhaustively reported. Although the large number of participating centers mitigates this limitation in the current case, the conclusions reported may not be applicable to centers outside the recruiting hospitals. Only the convergence of additional pragmatic trials performed in contexts different from that of the current trial may provide a rationale for deciding whether intraoperative high nitrous oxide inspired concentrations are to be recommended.
*Beaujon University Hospital, Assistance Publique des Hôpitaux de Paris, Paris 7 University, Clichy, France. email@example.com