To the Editor: 

I read with interest the recent article by Baker regarding the value of normalizing resident evaluation scores to eliminate individual faculty evaluator bias.1 Without unduly undermining the importance of this study, I have a concern about the statistical handling of Likert scores. Likert scores were used to create individual faculty member mean scores, faculty score standard deviations, and average resident scores when more than one core competency section was included. The central issue is that Likert scales involve ordinal data, or categories falling in a hierarchy.2 Because the numbers in a Likert scale represent verbal statements of rank order (e.g., 5 = distinctly above peer level), summarizing such ordinal data with a mean value is inappropriate by strict statistical methodology.2 Moreover, the intervals between data points on a Likert scale are not necessarily equal or even certain.3 To put this in the context of the study, consider this example from the relative performance designation used in the study: a score of “4” is “somewhat above peer level” and a score of “5” is “distinctly above peer level”; however, an average score of “4.5” cannot be said to represent “somewhat-above-peer-level-and-a-half.”4 Similarly, on the absolute/anchored competency designation, the difference between a score of “5” (performed in a fully independent manner) and a score of “6” (able to serve as a consultant to other physicians) is not necessarily equivalent to the difference between a score of “2” (needed moderate assistance) and a score of “3” (needed only minimal assistance). It is difficult to determine what, if any, limitation was imposed on the study as a result of this violation of statistical propriety.
Nevertheless, although a purist may pine for cleaner data and analysis, this distraction can be mitigated by considering what Stevens wrote in 1946: “for this ‘illegal’ statisticizing there can be invoked a kind of pragmatic sanction: In numerous instances it leads to fruitful results.”5 

I look forward to future contributions from Baker. When I was a fellow, his efforts sparked my interest in resident education, and they continue to do so now.

1. Baker K: Determining resident clinical performance: Getting beyond the noise. ANESTHESIOLOGY 2011; 115:862–78
2. Triola MF: Elementary statistics, 5th ed. Reading, MA: Addison-Wesley Publishing Company, Inc., 1992, pp 16–8
3. Jamieson S: Likert scales: How to (ab)use them. Med Educ 2004; 38:1217–8
4. Kuzon WM Jr, Urbanchek MG, McCabe S: The seven deadly sins of statistical analysis. Ann Plast Surg 1996; 37:265–72
5. Stevens SS: On the theory of scales of measurement. Science 1946; 103:677–80