The subject of pain in children and infants has received considerable attention during the past decade. Pain in neonates and children has historically been underreported, undertreated, and frequently misunderstood. Research comparing analgesic usage between adults and children began to emerge in the 1970s and consistently revealed that children received fewer, less frequent, and smaller doses of potent analgesics.1–3 Recent investigations show limited improvement in prevailing pain management practices in children4 despite efforts to change the views and practice of clinicians. Wide variations still exist in practice philosophies among pediatric centers.5 Although multiple methods have been described to measure and assess pain in children, most are not well validated or applicable to all age groups, and none has been universally accepted. Young children and children with cognitive disabilities are especially difficult to evaluate for pain because of their limited understanding and communication skills. Despite these difficulties, measurement of pain in children is of major importance for substantiating a therapeutic decision and evaluating the effectiveness of a particular intervention.

In this issue, Breau et al.6 report the development and validation of the Non-communicating Children's Pain Checklist–Postoperative Version (NCCPC-PV). This publication is of particular importance because anesthesiologists have had significant difficulties in assessing postoperative pain in children with major cognitive disabilities. The perception of pain includes a sensory component, involving neural pathway activation in response to noxious stimuli, and an affective response, which involves behavioral and cognitive aspects. The International Association for the Study of Pain states that “Pain is an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage.”7 Pain is what the subject says hurts.8 But how is pain described and measured in the preverbal or noncommunicating disabled child?

Pain is clearly subjective and is no longer considered an experience whose intensity is simply proportional to the objective degree of injury. The pain experience cannot be separated into its physical and emotional components. The complex response to a noxious stimulus results in behavior that is unique and dependent on multiple factors, including one's experiences and perceptions. Pain experience in children is further complicated by the dynamic evolution of conscious awareness and of psychologic and physiologic development. Therefore, pain assessment in children depends in large part on their level of understanding as well as their ability to convey to others the magnitude of their experience. This ability to convey the existence of pain may be significantly limited in the disabled child.

In general, pain assessment instruments in children can be categorized as observational, self-report, and physiologic instruments. Because of the subjective nature of pain, self-report methods are considered the best measure of pain in children who are at least 5 or 6 yr old. These methods are less reliable in younger children and children with cognitive disabilities because they rely heavily on visual analogs, sensory associations, and verbal responses. For example, Hester's Poker Chip Tool was validated for children as young as 4 yr of age. However, both sensory and motor responses are required in selecting the “pieces of hurt.”9 Therefore, the application of self-report scales is limited to children who can understand the objectives and descriptors of these techniques. Several physiologic parameters have been used to assess pain in children. These include changes in heart rate or in beat-to-beat heart rate variability, blood pressure, serum cortisol concentrations, transcutaneous oxygen tension, and palmar sweating.10,11 However, physiologic parameters can be influenced by a variety of processes, such as hypoxemia, hypovolemia, and fever, that are unrelated to pain per se. Because of these limitations of self-report and physiologic measures, preverbal and cognitively disabled children benefit the most from observational measures. The most valuable observational descriptors are behaviors such as crying, facial expression, touch behavior, leg position, and general body movements. However, observational measures have limitations of their own, such as the difficulty of separating behavior associated with pain from that caused by fear and anxiety, and the underestimation of acute postoperative pain.12

The observational pain assessment instrument described in the article by Breau et al.6 must be evaluated in the more general context of a behavioral instrument used in clinical medicine. The development of such instruments in general, as well as those used particularly to assess pain in children, begins with the identification of all descriptors relating to the phenomenon measured. The original publication that described the development of the NCCPC-PV provides extensive content coverage and descriptors of areas related to pain in nonverbal, cognitively impaired children.13

The next step involves selecting an appropriate scale of measurement. In the 1950s, S. S. Stevens14–16 made a lasting contribution to the classification of scales of measurement. Stevens14–16 was the first to conceive of a measurement system comprising the now familiar nominal, ordinal, equal interval, and equal ratio scales of measurement. The impact of his work has been great enough that the very terms he suggested appear in every major work on research design.

Stevens14–16 also suggested an association between the scale type used and the level of reliability and validity of the phenomenon being measured. He indicated that each successive scale (nominal, ordinal, equal interval, and equal ratio) progresses in complexity by incorporating the defining feature of each preceding scale and adding its own unique feature. The sequence is as follows: The simplest type of scale, a nominal scale, contains two or more unordered categories of the entity measured (e.g., presence or absence of pain). At the next level of scale complexity, there are two or more categories of classification, as in nominal scales, but the defining feature that is added is that the categories are now ordered to form an ordinal scale (e.g., slight pain, mild pain, moderate pain, severe pain). The third level of scale complexity is the equal interval scale. This scale has three or more categories of classification (the nominal feature), and they are ordered (the ordinal feature), but the defining feature is that one can also identify points on the scale that are equal in interval size. For example, on a 10-point pain scale, a pain experienced as a 7 is to be interpreted as being exactly 3 points less than one of 10. However, the scale does not allow one to conclude that a pain score of 6 refers to twice as much pain as one of 3. The most complex scale, in Stevens’ conceptualization, is the equal ratio scale of measurement. Such a scale has nominal, ordinal, and equal interval features but now incorporates the additional feature of equal ratio categories of classification. For example, on a 10-point pain scale, a score of 2 represents twice as much pain as a score of 1, and a pain score of 9 indicates three times as much pain as a score of 3. The NCCPC-PV developed by Breau et al.6 classifies pain intensity as one of the following: “not at all,” “just a little,” “fairly often,” and “very often.” Therefore, at first look, the NCCPC-PV can be classified as a dichotomous ordinal scale with an absence category followed by two or more categories of degree of presence of pain.17 However, it should be noted that the absence category is used so infrequently that the NCCPC-PV can for all practical purposes be treated as a continuous ordinal scale with no category of absence.
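The distinction between the equal interval and equal ratio properties can be made concrete with a short sketch. The scores below are hypothetical and are not taken from any study. The sketch assumes a 10-point scale whose zero point is arbitrary and shows that moving the zero leaves rank order and differences intact while changing ratios, which is why ratio statements require a true zero.

```python
# Illustrative only: hypothetical pain scores, not data from any study.

def shift_zero(scores, offset):
    """Re-express scores on the same scale with a different (arbitrary) zero point."""
    return [s + offset for s in scores]

def rank_order(scores):
    """Return the indices of the scores sorted from lowest to highest."""
    return sorted(range(len(scores)), key=scores.__getitem__)

scores = [3, 6, 7, 10]             # hypothetical ratings on a 10-point scale
shifted = shift_zero(scores, 5)    # same ratings, zero point moved by 5 units

# Ordinal property: the ordering of the ratings survives the shift.
order_kept = rank_order(scores) == rank_order(shifted)

# Equal interval property: differences between ratings survive the shift.
diff_before = scores[3] - scores[2]      # 10 - 7
diff_after = shifted[3] - shifted[2]     # 15 - 12

# Equal ratio property: ratios do NOT survive the shift, so they are
# interpretable only when the scale has a true zero.
ratio_before = scores[1] / scores[0]     # 6 / 3
ratio_after = shifted[1] / shifted[0]    # 11 / 8

print(order_kept, diff_before, diff_after, ratio_before, ratio_after)
```

In this sketch the 3-point difference is preserved, whereas the apparent "twice as much pain" ratio of 6 to 3 disappears once the zero is moved.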

Interestingly, despite the existence of multiple interval and ratio scales in clinical medicine, most medical decisions are categorical in nature. For example, total cholesterol is measured on an equal ratio scale (e.g., 300 mg/dl is twice as high as 150 mg/dl). However, the ratio itself has little medical meaning; what matters is that a level of 300 mg/dl requires treatment, whereas a level of 150 mg/dl does not. It is also important to indicate that the same clinical phenomenon can be measured on different types of scales of measurement, depending on the research question. For example, if one needs to correlate cholesterol level with age at first myocardial infarction, one would measure cholesterol on an equal ratio scale. In this case, one would lose information and artificially lower the correlation by measuring cholesterol on an ordinal scale, such as 1 = ideal, 2 = borderline, and 3 = high.
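The loss of information from coarsening a ratio scale can be illustrated with a small worked example. Everything in it is an assumption made for illustration: the eight cholesterol values and ages are invented and deliberately constructed to be perfectly linearly related, the category cutoffs of 200 and 240 mg/dl are assumed because the text does not specify any, and Pearson's r is used as one possible correlation measure.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def cholesterol_category(mg_dl):
    """Collapse a cholesterol level to 1 = ideal, 2 = borderline, 3 = high.
    The cutoffs (200 and 240 mg/dl) are assumed for this illustration."""
    if mg_dl < 200:
        return 1
    if mg_dl < 240:
        return 2
    return 3

# Hypothetical data, built so that age falls linearly as cholesterol rises.
cholesterol = [150, 170, 190, 210, 230, 250, 270, 290]   # mg/dl
age_at_mi = [75, 73, 71, 69, 67, 65, 63, 61]             # years

categories = [cholesterol_category(c) for c in cholesterol]

r_ratio = pearson_r(cholesterol, age_at_mi)    # cholesterol on its ratio scale
r_ordinal = pearson_r(categories, age_at_mi)   # cholesterol as three categories

print(round(r_ratio, 3), round(r_ordinal, 3))
```

With these invented values the correlation is -1.0 on the ratio scale and about -0.945 after the values are collapsed into three categories, so the attenuation comes entirely from the coarser scale.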

S. S. Stevens also believed that in terms of the scientific quality of the information produced, equal ratio scales were superior to all others, equal interval scales were superior to both ordinal and nominal scales of measurement, and ordinal scales were superior only to nominal scales. The direct implication here is that the degree of reliability and validity of a phenomenon increases as a function of the complexity of the scale type. The issue of reliability and validity as a function of the complexity of the scale type is currently a source of some debate. Although some scientists agree with the theory developed by Stevens,18 others strongly disagree. There is considerable research evidence that supports the view contrary to that of Stevens. For example, a study undertaken by one of the authors and his British colleagues showed that whether psychiatric diagnoses were made on nominal, ordinal, or equal ratio scales, the levels of interexaminer reliability were essentially interchangeable.19 Similarly, computer simulation research indicates that equal ratio scales are no more reliable than seven-category ordinal scales.20 Therefore, we submit that the decision regarding the type of scale used has to be based solely on the nature of the clinical phenomena assessed.

Deciding on an optimal number of categories for a continuous ordinal scale, such as the NCCPC-PV, is a complex issue that can be dated back to the early work of Symonds,21 almost eight decades ago. More recently, an extensive computer simulation investigation and an experimental investigation by Preston and Colman20,22 indicate that scale reliability tends to increase as the number of categories increases. However, when the number of categories goes beyond seven, there tends to be no material increase in reliability.20,22 The NCCPC-PV can be classified as a four-category ordinal scale. Given the authors’ statement that the absence category is virtually never applicable (at least in their pediatric sample), the NCCPC-PV can be reclassified as a three-category ordinal clinical rating scale. Therefore, one can ask whether reliability would increase if the number of NCCPC-PV categories were increased to seven. This is an intriguing question for Breau et al.6 as well as for the area of pain assessment in children.
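The relation between the number of categories and reliability can be explored with a toy simulation. The sketch below is not the design used in the cited simulation or experimental studies. It assumes a normally distributed true score, two hypothetical raters who each add independent normal error, equal-width categories between -3 and 3, and Pearson's r between the two raters as a stand-in for a reliability coefficient. All parameter values are arbitrary choices made for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def to_category(value, k, low=-3.0, high=3.0):
    """Map a continuous rating onto one of k equal-width ordered categories
    (0 .. k-1); values outside [low, high] fall into the end categories."""
    position = (value - low) / (high - low)
    return min(k - 1, max(0, int(position * k)))

def simulated_reliability(k, n_subjects=2000, noise_sd=0.5, seed=0):
    """Agreement (Pearson's r) between two simulated raters using k categories.
    n_subjects, noise_sd, and seed are arbitrary illustrative settings."""
    rng = random.Random(seed)
    rater_a, rater_b = [], []
    for _ in range(n_subjects):
        true_score = rng.gauss(0.0, 1.0)
        rater_a.append(to_category(true_score + rng.gauss(0.0, noise_sd), k))
        rater_b.append(to_category(true_score + rng.gauss(0.0, noise_sd), k))
    return pearson_r(rater_a, rater_b)

reliability_by_k = {k: simulated_reliability(k) for k in (2, 3, 4, 7, 10)}
for k, r in reliability_by_k.items():
    print(k, round(r, 3))
```

Under these assumptions, one would expect a two-category scale to yield clearly lower agreement than a seven-category scale, with little further gain beyond that point. The exact figures depend on the arbitrary settings and should not be read as estimates for the NCCPC-PV.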

Thus far, the discussion has been focused on issues related to the development of a clinical instrument. The last important issue relates to the reliability and validity of the clinical instrument that one develops. Because the appropriate model of the intraclass correlation coefficient (Ri) of chance-corrected agreement can be used with both ordinal and interval data,23 the choice of this statistic by Breau et al.,6 as recommended in Shrout and Fleiss,24 is appropriate. Also, the authors’ calculation of sensitivity and specificity indices for the NCCPC-PV is entirely appropriate. Cicchetti25 has recently published a set of criteria that can be applied to assess the clinical significance of sensitivity and specificity indices of scales of measurement (table 1). These guidelines apply whether the scale consists of nominal, ordinal, or mixed scales, such as the NCCPC-PV. Applying these criteria to the current article, a score of 11 on the NCCPC-PV showed very good sensitivity (88%) and good specificity (81%). Finally, one of the last steps in assuring that the reported sensitivity and specificity of a newly developed clinical instrument have more general validity is to apply the receiver operating characteristic methodology, which provides evidence of the optimal levels and ranges of sensitivity and specificity.26,27 The authors are to be commended for their application of receiver operating characteristic methodology in the development of the NCCPC-PV.
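The statistics named in this paragraph can be sketched in a few lines. The sketch rests on several assumptions. The intraclass correlation is computed with the two-way random-effects, single-rater form usually written ICC(2,1), because the text does not say which model was used. The cutoff search uses Youden's J (sensitivity + specificity - 1) as one common way to pick a point on a receiver operating characteristic curve, which may differ from the authors' approach. All ratings and scores are hypothetical and are not data from Breau et al.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.
    `ratings` is a list of rows, one per subject, one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    bms = ss_rows / (n - 1)                                      # between subjects
    jms = ss_cols / (k - 1)                                      # between raters
    ems = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))   # residual
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

def sensitivity_specificity(scores, in_pain, cutoff):
    """Classify score >= cutoff as 'pain' and compare with the reference labels."""
    tp = sum(1 for s, p in zip(scores, in_pain) if p and s >= cutoff)
    fn = sum(1 for s, p in zip(scores, in_pain) if p and s < cutoff)
    tn = sum(1 for s, p in zip(scores, in_pain) if not p and s < cutoff)
    fp = sum(1 for s, p in zip(scores, in_pain) if not p and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(scores, in_pain):
    """Try every observed score as a cutoff (the points of an ROC curve) and
    return the one with the largest Youden's J."""
    def youden(cutoff):
        sens, spec = sensitivity_specificity(scores, in_pain, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden)

# Hypothetical ratings: 6 subjects (rows) scored by 4 raters (columns).
ratings = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
icc = icc_2_1(ratings)

# Hypothetical checklist totals for children judged in pain or not in pain.
pain_scores = [14, 12, 18, 8, 11, 20, 15, 13]
no_pain_scores = [4, 7, 10, 12, 3, 8, 6, 5, 11, 2]
scores = pain_scores + no_pain_scores
in_pain = [True] * len(pain_scores) + [False] * len(no_pain_scores)

sens, spec = sensitivity_specificity(scores, in_pain, cutoff=11)
optimal = best_cutoff(scores, in_pain)
print(round(icc, 2), sens, spec, optimal)
```

For these invented numbers the intraclass correlation is about 0.29, and a cutoff of 11 gives a sensitivity of 0.875 and a specificity of 0.8 and is also the cutoff that maximizes Youden's J.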

In closing, it should be noted that there are several lessons to be learned from the article of Breau et al.6 First, it is always incumbent on investigators who develop new clinical behavioral instruments to specify the characteristics of the scales of measurement they use, as well as the rationale for the statistics that are used in the assessment of the psychometric properties of their instrument. Second, despite this age of increasing technology and specialization, there is a simultaneous and oppositely motivated increased need for cross-disciplinary collaboration to develop state-of-the-art contributions in pain assessment. Sophistication in assessment design is inherent, but this fact is brought to the forefront when one attempts to develop a valid and reliable pain tool for young and disabled children. Finally, a recent study by Warfield and Kahn28 estimates that more than half of all adult surgical patients experience moderate to severe postoperative pain. It has been stated that accuracy in self-report can improve pain management practices. In disabled children, this option for self-lobbying is almost nonexistent. The American Pain Society in conjunction with the American Academy of Pediatrics states, “Observation of behavior should be used to complement self-report and can be an acceptable alternative when valid self-report is not available.”29 Therefore, it is imperative that accurate interpretations of behavior are achieved and assumptions are minimized when validating pain in children. There must be a closing of the gap between clinical assessment and scientific measurement, especially if pain and suffering are to be substantially reduced in the disabled child.

The authors thank Charles Berde, M.D., Ph.D. (Children's Hospital, Boston, Massachusetts), for his helpful comments.

1. Mather L, Mackie J: The incidence of postoperative pain in children. Pain 1983; 15: 271–82
2. Eland JM, Anderson JE: The experience of pain in children, Pain: A Sourcebook for Nurses and Other Health Professionals. Edited by Jacox AK. Boston, Little Brown, 1977, pp 453–73
3. Beyer JE, DeGood DE, Ashley LC, Russell GA: Patterns of postoperative analgesic use with adults and children following cardiac surgery. Pain 1983; 17: 71–81
4. Gauthier JC, Finley GA, McGrath PJ: Children's self-report of postoperative pain intensity and treatment threshold: Determining the adequacy of medication. Clin J Pain 1998; 14: 116–20
5. McGrath PJ, Unruh AM: Perioperative and postoperative pain, Pain in Children and Adolescents. Amsterdam, Elsevier Science Publishers, 1987, pp 103–31
6. Breau LM, Finley GA, McGrath PJ, Camfield CS: Validation of the Non-communicating Children's Pain Checklist–Postoperative Version. Anesthesiology 2002; 96: 528–35
7. International Association for the Study of Pain, Subcommittee on Taxonomy: Pain terms: A list with definitions and notes on usage. Pain 1979; 6: 249–52
8. Parkhouse J, Pleurvy BJ, Rees JM: Analgesic Drugs. Oxford, Blackwell Scientific Publications, 1979, p 15
9. Hester NK: The pre-operational child's reaction to immunization. Nurs Res 1979; 28: 250–4
10. Anand KJS, Carr DB: The neuroanatomy, neurophysiology, and neurochemistry of pain, stress, and analgesia in newborns and children. Pediatr Clin North Am 1989; 36: 795–822
11. Harpin VA, Rutter N: Development of emotional sweating in the newborn infant. Arch Dis Child 1982; 57: 691–5
12. Lamontagne LL, Hepworth JT, Salisbury MH: Anxiety and postoperative pain in children who undergo major orthopedic surgery. Appl Nurs Res 2001; 14: 119–24
13. McGrath PJ, Rosmus C, Canfield C, Campbell MA, Hennigar A: Behaviours caregivers use to determine pain in non-verbal, cognitively impaired individuals. Dev Med Child Neurol 1998; 40: 340–3
14. Stevens SS: On the theory of scales of measurement. Science 1946; 103: 677–80
15. Stevens SS: Scales of measurement, Handbook of Experimental Psychology. Edited by Stevens SS. New York, Wiley, 1951, pp 23–30
16. Stevens SS: Measurement, statistics, and the schemapiric view. Science 1968; 168: 849–56
17. Cicchetti DV: Assessing inter-rater reliability for rating scales: Resolving some basic issues. Br J Psychol 1976; 129: 452–6
18. Gesheider GA, Bolanowski SJ: Introduction to conference on ratio scaling of psychological magnitudes, Ratio Scaling of Psychological Magnitude. Edited by Bolanowski SJ, Gesheider GA. Hillsdale, New Jersey, Lawrence Erlbaum, 1991, pp 1–7
19. Remington M, Tyrer PJ, Newson-Smith J, Cicchetti DV: Comparative reliability of categorical and analogue rating scales in the assessment of psychiatric symptomatology. Psychol Med 1979; 9: 765–70
20. Cicchetti DV, Showalter D, Tyrer P: The effect of number of rating scale categories upon levels of interrater reliability: A Monte Carlo investigation. Appl Psychol Meas 1985; 9: 31–6
21. Symonds PM: On the loss of reliability in ratings due to coarseness of the scale. J Exp Psychol 1924; 7: 456–61
22. Preston CC, Colman AM: Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica 2000; 104: 1–15
23. Fleiss JL: Statistical methods for rates and proportions, 2nd edition. New York, Wiley, 1981, pp 212–36
24. Shrout PE, Fleiss JL: Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 1979; 86: 420–8
25. Cicchetti DV: The precision of reliability and validity estimates re-visited: Distinguishing between clinical and statistical significance of sample size requirements. J Clin Exp Neuropsychol 2001; 23: 695–700
26. Hsiao JK, Bartko JJ, Potter WZ: Diagnosing diagnoses: Receiver operating characteristic methods and psychiatry. Arch Gen Psychiat 1989; 46: 664–7
27. Kraemer HC: Assessment of 2 × 2 associations: Generalizations of signal detection methodology. Am Stat 1988; 42: 47–9
28. Warfield CA, Kahn CH: Acute pain management: Progress in U.S. hospitals and experiences and attitudes among U.S. adults. Anesthesiology 1995; 83: 1090–4
29. The Assessment and Management of Acute Pain in Infants, Children and Adolescents: A Position Statement from the American Academy of Pediatrics Committee on Psychosocial Aspects of Child and Family Health and American Pain Society Task Force on Pain in Infants, Children, and Adolescents. Glenview, Illinois, American Pain Society, 2001