Selecting health measurement scales: basic issues for considerations

Manit Srisurapanont

Authors

Manit Srisurapanont Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Thailand

Keywords:

measurement, assessment, reliability, validity

Abstract

People's health needs are growing, so medical professionals need to measure and evaluate health carefully. Health measurement scales, which assess abstract health outcomes such as beliefs, feelings, attitudes, and behavior, are increasingly used in health assessment. Health professionals therefore need competency in selecting such scales, including judging the feasibility and cost-effectiveness of assessment. With this competency, they can convert abstract outcomes into scores that can be further computed and communicated. This article describes key issues to consider when applying health measurement scales in clinical practice or research. Scale users should first consider the scale overview and then apply classical test theory to determine the scale's reliability and validity. Knowledge of item response theory can be used alongside classical test theory and helps determine item properties. After taking all of these properties into account, scale users will be able to choose the scales appropriate for their work.
