Setting the framework: the difference between reliability and agreement
On a daily basis, clinicians and researchers face the challenge of measuring multiple outcomes. From responses to therapies and assessments of disease activity, to certainty of diagnoses and innovation of cutting-edge diagnostic tools, it is essential within every field that outcome measurement be valid, reproducible and reliable.1 At first glance, validity, reproducibility, reliability and agreement may seem similar; however, there are fundamental differences among these concepts that are important for study design and execution, and for methodology and statistical analyses. Alvan Feinstein recognised this problem and introduced the term clinimetrics, or “the methodologic discipline focusing on measurement issues in clinical medicine”.2 The concept of clinimetrics is not new; on the contrary, it has been considered a subset of psychometrics.3 Terwee, de Vet, Mokkink and Knol, among others,4 developed tools to assess and evaluate health measurement instruments in clinical medicine. The backbone of this paper will therefore rely on the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) initiative.
The COSMIN initiative is a multidisciplinary, international consensus effort that aimed to create standards for evaluating the methodological quality, design and preferred statistical analyses of studies on measurement properties.5 The initiative focused primarily on Health Related Patient-Reported Outcomes (HR-PRO) owing to the complexity of these outcome measurements; however, the same concepts apply to other types of outcomes and will be followed here.4 For clarity, an HR-PRO is defined by Mokkink et al4 as “any aspect of a patient's health status that is directly assessed by the patient, that is, without the interpretation of the patient's responses by a physician or anyone else”; examples include self-administered or computer-administered questionnaires.
The COSMIN taxonomy in the evaluation of a measurement instrument shows three main quality domains: reliability, validity …