Table A1

Levels of evidence

| Level | Therapy/prevention, aetiology/harm | Prognosis | Diagnosis | Differential diagnosis (DDX)/symptom prevalence study |
|---|---|---|---|---|
| 1a | SR (with homogeneity*) of RCTs (randomised controlled trials) | SR (with homogeneity*) of inception cohort studies; CDR† validated in different populations | SR (with homogeneity*) of level 1 diagnostic studies; CDR† with 1b studies from different clinical centres | SR (with homogeneity*) of prospective cohort studies |
| 1b | Individual RCT (with narrow confidence interval) | Individual inception cohort study with >80% follow up; CDR† validated in a single population | Validating** cohort study with good††† reference standards; or CDR† tested within one clinical centre | Prospective cohort study with good follow up**** |
| 1c | All or none§ | All or none case series | Absolute SpPins and SnNouts†† | All or none case series |
| 2a | SR (with homogeneity*) of cohort studies | SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs | SR (with homogeneity*) of level >2 diagnostic studies | SR (with homogeneity*) of 2b and better studies |
| 2b | Individual cohort study (including low quality RCT; eg, <80% follow up) | Retrospective cohort study or follow up of untreated control patients in an RCT; derivation of CDR† or validated on split sample§§§ only | Exploratory** cohort study with good††† reference standards; CDR† after derivation, or validated only on split sample§§§ or databases | Retrospective cohort study or poor follow up |
| 2c | “Outcomes” research; ecological studies | “Outcomes” research | | Ecological studies |
| 3a | SR (with homogeneity*) of case control studies | | SR (with homogeneity*) of 3b and better studies | SR (with homogeneity*) of 3b and better studies |
| 3b | Individual case control study | | Non-consecutive study; or without consistently applied reference standards | Non-consecutive cohort study or very limited population |
| 4 | Case series (and poor quality cohort and case control studies§§) | Case series (and poor quality prognostic cohort studies***) | Case control study, poor or non-independent reference standard | Case series or superseded reference standards |
| 5 | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” | Expert opinion without explicit critical appraisal, or based on physiology, bench research, or “first principles” |

*Homogeneity means a systematic review (SR) that is free of worrisome variations (heterogeneity) in the directions and degrees of results between individual studies. Not all SRs with statistically significant heterogeneity need be worrisome, and not all worrisome heterogeneity need be statistically significant.
†Clinical decision rule (CDR): an algorithm or scoring system that leads to a prognostic estimation or a diagnostic category.
§Met when all patients died before the treatment became available, but some now survive on it; or when some patients died before the treatment became available, but none now die on it.
§§Poor quality cohort study: one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both exposed and non-exposed individuals and/or failed to identify or appropriately control known confounders and/or failed to carry out a sufficiently long and complete follow up of patients. Poor quality case control study: one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both cases and controls and/or failed to identify or appropriately control known confounders.
§§§Split sample validation is achieved by collecting all the information in a single tranche, then artificially dividing this into “derivation” and “validation” samples.
††An “Absolute SpPin”: a diagnostic finding whose Specificity is so high that a Positive result rules in the diagnosis. An “Absolute SnNout”: a diagnostic finding whose Sensitivity is so high that a Negative result rules out the diagnosis.
†††Good reference standards are independent of the test, and applied blindly or objectively to all patients. Poor reference standards are haphazardly applied, but still independent of the test. Use of a non-independent reference standard (where the “test” is included in the “reference”, or where the “testing” affects the “reference”) implies a level 4 study.
**Validating studies test the quality of a specific diagnostic test, based on prior evidence. An exploratory study collects information and trawls the data (for example, using a regression analysis) to find which factors are “significant”.
***Poor quality prognostic cohort study: one in which sampling was biased in favour of patients who already had the target outcome, or the measurement of outcomes was accomplished in <80% of study patients, or outcomes were determined in an unblinded, non-objective way, or there was no correction for confounding factors.
****Good follow up in a differential diagnosis study is >80%, with adequate time for alternative diagnoses to emerge (for example, 1–6 months acute, 1–5 years chronic).
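
The split sample validation described in footnote §§§ can be illustrated with a minimal Python sketch. It is not taken from the source: the function name `split_sample`, the parameters `derivation_fraction` and `seed`, and the use of integer patient identifiers are all assumptions made for illustration. The sketch only shows the artificial division of one tranche of data into two samples; deriving and testing a clinical decision rule on them is outside its scope.

```python
import random


def split_sample(records, derivation_fraction=0.5, seed=0):
    """Randomly divide one tranche of records into derivation and validation samples.

    Hypothetical helper: the names and the default 50/50 split are assumptions.
    """
    if not 0 < derivation_fraction < 1:
        raise ValueError("derivation_fraction must be strictly between 0 and 1")
    shuffled = list(records)  # copy, so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split reproducible
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical patient identifiers, all collected in a single tranche.
patients = list(range(1, 11))
derivation, validation = split_sample(patients, derivation_fraction=0.6, seed=42)
print(len(derivation), len(validation))
```

Because both samples come from the same tranche, a rule that performs well on the validation half has still been tested only on the population it was derived from, which is consistent with the table placing such rules at level 2b rather than 1b.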