Editor, Geboes et al described a grading system for inflammation in ulcerative colitis and rigorously assessed its reproducibility. This very useful study fills a void in the histopathological assessment of ulcerative colitis. Now that the system has been described, however, its use in clinical practice and clinical trials needs to be considered.
Many of the features that Geboes et al have used in their grading system are described as continuous spectra (for example, chronic inflammation assessed from no increase through to marked increase) but are divided into discrete groups (for example, mild, moderate, marked). These features are therefore ordinal categorical variables rather than continuous real numbers: they have a numerically labelled order, but the distance between adjacent numbers will not be the same through the whole range and there are no non-integer values.1 The consequence is that these grades cannot be used in processes which require continuous variables, such as linear regression.2 The authors already seem to have made this mistake themselves, as they give mean grades of the system in table 2 (to two decimal places) when they should have given frequency distribution histograms or possibly median grades with centiles as an indicator of spread. Nor do they state which method they used to measure the correlation between location of neutrophils in the epithelium and occurrence of crypt destruction, erosions, and ulcerations (table 4 and last paragraph of results section).
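As an illustration of the summaries suggested above, the following minimal sketch tabulates a frequency distribution and reports the median with centiles. It assumes integer grades from 0 to 5 and uses the nearest-rank method so that every reported value is an observed grade. The function names and the example grades are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def frequency_table(grades, levels):
    """Count how many observations fall in each grade, including empty grades."""
    counts = Counter(grades)
    return {level: counts.get(level, 0) for level in levels}


def nearest_rank_centile(grades, p):
    """Return the observed grade at the p-th centile (nearest-rank method).

    The result is always one of the observed grades, so no non-integer
    value is ever reported for an ordinal scale.
    """
    if not grades:
        raise ValueError("grades must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(grades)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical grades for 11 biopsies; illustrative only, not data from the study.
example_grades = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5]
example_levels = range(6)  # assumption: integer grades 0 to 5

if __name__ == "__main__":
    # Text histogram of the frequency distribution.
    for level, count in frequency_table(example_grades, example_levels).items():
        print(f"grade {level}: {'#' * count} ({count})")
    median = nearest_rank_centile(example_grades, 50)
    lower = nearest_rank_centile(example_grades, 25)
    upper = nearest_rank_centile(example_grades, 75)
    print(f"median grade {median} (25th to 75th centile: {lower} to {upper})")
```

The nearest-rank method is only one of several centile conventions. It is chosen here because interpolating between adjacent grades would reintroduce the non-integer values that an ordinal scale does not have.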
Because ulcerative colitis is a chronic relapsing condition, many studies and trials require a measure of inflammatory activity and need to relate this to other measured parameters. It is likely that this new grading system will be used in clinical trials of new treatment regimens. The ordinal categorical properties of the new grading system mean that measures such as mean grade should not be used in comparing groups of patients before and after treatment or between groups of patients receiving different treatments.
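Rank-based methods are one way to make such comparisons without treating grades as continuous. The sketch below is illustrative only and is not a method proposed in the original paper. It shows an exact sign test for paired grades before and after treatment, and the Mann-Whitney U statistic for two independent groups. It assumes that a lower grade means less inflammation. The function names and all example values are hypothetical.

```python
from math import comb


def sign_test(before, after):
    """Exact two-sided sign test for paired ordinal grades.

    Only the direction of each change is used, never its size.
    Returns (improved, worsened, p_value); unchanged pairs are dropped.
    Assumption: a lower grade means less inflammation.
    """
    if len(before) != len(after):
        raise ValueError("before and after must be paired")
    improved = sum(1 for b, a in zip(before, after) if a < b)
    worsened = sum(1 for b, a in zip(before, after) if a > b)
    n = improved + worsened
    if n == 0:
        return improved, worsened, 1.0
    k = min(improved, worsened)
    # Probability of k or fewer changes in one direction if both are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return improved, worsened, min(1.0, 2 * tail)


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic for group_a, counting ties as one half.

    U / (len(group_a) * len(group_b)) estimates the probability that a
    randomly chosen patient in group_a has a higher grade than one in group_b.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must contain at least one grade")
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical grades; illustrative only, not trial data.
before = [3, 4, 2, 5, 3, 4, 3, 2]
after = [1, 2, 2, 3, 1, 4, 2, 3]
treated = [1, 2, 2, 3]
control = [2, 3, 4, 5, 5]

if __name__ == "__main__":
    print(sign_test(before, after))
    u = mann_whitney_u(treated, control)
    print(u, u / (len(treated) * len(control)))
```

The sketch does not compute a P value for the U statistic. In practice an established statistics package would normally be used for that step.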
Reply
Editor, we appreciate the comments of Dr Cross on our paper, in which we presented the results of a reproducibility study of a grading system for inflammation in ulcerative colitis. We agree that certain features used in the grading system in reality present as continuous spectra. Therefore, the scoring system is composed of major grades and subgrades. The features which represent the major grades, such as architecture and infiltration of round cells, are clearly different from each other. The continuous spectrum exists within the grades, especially for architectural changes and chronic inflammation. Major grades are divided into different subgroups (for example, mild, moderate or diffuse) and these are indeed ordinal categorical variables. The situation is even more complex, because the inflammatory cell population in the lamina propria is heterogeneous. It includes T and B lymphocytes, plasma cells, and CD68+ monocytes. These cells can synthesise cytokines or immunoglobulins, or express markers such as LFA-1 or ligand-receptor pairs such as CD40-CD40L which might be important for disease activity. It has been shown in the past, for instance, that there is a correlation between disease activity and immunoglobulin containing cells.1-1 Hence changes in “chronic inflammation” do not have only a continuous spectrum. There are changes in subtypes of cells, and these changes show a continuous spectrum. Analysis of routine haematoxylin and eosin stained sections is therefore obviously limited. The aim of our study was to construct and evaluate a scoring system which can be applied routinely. In this system, the distinction between the major grades (for example, structural change, chronic inflammatory infiltrate, infiltration of neutrophils in the epithelium, crypt destruction, and erosion or ulceration) is much more important than the subgrades. The differences between these major grades are clearly defined and do not present as a continuous spectrum.
A change from one grade to another is a major difference, which can indicate an important effect, while changes within a grade from mild to moderate are far less important. Furthermore, the distinction between active disease (neutrophils and epithelial damage) and inactive disease is clearly defined. For evaluation of neutrophils in the epithelium, the number of crypts involved was counted.
The results of the reproducibility study presented in table 2 as mean grades were meant to show an example of interobserver agreement. Frequency distribution histograms of the same data are available but were not included, perhaps wrongly, because we had to limit the data which were submitted for publication to keep the paper within a reasonable length. The score allows a good comparison for each individual patient as well as a comparison for the major grades and numbers of patients within each grade. The latter allows comparisons between patient groups. The scoring system is under prospective evaluation in clinical trials and has so far been easy to use for routine assessment of microscopic inflammation. The results will be published in due course.
We realise that the distinction between different groups within one grade is not rigorously correct but we still feel that it can be useful, especially as we decided to use the worst aspect for the grading, rather than an average aspect. The correlation between location of neutrophils in the epithelium and occurrence of crypt destruction, erosions, and ulcerations was studied using Spearman's correlation coefficients.
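For readers unfamiliar with the method, Spearman's coefficient is the Pearson correlation computed on ranks, with tied observations given the average of the ranks they share. The following is a minimal sketch of that calculation, not the code used in the study. The function names and the example scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ordinal scores for 8 biopsies; illustrative only, not study data.
neutrophil_location = [0, 1, 1, 2, 2, 3, 3, 3]
crypt_destruction = [0, 0, 1, 1, 2, 2, 3, 3]

if __name__ == "__main__":
    print(spearman_rho(neutrophil_location, crypt_destruction))
```

Because only the ordering of the grades enters the calculation, the coefficient does not depend on the distance between adjacent grades being equal.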
In general, we agree with Dr Cross that a correct scoring system is needed. On the other hand, such a scoring system should be simple and easy to use. We have tried to find a balance between the different needs and have shown that such a system can be applied with fair interobserver agreement.
References
- 1-1.