Abstract
Objective To compare four faecal markers for their ability to predict steroid refractoriness in severe paediatric ulcerative colitis (UC). Construct validity and responsiveness to change were also assessed.
Methods This was a prospective multicentre cohort study. Stool samples from 101 children (13.3±3.6 years; Pediatric UC Activity Index (PUCAI) at admission 72±12 points) were obtained at the third day of intravenous steroid therapy. Repeated samples at discharge were obtained from 24 children. Predictive validity was assessed using diagnostic utility statistics to predict steroid failure (ie, the need for salvage treatment). Concurrent validity was assessed using correlational analysis with the following constructs: PUCAI, Lindgren and Seo scores, physician's global assessment, albumin, erythrocyte sedimentation rate and C-reactive protein (CRP). Responsiveness was assessed using test utility and correlational strategies.
Results Median values (IQR) were very high at baseline for all four markers (calprotectin 4215 μg/g (2297–8808); lactoferrin 212 μg/g (114–328); M2-pyruvate kinase (M2-PK) 363 U/g (119–3104); and S100A12 469 μg/g (193–1112)). M2-PK was numerically superior to the other three markers and CRP in predicting response to corticosteroid treatment (area under the receiver operating characteristic (ROC) curve 0.75 (95% CI 0.64 to 0.85; p<0.001) vs <0.65 for the others). However, it did not add to the predictive ability of the PUCAI (area under the ROC 0.81 (95% CI 0.73 to 0.89)). M2-PK also had the highest construct validity but with a modest mean correlation with all constructs (r=0.3; p<0.05). None of the markers was responsive to change (Spearman's rho correlation with change in the PUCAI <0.1; p>0.05, area under the ROC curve <0.65; p>0.05).
Conclusions The four markers were greatly elevated in severe paediatric UC. Only M2-PK had good construct and predictive validity, and none was responsive to change. The PUCAI, a simple clinical index, performed better than the faecal markers in predicting outcome following a course of intravenous corticosteroids in severe UC.
- Ulcerative colitis
- PUCAI
- calprotectin
- pyruvate kinase
- lactoferrin
- clinical decision making
- colorectal surgery
- paediatric gastroenterology
Significance of this study
What is already known about this subject?
Faecal markers are increasingly used as objective markers for measuring disease activity in IBD, and predicting response to treatments.
Early identification of patients with acute severe UC who fail steroid treatment is of utmost importance in order to initiate salvage therapy early; several clinical and laboratory markers are in use.
There are no systematic data on the utility of different faecal markers to predict response in acute severe UC.
M2-pyruvate kinase (M2-PK) is a new faecal marker, used primarily to detect colorectal cancer.
What are the new findings?
The new M2-PK faecal marker had a good predictive validity to identify those failing intravenous steroid treatment in severe UC; faecal calprotectin had a fair predictive validity, and S100A12 and lactoferrin had none.
Faecal M2-PK's ability to predict outcome was inferior to that of a simple clinical index based on history taking alone (ie, the PUCAI).
None of the four evaluated faecal markers can be used to measure response to treatment in the severe UC setting.
How might it impact on clinical practice in the foreseeable future?
Physicians should continue using the existing simple clinical indices in the decision making of treatment escalation in acute severe UC and in monitoring response to treatment.
However, when an objective measure for disease activity and for predicting treatment outcome is desired, the M2-PK faecal marker should be preferred (eg, in clinical trials to supplement the clinical measures).
Introduction
Accurate monitoring of disease activity is the mainstay of clinical decision making in severe attacks of ulcerative colitis (UC). The endoscopic appearance of the rectal mucosa is important in assessing the degree of inflammation, but repeated endoscopic evaluations are not feasible, especially in children. Indirect, yet reliable, measures of biological disease activity are, therefore, of utmost importance in managing severe UC. Blood tests, including C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), haemoglobin and albumin, are in common use but have only modest accuracy in reflecting UC disease activity.1
Several predicting variables have been suggested in severe UC to guide introduction of second-line treatment early during the admission in the one-third of patients who will ultimately fail standard corticosteroid treatment.2 CRP and stool frequency are included in two clinical prediction rules.3 4 The Pediatric UC Activity Index (PUCAI) was developed as a non-invasive, 6-item clinical index of disease activity, and proved to be highly predictive of steroid refractoriness in severe paediatric UC.5–8
Faecal calprotectin,9–12 S100A12,13 M2-pyruvate kinase (M2-PK)14 and lactoferrin11 15 have been shown to reflect the severity of mucosal inflammation, but only to a moderate degree (correlation with colonoscopic score range between studies with r ∼0.4–0.6). With the exception of one adult study on calprotectin,16 no data exist on the utility of these faecal markers in severe acute UC. In this prospective multicentre study, we aimed to assess, head-to-head, the construct and predictive validity of four inflammatory faecal markers in a large cohort of children with acute severe UC. We also aimed to measure the responsiveness of these markers to change in disease activity during the admission.
Methods
Setting and patients
This was a planned substudy of a large clinical study assessing the outcome of severe paediatric UC.8 In that prospective multicentre cohort study, children 2–18 years of age, admitted to 10 inflammatory bowel disease (IBD) centres in North America during 2006–2008 were enrolled if the disease was severe enough to trigger intravenous corticosteroid treatment. Children with concurrent enteric infection were excluded. Patients were treated with 1–1.5 mg/kg/day equivalent methylprednisolone dose up to 40–60 mg/day according to the physicians' discretion; no major differences in clinical practice have been found across the participating sites in relation to disease activity at admission and at second-line treatment, or in the time elapsed from admission to discharge or introduction of salvage treatment.8 A diagnosis of UC was established by the presence of accepted clinical, radiological, endoscopic and histological criteria.17 The ethical review board of all participating hospitals approved this study, and informed consent and assent were obtained from participants. The report of the original study included the results of calprotectin at day 3 only.
Explicit clinical, laboratory and radiographic data were collected upon admission and on the third and fifth day of corticosteroid treatment, on introduction of second-line medical treatment or colectomy (if applicable), and at hospital discharge. Disease activity was measured at each of the above time points by the PUCAI, a non-invasive index intended to measure disease activity in UC.5 7 The PUCAI has proved to perform well in the severe end of the disease spectrum and is very responsive to change.6 18 Steroid failure was defined as the need for second-line medical treatment (including infliximab and calcineurin inhibitors) or colectomy by discharge.
Specimens
Stool samples were collected on the third day of intravenous corticosteroid treatment, and kept at −70°C until analysis in a central laboratory (Sydney, Australia). The third day was chosen since the divergence between responders and non-responders increases with time from admission and day 3 has been previously shown to predict treatment outcome optimally.2 Patients from The Hospital for Children, Toronto, also provided samples at hospital discharge or at introduction of second-line treatment, for assessing responsiveness of the faecal markers. Cut-off concentrations of the faecal markers to be considered elevated were used as previously suggested: 100 μg/g for calprotectin,19 7.25 μg/g for lactoferrin,20 21 4 U/g for M2-PK 22 23 and 10 μg/g for S100A12.13
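As a rough illustration of how the cut-offs above relate to the baseline medians reported in the abstract, the following minimal sketch expresses each median as a multiple of its cut-off. The dictionary and function names are hypothetical, and treating "elevated" as strictly greater than the cut-off is an assumption, not something the text specifies.

```python
# Cut-offs quoted in the text (ug/g, except M2-PK in U/g).
CUTOFFS = {
    "calprotectin": 100.0,
    "lactoferrin": 7.25,
    "M2-PK": 4.0,
    "S100A12": 10.0,
}

# Baseline medians quoted in the abstract (same units as the cut-offs).
BASELINE_MEDIANS = {
    "calprotectin": 4215.0,
    "lactoferrin": 212.0,
    "M2-PK": 363.0,
    "S100A12": 469.0,
}


def is_elevated(marker, value):
    """Assumption: a value counts as elevated when strictly above the cut-off."""
    return value > CUTOFFS[marker]


def fold_above_cutoff(marker, value):
    """Express a value as a multiple of the marker's cut-off."""
    return value / CUTOFFS[marker]


for name, median in BASELINE_MEDIANS.items():
    print(name, is_elevated(name, median), round(fold_above_cutoff(name, median), 1))
```

On these figures every median lies well above its cut-off, which is in line with the description of the markers as very high at baseline.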
Faecal sample preparation
The manufacturers' directions were followed for faecal extraction of S100A12, calprotectin and lactoferrin (PhiCal test, Nycomed Pharma AS, Oslo, Norway), and M2-PK (ScheBo Tumour M2-PK test, ScheBo Biotech, Giessen, Germany). Briefly, ∼100 mg of faecal material was obtained from frozen stool samples. Extraction buffer (Nycomed) was added at a dilution of 1:50, with extraction buffer (ScheBo) for M2-PK at a dilution of 1:100. Extraction buffer was added quickly to minimise thawing. Faeces in extraction buffer were vortexed in a tube with an agitator, before being placed on a horizontal platform mixer for 25 min at 120 rpm. Approximately 1.4 ml of homogenate was then transferred to a 2 ml tube and centrifuged at 10 000 g for 20 min. Clear supernatant (1 ml) was then transferred to a new tube and stored at −80°C until assayed.
Calprotectin, S100A12, lactoferrin and M2-PK assays
Calprotectin levels were measured according to the manufacturer's instructions using the PhiCal test (Nycomed), a sandwich ELISA that uses affinity-purified rabbit polyclonal antibodies.24 S100A12 levels were measured by an in-house ELISA using a previously published method.25 This sandwich ELISA utilises rabbit polyclonal antibodies raised against recombinant human S100A12 that have been shown not to bind S100A8, S100A9 or the calprotectin complex. Faecal lactoferrin was assayed according to the manufacturer's instructions using the IBD-SCAN ELISA kit (TechLab, Blacksburg, Virginia, USA), a sandwich ELISA that utilises polyclonal antibodies specific to human lactoferrin. Faecal M2-PK was assayed according to the manufacturer's instructions using the Tumour M2-PK ELISA stool test (ScheBo) that uses two monoclonal antibodies specific for tumour M2-PK.
Analytic approach and statistics
In this study we evaluated three psychometric properties of the contending markers: predictive validity, construct validity and responsiveness.
Predictive validity measures whether an instrument can predict clinically important outcomes. Here it was assessed using diagnostic utility statistics in predicting steroid failure (receiver operating characteristic (ROC) curve, sensitivity, specificity and predictive values). An area under the ROC curve of >0.7 was considered fair, >0.8 good, and >0.9 excellent discriminative ability. The 95% CIs were calculated for all point estimates. Logistic regression models were constructed to evaluate the relative significance of each predictor and to test interactions between the variables. In all models, response to corticosteroids (yes/no) was used as the outcome variable. Selection of the best model was guided by maximising the c-statistic.
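To make the ROC-based assessment concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen steroid failure has a higher marker value than a randomly chosen responder (the rank-based, Mann-Whitney formulation), and then maps it to qualitative bands. The function names and the sample values are hypothetical, the "poor" label for areas of 0.7 or less is an assumption, and the study itself used SPSS rather than this code.

```python
def roc_auc(scores_failure, scores_response):
    """AUC as the fraction of (failure, responder) pairs in which the failure
    has the higher marker value; tied pairs count as half."""
    wins = 0.0
    for f in scores_failure:
        for r in scores_response:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(scores_failure) * len(scores_response))


def grade_auc(auc):
    """Qualitative bands: >0.7 fair, >0.8 good, >0.9 excellent.
    Assumption: anything at or below 0.7 is labelled 'poor'."""
    if auc > 0.9:
        return "excellent"
    if auc > 0.8:
        return "good"
    if auc > 0.7:
        return "fair"
    return "poor"


# Hypothetical day 3 marker values (not study data).
failures = [900, 450, 1200, 300]
responders = [200, 350, 500, 100, 250]

auc = roc_auc(failures, responders)
print(auc, grade_auc(auc))  # 0.85 good
```

Under these bands, an area of 0.75 would be graded fair and an area of 0.81 good.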
Construct validity represents various mini-theories that together explain whether a measure acts the way it is expected based on the concept it represents; in this study, ‘disease activity’.26 Typically, when several constructs are being included, each taps a different aspect of the concept, and we thus included several constructs of disease activity: (1) three UC disease activity indices (the PUCAI,5 Lindgren3 and Seo27 indices); (2) physician global assessment of disease activity (measured on a 100 mm visual analogue scale from complete remission to fulminant colitis); and (3) blood test results including albumin, ESR and CRP. Spearman or Pearson correlations were used, as appropriate for the normality of the distribution.
Responsiveness refers to the ability of the measure to correctly identify change over time when it occurs.28–30 The short-term responsiveness of the four faecal markers was assessed by comparing the markers' values at the third day of steroid treatment and at discharge or at the time of introduction of the second-line treatment. Change in the markers' values, labelled with the Greek letter Δ, was determined by subtracting the follow-up score from the initial score.
Responsiveness was calculated utilising correlational and anchor-based approaches.29 In the latter, the ROC curve was used to differentiate patients who responded to corticosteroids from those who failed this treatment, while using the area under the curve (AUC) as a summary measure for responsiveness.29 31 The correlational approach utilised correlations between the ΔPUCAI with the Δvalues of the faecal markers.
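The correlational approach can be sketched as follows: change scores are computed as initial minus follow-up, and the Δ values of a marker are then correlated with ΔPUCAI. This is a minimal standard-library sketch using Spearman's rho (Pearson correlation of average ranks); all function names and sample values are hypothetical and are not study data.

```python
def delta(initial, follow_up):
    """Change score as defined in the text: initial minus follow-up."""
    return [a - b for a, b in zip(initial, follow_up)]


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired measurements for five patients.
d_pucai = delta([60, 55, 70, 45, 65], [20, 50, 40, 10, 65])
d_marker = delta([500, 300, 800, 200, 600], [100, 350, 300, 150, 700])

print(round(spearman(d_pucai, d_marker), 2))  # 0.7
```

With this sign convention, a positive Δ means the score fell between the two time points, so a responsive marker would be expected to show a clearly positive correlation with ΔPUCAI.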
Data are presented as means±SD, or medians (IQR). Comparisons between groups were made using the non-parametric Wilcoxon rank sum test or Student t test as appropriate for the data normality. Proportions were compared using the χ2 or Fisher exact test as appropriate. All comparisons were made using two-sided significance levels of p<0.05 and performed using SPSS V15.0 (SPSS, Chicago, Illinois, USA).
Results
A total of 101 eligible children provided stool samples at baseline. Of those, 26 (26%) eventually failed steroid treatment and required salvage therapy (22 infliximab, 1 ciclosporin and 3 colectomy) within a median of 10 days (IQR 5–14 days). Those who responded to steroid treatment had shorter duration of disease at admission than non-responders, and thus also younger age (table 1). A total of 99.5% (n=402) of the 404 values at baseline (ie, four markers for each of the 101 included children) had values higher than the upper normal range for each marker, as expected from the severity of the disease (range: 234–32 450 μg/g for calprotectin; 19–1201 μg/g for lactoferrin; 14–9060 μg/g for S100A12; and 1–49 410 U/g for M2-PK (with only two samples <4 U/g)) (table 2).
Predictive validity
Baseline M2-PK and calprotectin were significantly higher in the non-responders compared with the responders; similar differences were not observed for lactoferrin and S100A12 (table 2). In a head-to-head comparison, M2-PK had the highest discriminant validity to predict response to corticosteroids, with an area under the ROC curve of 0.75 (95% CI 0.64 to 0.85), but this was inferior to the predictive validity of the PUCAI (figure 1). The best cut-off value of M2-PK to differentiate the two groups was 400 U/g (OR 6.8 (95% CI 2.1 to 24), sensitivity 79% (95% CI 60% to 92%), specificity 64% (95% CI 58% to 68%), positive predictive value 41% (95% CI 31% to 48%), negative predictive value 91% (95% CI 82% to 96%), positive likelihood ratio 2.2 (95% CI 1.5 to 3.2) and negative likelihood ratio 0.3 (95% CI 0.15 to 0.7)). Higher cut-off values yielded high specificity at the expense of lower sensitivity (figure 2).
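The likelihood ratios quoted for the 400 U/g cut-off follow directly from the reported sensitivity and specificity. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the inputs are the rounded percentages from the text, so the result matches the published figures only to one decimal place.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive LR = sens / (1 - spec); negative LR = (1 - sens) / spec."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


lr_pos, lr_neg = likelihood_ratios(0.79, 0.64)
print(round(lr_pos, 1), round(lr_neg, 1))  # 2.2 0.3
```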
No interaction between faecal values and number of daily stools was noted in a regression analysis, indicating that increased diarrhoea was not an effect modifier of the markers' levels (for M2-PK, OR 1.02 (95% CI 0.99 to 1.05); p=0.13). Since CRP and stool frequency are important predictors of steroid refractoriness,2–4 we constructed another model with CRP, stool frequency and M2-PK as the independent variables. M2-PK and stool frequency were significant predictors of corticosteroid failure (p=0.028 and p=0.007, respectively) while CRP was not (p=0.71). However, the model's c-statistic (which is similar to the area under the ROC curve) was identical to that obtained by M2-PK alone (0.75 (95% CI 0.65 to 0.86)), suggesting that combining these predictors does not increase predictive validity and is thus not justified. This analysis was repeated for calprotectin with similar results (p=0.63 for CRP, p=0.05 for calprotectin and p=0.01 for stool frequency).
Day 3 PUCAI scores were highly predictive of corticosteroid failure in a univariate logistic regression (2.2-fold increased risk of steroid failure for every 10 point increase in the PUCAI score (95% CI 1.55 to 3.1; p<0.001)). Adding the day 3 M2-PK value (or any other faecal marker) did not increase the discriminative power of the model (the c-statistic remained stable at 0.81 (95% CI 0.73 to 0.89); p=0.075 for M2-PK and p=0.001 for PUCAI). The interaction term between these variables was not significant (p=0.198). The mean day 3 PUCAI was lower in those achieving remission by discharge using steroids alone, versus those discharged with disease still mildly active and those requiring second-line treatment (36±19 vs 47±16; p=0.026).
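The per-10-point estimate can be rescaled to other PUCAI differences because a logistic model is linear on the log-odds scale. The sketch below assumes the reported 2.2-fold figure is an odds ratio from the univariate logistic model and that the effect is constant across the score range; the function name is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_step, step, new_step):
    """Convert an odds ratio reported per `step` units into one per `new_step` units."""
    beta = math.log(or_per_step) / step  # log-odds change per single unit
    return math.exp(beta * new_step)


per_point = rescale_odds_ratio(2.2, 10, 1)   # odds ratio per 1 PUCAI point
per_20 = rescale_odds_ratio(2.2, 10, 20)     # odds ratio per 20 PUCAI points

print(round(per_point, 2), round(per_20, 2))  # 1.08 4.84
```

On these assumptions, two patients whose day 3 PUCAI scores differ by 20 points would differ in their odds of steroid failure by a factor of about 4.8.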
Construct validity
Consistent with the predictive validity findings, M2-PK had the highest construct validity of all four markers, but nonetheless correlations were, at most, only fair (table 3). The values of the other three markers correlated poorly with the constructs of disease activity.
Responsiveness
Twenty-four children provided a second stool sample either at discharge for responders (n=18) or at the introduction of second-line treatment for non-responders (n=6). Reflecting internal validity, the median ΔPUCAI value was 30 points (IQR 15–43) for responders versus 0 points (−5 to 5) in non-responders (p<0.001). However, none of the Δvalues of the four faecal markers had good correlation with the corresponding ΔPUCAI (r=−0.19 for ΔM2-PK, r=0.09 for Δcalprotectin, r=−0.32 for ΔS100A12 and r=0.05 for Δlactoferrin; all p>0.05).
Similarly, the areas under the ROC curves of the Δvalues to differentiate responders from non-responders were poor (the best value was achieved by Δcalprotectin with an AUC of 0.65 (95% CI 0.33 to 0.97)). In comparison, ΔPUCAI had an AUC of 0.88 (95% CI 0.75 to 1.0).
Discussion
The current study presents the first systematic head-to-head comparison of four faecal inflammatory markers in severe acute UC. To determine which of the markers performed best, we assessed three statistical concepts—that is, construct validity, predictive validity and responsiveness. We provide novel data that the new faecal marker M2-PK has an apparent improved predictive and construct validity over the traditional markers. This study was not powered to show whether these differences are statistically significant, but the consistent results in the different analyses and across all constructs of disease activity strengthen this conclusion.
M2-PK is the dimeric form of the enzyme which converts phosphoenolpyruvate to pyruvate in the last step of the glycolytic pathway, leading to ATP production.32 This dimeric form is overexpressed in proliferating cells, reflecting increased metabolic needs. Faecal M2-PK has been successfully studied as a non-invasive marker to screen for colorectal cancer and adenomas.22 33 Elevated levels of M2-PK have been found in patients with IBD, probably since increased metabolic needs are also present in the inflamed colon.23 S100A12 belongs to the S100 family of calcium-binding proteins specifically expressed by activated granulocytes. It plays an important role in the innate immune response by triggering various inflammatory pathways.34 35 S100A12 has been shown to have high diagnostic utility in differentiating between patients with and without IBD and in assessing disease activity.13 36 37 Calprotectin, a dimer of S100A8–S100A9, is a granulocyte cytosolic protein that exhibits antibacterial and antifungal activity. In a meta-analysis of 30 studies (n=5983), calprotectin had a sensitivity and specificity of 95% and 91%, respectively, to differentiate patients with IBD from control patients without IBD.19 Calprotectin levels have been shown in multiple studies to correlate with UC disease activity in the rho range of 0.32–0.76.38 Lactoferrin is an iron-binding glycoprotein secreted by most mucosal membranes and is also a main component of neutrophils. Lactoferrin levels correlate with UC disease activity, with rho ranging from 0.35 to 0.61.38
Predictive variables in severe colitis could aid physicians in optimising the introduction of medical treatment in a timely fashion. Over 20 variables were found to predict response to corticosteroids in the adult literature, but only a few were consistently reproduced.2 Three major indices were developed to predict short-term colectomy in hospitalised adults with UC,3 4 39 but in a comparative study, the PUCAI performed best in children.6 8 The PUCAI, calculated on the third and fifth day of corticosteroid treatment, can predict the need for second-line treatment with a high level of accuracy. M2-PK predicted response to corticosteroids, independently of stool frequency, with an accuracy that was nearly as good as the PUCAI, and superior to CRP. Calprotectin also had predictive utility, but only to a moderate degree, similar to a recent report showing an area under the ROC of only 0.65 in differentiating responders from non-responders.16 The multivariable models in the current study suggest that the addition of any faecal marker to the PUCAI, a simple clinical index, adds little to its prediction power.
This is the first study to evaluate the ability of M2-PK to predict clinically important outcomes in IBD. Calprotectin40–43 and lactoferrin20 have shown some predictive value for clinical recurrence in IBD, more so for calprotectin and in colonic disease.44 We evaluated prediction of response to a course of intravenous corticosteroids in children with severe disease, while most existing literature has evaluated relapse from a state of clinical remission or mild disease.
Head-to-head comparative studies of faecal markers are sparse. Lactoferrin seems to perform as well as calprotectin in differentiating patients with and without IBD, especially in UC.38 45 In two prospective studies, M2-PK was inferior to calprotectin in reflecting clinical and endoscopic disease activity.14 46 S100A12 performed better than calprotectin in reflecting clinical and endoscopic disease activity in a group of adults.36
An important property of markers of disease activity is the ability to follow response to treatment, both in clinical practice and for clinical trials. To the best of our knowledge, this is the first study to evaluate responsiveness of faecal inflammatory markers in the severe colitis setting. In other disease states, calprotectin and lactoferrin both show markedly reduced values after infliximab treatment concomitantly with clinical and endoscopic improvement.21 47 S100A12 levels decreased in 10 children with Crohn's disease who entered remission following successful treatment with nutritional therapy.13 It may be speculated that the poor responsiveness found here in the extreme side of the disease severity stems from the increased concentration of the markers' values with reduced stool frequency by discharge. In addition, it is possible that mucosal healing, which is reflected by the faecal markers, lags behind clinical improvement and, if taken later, this marker would have shown better responsiveness. Regardless of the explanation, faecal markers are not clinically useful in monitoring short-term response to treatment by hospital discharge.
This study compares different faecal markers measured concurrently in acute severe UC. In view of the increasing number of faecal markers, comparative studies are the most efficient way to elucidate the appropriate marker for the appropriate scenario. We show that all four evaluated markers reflected the severity of disease by having very high faecal values. Only M2-PK, however, has sufficient ability to predict treatment outcome during the admission, but it is still inferior to a simple clinical index (ie, the PUCAI).
References
Footnotes
Funding This investigator-initiated study was partially funded from Schering, Canada, the manufacturer of infliximab. Schering was not involved in any part of the study, including design, protocol preparation, study conduct, data processing and analysis or manuscript writing.
Competing interests JH declares receiving research support, consultant and speaking honoraria from Centocor Ortho Biotech. WC declares receiving research support and consulting fees from Centocor and Abbott Labs. TDW declares receiving research support and consulting fees from Schering Canada.
Ethics approval This study was conducted with the approval of all eight participating centres.
Provenance and peer review Not commissioned; externally peer reviewed.