Original article
Proteomic analysis of ascending colon biopsies from a paediatric inflammatory bowel disease inception cohort identifies protein biomarkers that differentiate Crohn's disease from UC
  1. Amanda E Starr1,
  2. Shelley A Deeke1,
  3. Zhibin Ning1,
  4. Cheng-Kang Chiang1,
  5. Xu Zhang1,
  6. Walid Mottawea1,2,
  7. Ruth Singleton3,
  8. Eric I Benchimol3,4,5,
  9. Ming Wen1,
  10. David R Mack3,4,
  11. Alain Stintzi1,
  12. Daniel Figeys1,6
  1. 1Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
  2. 2Department of Microbiology and Immunology, Mansoura University, Mansoura, Egypt
  3. 3Children's Hospital of Eastern Ontario (CHEO) Inflammatory Bowel Disease Centre and CHEO Research Institute, University of Ottawa, Ottawa, Ontario, Canada
  4. 4Department of Pediatrics, University of Ottawa, Ottawa, Ontario, Canada
  5. 5School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, Ontario, Canada
  6. 6Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Ontario, Canada
  1. Correspondence to Dr Daniel Figeys, Department of Biochemistry, Microbiology and Immunology, Ottawa Institute of Systems Biology, University of Ottawa, Roger Guindon Hall, 451 Smyth Road, University of Ottawa, Ottawa, Ontario, Canada K1H 8M5; dfigeys{at}

Abstract


Objective Accurate differentiation between Crohn's disease (CD) and UC is important to ensure early and appropriate therapeutic intervention. We sought to identify proteins that enable differentiation between CD and UC in children with new onset IBD.

Design Mucosal biopsies were obtained from children undergoing baseline diagnostic endoscopy prior to therapeutic interventions. Using a super-stable isotope labeling with amino acids in cell culture (SILAC)-based approach, the proteomes of biopsies from 99 paediatric controls and patients with CD or UC were compared. Multivariate analysis of a subset of these (n=50) was applied to identify novel biomarkers, which were validated in a second subset (n=49).

Results In the discovery cohort, a panel of five proteins was sufficient to distinguish control from IBD-affected tissue biopsies with an AUC of 1.0 (95% CI 0.99 to 1.0); a second panel of 12 proteins segregated inflamed CD from UC with an AUC of 0.95 (95% CI 0.86 to 1.0). Application of the two panels to the validation cohort resulted in accurate classification of 95.9% (IBD from control) and 80% (CD from UC) of patients. 116 proteins were identified to have correlation with the severity of disease, four of which were components of the two panels, including visfatin and metallothionein-2.

Conclusions This study has identified two panels of candidate biomarkers for the diagnosis of IBD and the differentiation of IBD subtypes to guide appropriate therapeutic interventions in paediatric patients.

  • IBD

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial.

Significance of this study

What is already known on this subject?

  • Accurate diagnosis of Crohn's disease (CD) and UC is important, so that appropriate therapy can be initiated to reduce disease progression and complications related to permanent bowel damage, and to avoid unnecessary adverse drug events.

  • Differentiation through clinical symptoms, site of disease, presence of granulomas, genetics and current serological tests each has limitations.

  • Proteomic analyses are powerful tools to assess quantitative and qualitative data about biological systems.

What are the new findings?

  • An unbiased proteomics screen of biopsies from an inception cohort of patients with IBD led to the identification of a panel of proteins that can be used to differentiate patients with CD from those with UC.

  • Candidate biomarkers suggest elevated fatty acid metabolism in paediatric CD and elevated protein metabolism in paediatric UC.

How might it impact on clinical practice in the foreseeable future?

  • The use of these protein biomarkers in the clinical setting could improve the accuracy of the current methods used to differentiate subtypes of IBD and do so in a rapid time frame. Further understanding of the pathogenesis of paediatric IBD may also be realised.

Introduction

The IBDs are chronic, relapsing and remitting, GI inflammatory conditions that affect over 4 million people worldwide.1 In Canada, over 230 000 individuals suffer from IBD, with an additional 10 000 cases diagnosed each year.2 IBD includes Crohn's disease (CD) and UC, which differ in clinical presentation and complications and may have different responses to treatment.3–5 Paediatric patients, 10%–25% of the IBD population,6 ,7 often present with different features from adult patients with IBD, and have more aggressive and extensive disease and more severe long-term outcomes.4–8 Fewer children have the so-called classical symptoms, and instead have a range of presenting features including atypical symptoms such as short stature or weight loss, leading to delayed recognition and diagnosis.9 Generally, CD is characterised by discrete mucosal and submucosal lesions, which can occur anywhere throughout the GI tract, and are transmural in nature. In contrast, UC extends proximally from the rectum, with contiguous, but generally superficial inflammation. However, non-specific gastric involvement, relative rectal sparing and periappendiceal patch involvement in UC10 can add to diagnostic confusion and misclassification of disease. Differential diagnosis is important for the induction and maintenance of appropriate therapy, which can differ between CD and UC. In the paediatric population, exclusive enteral nutrition is an effective induction therapy in CD but not in UC.11 For maintenance therapy, methotrexate is established for CD,12 but not for UC,13 whereas mesalamine is not established for CD.14 Furthermore, complications related to the diseases are very different. For those unfortunate patients who do not respond to therapy, surgical options are very different: surgery for UC offers the end of the UC disease process and the opportunity for reservoir formation and reconstitution of bowel continuity. For CD, with the very real possibility of recurrence elsewhere in the GI tract, surgery to remove inflamed segments is generally not offered unless disease complications necessitate such action. Thus, for both consideration of therapeutic options and discussion of outcomes, correct diagnosis is an important factor in effective treatment in IBD.

Genome-wide association studies have identified nearly 170 susceptibility loci associated with IBD, including many loci that overlap in CD and UC,15–17 yet disease manifestation due to these variants is less than 15%.17 Biomarkers have been sought to complement current IBD diagnostic tools such as endoscopy, imaging and histology to reduce ambiguous diagnosis of IBD, assist in subtype differentiation and provide objective measures of disease. Previous studies have identified proteins that are elevated and measurable in serum or stool;3 however, these proteins have been found to perform best in the more obvious cases of CD or UC in the paediatric population.18 ,19 Serum-detected antibodies directed against neutrophil or bacterial components tend to have low sensitivities (true positive rate <50%; reviewed20). Other biomarkers, namely faecal calprotectin, are clinically useful to identify patients with IBD from populations without mucosal inflammation (eg, IBS, healthy controls) but cannot differentiate IBD subtypes.21 Furthermore, faecal calprotectin is not sufficient to distinguish between mild, moderate or severe disease,19 which is important in deciding appropriate therapeutic intervention. Considering the limitations of current genetic and protein markers, atypical presentations and progression of IBD in the paediatric population, there is a clear need for new biomarkers and approaches that can rapidly and accurately provide diagnosis of CD and UC.

To identify novel protein biomarkers, proteomic approaches have been applied to serum, cell or tissue isolates from adult patients with IBD.22 Findings of these early studies indicate the usefulness of proteomics in biomarker selection, but the studies have been limited by the lack of an inception cohort and differences in patient therapies,23–28 by the inability to identify the differentiating proteins (use of MALDI/surface-enhanced laser desorption/ionization (SELDI)),25 ,26 ,29 or by low throughput (two-dimensional gels).24 ,27 ,30 In this study, we overcame these limitations by employing a quantitative, high-throughput proteomic approach to evaluate and compare ascending colonic biopsies from 99 (39 control, 30 CD, 30 UC) therapy-naive children at the time of diagnosis with IBD. We conducted the first proteomic analysis of paediatric biopsies, and aimed to identify candidate biomarkers to contribute to the accurate subdiagnosis of IBD.

Material and methods

Study design

We performed a cross-sectional study, the design for which is outlined graphically in online supplementary figure S1. The patient biopsies for which the variance of the proteomic data was <2.0 were randomly divided into equal groups between the discovery and the validation phases using a balanced stratification approach for gender and diagnosis (Etcetera in WinPepi). The study was approved by the Children's Hospital of Eastern Ontario (CHEO) Research Ethics Board. Study data were collected and managed using Research Electronic Data Capture (REDCap), hosted at the CHEO Research Institute. REDCap is a secure, web-based application designed to support data capture for research studies.31
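The balanced allocation described above can be illustrated with a minimal Python sketch. It is not the WinPepi procedure used in the study; it assumes each patient is a record with hypothetical `id`, `gender` and `diagnosis` fields, shuffles patients within each gender-by-diagnosis stratum and splits each stratum evenly between the two cohorts.

```python
import random
from collections import defaultdict


def stratified_split(patients, seed=0):
    """Randomly split patients into two cohorts, balancing gender and diagnosis.

    `patients` is a list of dicts with the hypothetical keys "id", "gender"
    and "diagnosis"; these field names are illustrative assumptions.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["gender"], patient["diagnosis"])].append(patient)

    discovery, validation = [], []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        half = len(members) // 2
        discovery.extend(members[:half])
        validation.extend(members[half:2 * half])
        if len(members) % 2:
            # Odd stratum: give the leftover patient to the smaller cohort.
            target = discovery if len(discovery) <= len(validation) else validation
            target.append(members[-1])
    return discovery, validation
```

Within each stratum the two cohorts differ by at most one patient, and the overall cohort sizes also differ by at most one.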

Study population

All patients under 18 years of age and scheduled to undergo diagnostic colonoscopy between October 2011 and February 2015 were considered eligible for recruitment. Exclusion criteria, related to conditions known to affect the intestinal microbiome and mucosal gene expression, included (1) a body mass index >95th percentile for age; (2) diabetes mellitus (insulin and non-insulin-dependent); (3) infectious gastroenteritis within the preceding 2 months or (4) use of any antibiotics, probiotics or immunomodulatory agents within the preceding 4 weeks. All IBD cases met the standard diagnostic criteria for either UC or CD following thorough clinical, microbiological, endoscopic, histological and radiological evaluation.32 Phenotyping of disease was based on endoscopy and radiological findings and followed the Paris modification of the Montreal Classification for IBD.33 Clinical disease activity of CD or UC was determined using the Pediatric Crohn's Disease Activity Index (PCDAI)34 or the Pediatric UC Activity Index (PUCAI),35 respectively. Since ascending colon and terminal ileum are the most common sites of CD, and pancolitis dominates in children with UC,36 the ascending colon was chosen as the site for mucosal biopsy to eliminate the region of the bowel biopsied as a confounder. As such, only patients with IBD from whom affected/inflamed ascending colon (CoA) biopsies were obtained were included in the proteomic study. When possible, biopsies in these patients were also obtained from non-inflamed ascending colon (CoN). In contrast, only non-inflamed biopsies were available from control patients, who had macroscopically and histologically normal mucosa, and did not carry a diagnosis for any known chronic intestinal disorder.

Sample collection and processing

Detailed methodologies are provided in the online supplementary materials and methods. Briefly, frozen biopsies were lysed by mechanical homogenisation and proteins isolated following centrifugation. Sample protein was combined with an equal amount of isotopically labelled reference protein lysate to permit relative quantification of proteins. Tryptic digestion of proteins was performed with filter-aided sample preparation,37 and the resulting peptides were analysed on an Orbitrap Elite mass spectrometer (MS).
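As a sketch of why a shared labelled reference permits relative quantification: if every biopsy is mixed with the same heavy-labelled reference lysate, the light/heavy intensity ratio of a protein expresses its abundance relative to that reference, so ratios from different biopsies become comparable. The function name and intensity values below are hypothetical, and the log2 transform is a common convention rather than a step stated in the text.

```python
import math


def log2_ratio(light_intensity, heavy_intensity):
    """Log2 of sample (light) over spiked-in reference (heavy) intensity."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(light_intensity / heavy_intensity)


# Hypothetical intensities for one protein in two biopsies, each mixed with
# the same labelled reference lysate.
biopsy_a = log2_ratio(4.0e6, 1.0e6)  # protein is 4x the reference level
biopsy_b = log2_ratio(1.0e6, 1.0e6)  # protein equals the reference level

# The difference of the log ratios is the log2 fold change between biopsies.
log2_fold_change = biopsy_a - biopsy_b
```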

Statistical analysis and validation

All MS raw files were analysed in a single run with MaxQuant V.1.5.1, against the human Uniprot database (downloaded 2014/07/11). Data filtering and statistical analysis were performed in Perseus, Excel (Microsoft) and Prism (Graphpad). Quality assessment of MS data was performed (see online supplementary methods), and outliers determined using Robust regression and Outlier removal (ROUT)38 were removed from the study.

Proteins quantified by ≥2 unique peptides in ≥95% of the biopsies (Q95) were determined, as were proteins considered to be subgroup specific due to the over-representation in one subgroup (>70% of subgroup biopsies) when compared with at least one other subgroup (<50% of subgroup biopsies). To limit the effects due to imputation of missing data, only the data from the Q95+subgroup-specific proteins were used in principal component analysis (PCA; Matlab) and candidate biomarker identification.
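The two filters above can be written out as a minimal sketch. It assumes a hypothetical input layout in which, for one protein, each subgroup maps to a list of booleans (one per biopsy) marking whether the protein was quantified by at least two unique peptides; the function names and the example subgroups are illustrative, not part of the authors' pipeline.

```python
def is_q95(quantified, threshold=0.95):
    """True if the protein is quantified in at least `threshold` of all biopsies.

    `quantified` maps a subgroup name to a list of booleans, one per biopsy,
    where True means the protein was quantified by >=2 unique peptides.
    """
    flags = [flag for group in quantified.values() for flag in group]
    return sum(flags) / len(flags) >= threshold


def is_subgroup_specific(quantified, high=0.70, low=0.50):
    """True if one subgroup exceeds `high` while another falls below `low`."""
    fractions = {name: sum(flags) / len(flags) for name, flags in quantified.items()}
    return any(
        fractions[a] > high and fractions[b] < low
        for a in fractions
        for b in fractions
        if a != b
    )


# Hypothetical protein: quantified in 2/10 controls, 9/10 CD and 6/10 UC biopsies.
example = {
    "control": [True] * 2 + [False] * 8,
    "CD": [True] * 9 + [False] * 1,
    "UC": [True] * 6 + [False] * 4,
}
keep = is_q95(example) or is_subgroup_specific(example)
```

In this example the protein misses the Q95 cut-off (17 of 30 biopsies) but is retained as subgroup specific, because it is quantified in more than 70% of CD biopsies and fewer than 50% of control biopsies.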

Mathematical models for disease classification (control vs IBD CoA; CD CoA vs UC CoA) were developed in Receiver Operating Characteristic (ROC) Curve Explorer and Tester (ROCCET)39 using proteomic data from a subset of the patients (discovery cohort), and the models were substantiated with data from the remaining patients (validation cohort). Specifically, after cohort allocation, the Q95 and subgroup-specific proteins of the discovery cohort were determined and the associated proteomic data evaluated by Partial Least Squares Discriminant Analyses (PLSDA), Support Vector Machine (SVM) and Random Forest (RF) in the Explorer module of ROCCET.39 Ultimately, proteins commonly identified in all three models were considered to be candidate biomarkers, and then ranked by the respective Area Under the ROC Curve (AUC) value. Biomarker panels were developed in the Tester module of ROCCET by iterative analysis with a PLSDA model using a step-forward method, with candidate biomarkers added by protein-specific AUC values (starting with the highest). The minimal number of proteins selected for inclusion in the panel was based upon the point of plateau for the AUC, to balance specificity and sensitivity. Biomarker panels were independently validated by applying the validation cohort data to the discovery-trained PLSDA models.
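The step-forward idea (rank candidates by their individual AUC, add them one at a time, stop at the plateau) can be sketched as follows. This is not ROCCET and not a PLSDA model: the panel score here is a plain sum of sign-oriented features, a deliberately simple stand-in, and it assumes the features are already on a comparable scale. The function names, the `tolerance` parameter and the toy data are all hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def oriented(values, labels):
    """Flip a feature's sign if needed so higher values indicate class 1."""
    return values if auc(values, labels) >= 0.5 else [-v for v in values]


def step_forward_panel(features, labels, tolerance=0.01):
    """Add features in order of single-feature AUC until the AUC plateaus.

    `features` maps a protein name to a list of per-sample values. The panel
    score is a sum of oriented features, a simple stand-in for PLSDA.
    """
    ranked = sorted(
        features,
        key=lambda name: auc(oriented(features[name], labels), labels),
        reverse=True,
    )
    panel, score, best = [], [0.0] * len(labels), 0.0
    for name in ranked:
        trial = [s + v for s, v in zip(score, oriented(features[name], labels))]
        trial_auc = auc(trial, labels)
        if panel and trial_auc - best < tolerance:
            break  # plateau: the extra protein no longer improves the AUC
        panel.append(name)
        score, best = trial, trial_auc
    return panel, best


# Toy data: three hypothetical proteins measured in six samples.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_features = {
    "A": [1, 2, 4, 3, 5, 6],
    "B": [0, 0, -3, 0, 0, 0],
    "C": [5, 1, 3, 2, 4, 3],
}
toy_panel, toy_auc = step_forward_panel(toy_features, toy_labels)
```

In the toy data, protein A alone misorders one pair of samples, adding B corrects it, and adding C would lower the AUC, so selection stops at a two-protein panel.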

PCDAI or PUCAI scores in the discovery cohort were compared with all of the cohort's Q95+subgroup-specific proteins to determine the Pearson correlation (Prism, Graphpad). Pathway analyses were performed using DAVID and Panther and visualised with the iPATH2 interactive pathways explorer using Uniprot accession numbers. Enzyme-linked immunosorbent assays (ELISAs) for visfatin (Enzo Life Sciences, New York, USA) and metallothionein-2 (MT2) (Cloud-Clone, Texas, USA) were performed as per the manufacturer's protocol on biopsy lysate diluted to a final sodium dodecyl sulfate (SDS) concentration of 0.08%.
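For readers unfamiliar with the statistic, the Pearson correlation between an activity score and a protein's relative expression can be computed as in the sketch below. The study used Prism for this step; the function name and the five paired values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical activity scores and relative expression values for one protein.
activity_scores = [10.0, 20.0, 30.0, 40.0, 50.0]
relative_expression = [0.2, 0.5, 0.4, 0.9, 1.0]
r = pearson_r(activity_scores, relative_expression)
```

A value of `r` near 1 indicates that expression rises with the activity score, and a value near -1 indicates an inverse relationship.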

Results

Descriptive characteristics

Biopsies from the ascending colon of 100 children undergoing diagnostic colonoscopy were obtained (see online supplementary table S1). The mean age at endoscopy in patients with IBD (n=61) was 13.6±0.4 years (range 4.8–17.8) and in controls (n=39) was 14.3±0.5 years (range 6.1–17.6); ages were comparable between groups. Following quality assessment, one subject was considered an outlier and so was removed from the bioinformatic analysis. A summary of characteristics for the remaining 99 included patients is shown in table 1 (individual details are provided in online supplementary table S1). There was no difference in gender distribution between controls and patients with UC. Subjects diagnosed with CD were more likely to be male than female, which is characteristic of CD in a paediatric population.40 The majority of patients with CD (90%) had active inflammatory colonic or ileocolonic disease; 86.7% of patients with UC exhibited pancolitis. While CoA biopsies were obtained from all 60 patients with IBD, due to the extensive nature of disease, CoN samples were obtained from only 23/30 patients with CD and 2/30 patients with UC. To allow statistical comparison, only the CD CoN biopsies were analysed.

Table 1

Patient characteristics

Evaluation of full proteomic data set

One hundred and twenty-four biopsies were processed over a 15-month period and analysed by high performance liquid chromatography electrospray ionization mass spectrometry/mass spectrometry (HPLC-ESI-MSMS) to identify and quantify proteins that are differentially expressed between disease conditions. Detailed evaluation of MS data quality is provided in the online supplementary results. One affected (CoA) CD biopsy was determined to be an outlier (see online supplementary figure S2A) and so both CoA and CoN biopsies from the patient were removed from further analyses. The remaining samples showed consistent MS profiles over time (see online supplementary figure S2B–E).

From the 122 biopsies analysed, 3644 proteins were identified by ≥2 unique peptides, 949 of which were in the Q95; 225 additional proteins were subgroup specific (see online supplementary table S2). Control and affected (CoA) IBD proteomes are distinguished by PCA using proteomic data from these 1174 proteins, whereas non-inflamed (CoN) CD proteomes are more similar to the control group (figure 1). A similar segregation was obtained even when proteins annotated as involved in immunological response (see online supplementary tables S2 and S3) were removed from the dataset (see online supplementary figure S3A). Consistent with previous studies, blood-based parameters were insufficient to segregate IBD from control patients by PCA (see online supplementary figure S3B).

Figure 1

(A) Principal component analysis (PCA) of the Q95+subgroup-specific proteins showing separation of IBD inflamed ascending colon patient proteomes (UC, red; Crohn's disease (CD), blue) from controls (black) and from non-inflamed CD ascending colon (CoN) (grey).

Establishment of biomarker models

Control versus affected IBD

Evaluation of the Q95+subgroup-specific proteins in control versus CoA IBD (combined CoA CD and CoA UC; see online supplementary table S2) by SVM, PLSDA and RF resulted in 106 common candidate biomarkers (see online supplementary table S4). By step-forward analysis of a PLSDA model, peak and stabilisation of the AUC, specificity and sensitivity were observed with five proteins (see online supplementary figure S5A). This panel of five proteins (table 2), the relative expressions of which are shown in figure 2A, was sufficient to differentiate patients with IBD from controls with an AUC of 1.0 (95% CI 0.99 to 1.0), and a classification accuracy of 96% (figure 2B, C). Notably, the relative expression of all five proteins in CD CoN biopsies was significantly different from that in both control and CD CoA biopsies, with CoN showing intermediate expression (see online supplementary figure S6A).

Table 2

Panel 1 candidate protein biomarkers for the segregation of IBD from control patients

Figure 2

Partial Least Squares Discriminant Analyses (PLSDA) models were trained using data from the discovery cohort and then tested with the data from the validation cohort. They were used to classify control and patients with IBD (inflamed ascending colon biopsy). (A) Relative expression levels of the five candidate biomarkers (panel 1) to separate control from patients with IBD. (B) Panel 1 receiver operating characteristic (ROC) curve of the discovery cohort (blue) and validation cohort (pink) and (C) associated prediction overview for classification using the panel 1 PLSDA model wherein patients to the left of 0.5 would classify as controls and to the right of 0.5 would classify as IBD; true diagnoses of individual patient samples from the discovery and validation cohorts are shown in open or closed symbols, respectively. (D) Principal component analysis using five biomarkers to distinguish IBD (purple) from control (black) population in the discovery and validation cohorts. Statistical significance by Student's t test with ****p<0.0001.

CD versus UC

From the proteins evaluated in the 15 CD and 15 UC discovery cohort proteomes (see online supplementary table S2), 252 were common to SVM, PLSDA and RF (see online supplementary table S5). By step-forward analysis in a PLSDA model, a plateau in specificity and sensitivity was observed at 12 proteins (see online supplementary figure S5B), and thus 12 was determined to be the minimal number of proteins required for optimal classification. The relative expression of the 12 proteins is shown (figure 3A). Notably, the mitochondrial proteins trifunctional enzyme hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl-CoA hydratase beta subunit (HADHB) and tricarboxylate transport protein (SLC25A1) were not significantly different between CD and UC by t test, though they contribute to the specificity and sensitivity of the panel (figure 3B). The panel of 12 proteins (table 3) resulted in an overall AUC of 0.95 (95% CI 0.86 to 1.0) (figure 3B), with a sensitivity and specificity of 1.0 (95% CI 0.78 to 1.0) and 0.933 (95% CI 0.68 to 1.0), respectively (table 4), with only one patient incorrectly classified (figure 3C). As observed for the control versus IBD panel proteins, the 12 proteins identified to separate CD from UC had relative expression in CD CoN biopsies that was between that of control and CD CoA biopsies (see online supplementary figure S7A).
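The reported sensitivity and specificity follow directly from the counts given above (15 CD and 15 UC discovery biopsies, one patient misclassified). The sketch below reproduces the arithmetic; which diagnosis is treated as the positive class is not stated in the text, so the assignment of the single error to the negative class is an assumption, and the function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts: all 15 positives called correctly, and 14 of 15 negatives
# called correctly (the one misclassified patient).
sens, spec, acc = classification_metrics(tp=15, fn=0, tn=14, fp=1)
```

With these counts the sensitivity is 15/15 = 1.0 and the specificity is 14/15 ≈ 0.933, matching the values quoted for the discovery cohort.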

Table 3

Panel 2 candidate protein biomarkers for the segregation of patients with CD from those with UC

Table 4

Characteristics of ROC curves based on panel 1 and panel 2, for the discovery and validation cohorts

Figure 3

Partial Least Squares Discriminant Analyses (PLSDA) models were trained using data from the discovery cohort and then tested with the data from the validation cohort to classify patients with Crohn's disease (CD) and UC from inflamed ascending colon biopsies. (A) Relative expression levels of the 12 candidate biomarkers (panel 2) to separate patients with CD from those with UC. (B) Panel 2 receiver operating characteristic (ROC) curve of the discovery cohort (blue) and validation cohort (pink) and (C) associated prediction overview for classification using the panel 2 PLSDA model wherein patients to the left of 0.5 would classify as CD and to the right of 0.5 would classify as UC; true diagnoses of individual patient samples from the discovery and validation cohorts are shown in open or closed symbols, respectively. (D) Principal component analysis using 12 biomarkers to distinguish CD (blue) from UC (red) population in the discovery and validation cohorts. Statistical significance by Student's t test with *p<0.05, **p<0.005.

Application and performance evaluation of the panels to an independent validation cohort

As outlined (see online supplementary figure S1), the biomarker panel PLSDA models were independently assessed with proteomic data from the validation cohort. As shown in the ROC curve (figure 2B), panel 1 proteins applied to the classification of the validation cohort result in an AUC of 0.99 (95% CI 0.99 to 1.0) with 47 of 49 (95.9%) patients accurately classified as either control or IBD; the associated prediction overview (figure 2C) is shown. PCA using the five panel 1 proteins shows good separation of the control and IBD CoA populations (figure 2D, see online supplementary figure S6B). Similarly, the 12 proteins in panel 2 differentiate CD CoA from UC CoA with an AUC of 0.86 (95% CI 0.72 to 1.0) (figure 3B, see online supplementary figure S7B), with 24 of 30 (80%) patients accurately classified (figure 3C). The four misidentified patients with CD all have ileocolonic disease (Paris L3), whereas the two misidentified patients with UC were Paris E2 and E4. Notably, all patients with limited colonic CD (Paris L2) were correctly classified, indicating that despite the common localisation of disease between these patients with CD and UC, the biomarker panel was able to accurately differentiate the patients. PCA performed using the 12 proteins from panel 2 shows good separation of the CD and UC populations (figure 3D). Despite reduced sensitivity and specificity in the validation cohort compared with the discovery group (table 4), these results indicate the utility of the biomarker panels in diagnosis and subdiagnosis of patients with IBD.

Candidate biomarkers are biologically relevant

Pathway analysis was performed to evaluate the functional roles of the 106 IBD and 252 differential diagnostic candidate biomarkers. Most proteins that segregate IBD from control are involved in metabolic processes, and function predominantly in catalysis, specifically oxidoreductase activity (figure 4A, see online supplementary table S6). Canonical pathways identified to differ in IBD are related to metabolism (figure 4B; see online supplementary tables S6 and S7). In particular, proteins elevated in CD are related to fatty acid metabolism, whereas proteins elevated in UC function in energy and amino acid metabolism (figure 4B).

Figure 4

(A) Biological processes of 106 candidate biomarker proteins that contribute to segregation of IBD from control. (B) Metabolic pathways that differ within IBD subtypes; proteins that are upregulated in Crohn's disease (blue lines) are associated with fatty acid metabolism and oxidative phosphorylation, whereas amino acid and energy metabolism are elevated in UC candidate biomarkers (red lines). Pathways with some overlap are shown (dark purple lines). Amino acid metabolism shown with single letter code for alanine (A), aspartic acid (D), glutamic acid (E), arginine (R), proline (P), cysteine (C) and methionine (M).

Correlation with severity

From the 944 Q95+subgroup-specific proteins in the discovery cohort, 118 proteins (12.5%) correlated significantly with PCDAI or PUCAI (figure 5A; see online supplementary table S8). PCDAI was significantly correlated with 83 proteins (see online supplementary table S8), 10% of which are components of the protein ubiquitination pathway. In contrast, 10% of the 43 proteins that correlated with PUCAI were components of the mammalian target of rapamycin signalling pathway. Fifteen of the CD-associated and nine of the UC-associated proteins are regulated by HNF4A, which was identified in a paediatric population to be associated with CD41 and is a UC susceptibility locus.42 Of the 118 proteins showing correlation with severity, 39 proteins were identified as biomarker candidates, four of which were in the biomarker panels. In panel 1, the relative expression of both visfatin and inorganic pyrophosphatase showed significant correlation with CD severity (figure 5B,C). Similarly, the relative expression of panel 2 protein MT2 correlated with CD severity (figure 5D), whereas heterogeneous nuclear ribonucleoprotein H3 (HNRNP H3) was inversely related to UC severity (figure 5E). A previous study found a correlation between MT2 and grade of inflammation in adult IBD biopsies;43 the correlation with disease severity of the other three proteins is a new finding.

Figure 5

(A) Venn diagram of the number of proteins that correlated with severity of disease in patients with Crohn's disease (CD) (blue) or UC (red) as determined by Pearson correlation analysis of the Pediatric Crohn's Disease Activity Index (PCDAI) or Pediatric UC Activity Index (PUCAI) score with the Q95 proteins+subgroup-specific proteins. (B–E) Scatter plots of the relative expression of proteins with significant correlation to severity score that are part of the (B and C) IBD versus control panel, and (D and E) CD versus UC panel are shown.

ELISAs of visfatin and MT2 are consistent with proteomic data

With the ultimate intent of translating our findings into the clinical setting, the absolute amount of two candidate biomarkers (one from each of the panels) was measured by ELISA in validation cohort patient biopsy samples. The amount of visfatin was within the detection limits for 23 of 24 (95.8%) samples tested. The relative amount of visfatin determined by proteomics in the discovery cohort (figure 1C) is consistent with the validation cohort ELISA data (figure 6A), with a significantly higher amount in patients with IBD. Notably, there was no significant difference in the amount of visfatin between CD CoN and CoA biopsies by paired analysis (see online supplementary figure S6C). MT2 was quantified in all validation cohort samples tested by ELISA, and was significantly higher in patients with CD than in those with UC (figure 6B). Consistent with the discovery cohort proteomic data, the ELISA results of the validation cohort showed correlation between the absolute amount of MT2 and the PCDAI in patients with moderate or severe CD (PCDAI>30) (figure 6C). Due to the limited number of patients with mild CD, we could not determine any association with MT2 levels. There was no significant difference in absolute MT2 levels in CoN samples compared with paired CoA samples from patients with CD (see online supplementary figure S7C).

Figure 6

(A) Amount of visfatin per milligram of biopsy protein, determined by enzyme-linked immunosorbent assay (ELISA) in a subset of the validation cohort. (B) Amount of metallothionein-2 (MT2) per milligram of biopsy total protein determined by ELISA in a subset of the validation cohort. (C) Correlation between the ELISA-measured amount of MT2 and the corresponding patient Pediatric Crohn's Disease Activity Index (PCDAI) score in a subset of the Crohn's disease (CD) validation cohort. n=8 each for control, patients with CD and patients with UC.

Discussion

The accurate classification of IBD subtype remains a significant clinical challenge, particularly in the paediatric population where clinical features are less distinctive and may change with time.44 ,45 Here, we evaluated the proteomes of biopsies taken from 99 paediatric patients at the time of diagnosis, and prior to therapeutic intervention, for the discovery and validation of potential protein biomarkers.

We quantified over 3500 proteins, and identified five proteins that are sufficient to segregate IBD from control patients, and 12 that distinguish CD from UC with an accuracy of >80% (table 4). All candidate biomarkers in our study were quantified in ≥95% of patient biopsies, which is in contrast to markers of disease measuring antibodies to inflammatory or microbial components (eg, perinuclear anti-neutrophil cytoplasmic antibodies (pANCA), anti-Saccharomyces cerevisiae antibody (ASCA), CBir1) that are identified in a limited number of patients (range 2%–85%).46 The performance of our model (table 4) is greater than that of the serological panel, which is often applied in difficult-to-diagnose cases despite a sensitivity of 0.67 and specificity of 0.76 in a paediatric cohort.18

Current serum and faecal biomarkers for the diagnosis of IBD have limited clinical application due to low selectivity. Standard serological tests can provide information on general inflammation, yet 21% and 54% of paediatric patients with mild CD or UC, respectively, and up to 4% with moderate/severe disease test as normal for haemoglobin, albumin, platelets and erythrocyte sedimentation rate,47 and show limited improvement even with the addition of C-reactive protein (CRP).48 CRP and haemoglobin have utility in differentiating IBD from non-inflammatory conditions21 but do not differentiate IBD from other inflammatory states nor differentiate IBD subtypes.49 ,50 Faecal calprotectin performs well in differentiating conditions with mucosal inflammation from those that have similar symptoms yet are non-inflammatory in nature, such as IBS, but has modest selectivity.51 In our study, biopsy-associated calprotectin levels were not significantly different between groups. This may reflect the difference in the sample type used for measurement (tissue vs faeces).

The candidate biomarkers identified herein contribute to multiple biological processes, predominantly metabolism (figure 4). Candidate biomarkers in both panel 1 and panel 2 are components of fatty acid metabolism, and the contribution of this pathway to the pathogenesis of IBD has previously been recognised.52 In our study, we found decreased FABP5 protein levels in children with IBD compared with control patients (table 2). FABP was previously identified in pooled adult IBD colonic specimens to have higher RNA expression in patients with IBD when compared with controls,53 whereas a second study identified a reduction in RNA expression of L-FABP in inflamed UC mucosa.54 Notably, we found proteins involved in fatty acid metabolism to be elevated in paediatric patients with CD relative to those with UC (figure 4). Specifically, the proteins leukotriene A-4 hydrolase, tricarboxylate transport protein, trifunctional enzyme and delta(3,5)-delta(2,4)-dienoyl-CoA isomerase were all elevated in patients with CD compared with those with UC. Leukotriene A-4 hydrolase is involved in production of the proinflammatory mediator leukotriene B4, which was previously found to be elevated in the colon of patients with active IBD.55

Energy metabolism, which involves the candidate biomarkers inorganic pyrophosphatase, visfatin and UDP-glucose 6-dehydrogenase, was also found herein to be altered in IBD. Poulsen et al27 showed an elevation of inorganic pyrophosphatase in inflamed colonic biopsies of patients with UC, concurring with our findings of elevated levels in paediatric IBD biopsies. Waluga et al56 found elevated levels of visfatin in serum of patients with CD and UC, and reduced levels in patients with CD following treatment. In our study, we found an elevation of visfatin in biopsies of patients with IBD, and a correlation between visfatin levels in the CD paediatric biopsies and the PCDAI score (figure 5B). While Waluga et al did not find a correlation between visfatin levels and clinical indices of IBD activity,56 extracellular levels were found to correlate with inflammatory indices in colorectal tumours.57 Unlike inorganic pyrophosphatase and visfatin, UDP-glucose 6-dehydrogenase has not previously been associated with IBD. However, the enzyme, which acts on NAD+ downstream of visfatin activity, is a predictive biomarker of colitis-associated cancer risk in patients with both CD and UC.58

Nine biomarker panel proteins are not components of metabolic pathways, but have roles in binding and transport of metals, protein and RNA. Cytosol aminopeptidase, found in our study to be elevated in CD when compared with UC (table 3), was previously identified by label-free proteomics to have higher expression in patients with CD compared with those with UC and control patients.23 Interestingly, serum levels of the protein were not elevated in a cohort of adult patients with UC, but the authors indicated that further testing of colon samples should be performed.59 The relative expression of metallothioneins in IBD, observed in multiple studies, appears to be dependent on study-specific factors including the location of isolation and therapeutic intervention.60 In agreement with our observed correlation between relative MT2 levels and PCDAI (figures 5D and 6C), a correlation between grade of inflammation and metallothionein levels in adult IBD biopsies has been shown.43 Further, extracellular antibody blockade of metallothionein resulted in a reduction of colitis in a mouse model,43 suggesting a contribution of this candidate biomarker to disease pathogenesis. Levels of serotransferrin and the circulating form of transferrin receptor are indicators of anaemia in paediatric patients with IBD.61 Herein, transferrin receptor protein-1 was elevated in patients with CD compared with those with UC, whereas serotransferrin was found to be higher in patients with UC than those with CD (figure 3A). Interestingly, serotransferrin was found to be elevated in adult patients with CD who did not respond to infliximab treatment.62 Leucine PPR motif-containing protein, Sec61a1, Staphylococcal nuclease domain-containing protein, HNRNP H3 and β-2 microglobulin have not previously been linked with IBD. In ongoing proteomic analysis, the contribution of these and other non-biomarker proteins to the pathogenesis of IBD will be further characterised. While the biomarker study presented herein has been limited to the Q95 dataset, that is, proteins quantified in ≥95% of biopsies (to ensure the identification of broadly applicable biomarkers), in future studies we will assess all proteins that are different from controls or between IBD subtypes at inception, as well as those that change in the same patients following therapeutic intervention.

Overall, we have identified candidate biomarkers from inflamed tissue in samples of initial disease in therapy-naïve patients. The proteins identified in panels 1 and 2 are quantified in ≥95% of the paediatric cohort, indicating their applicability to diagnosis at initial endoscopy. Though several of the panel 1 and 2 proteins have previously been associated with IBD, in this study we show the utility of these proteins together as biomarkers of IBD and, importantly, for the differentiation of CD and UC. The panels were able to classify patient populations with >80% accuracy. At this stage, the identified proteins are reflective of relative changes; further work is required to translate these findings into the clinical setting, specifically to develop methods for absolute quantification of the proteins and an appropriate multivariable prediction model for disease classification. The evaluation of visfatin and MT2 by ELISA and validation of the MS data strengthen the possibility of translating this knowledge to the clinic with relative ease. Our IBD and control subjects were those with subacute and chronic complaints. We have not applied these analyses to those with acute self-limited infectious colitis or other inflammatory conditions that may be found elsewhere in the intestinal tract, such as coeliac disease. However, herein we have identified biomarkers that are able to subdiagnose paediatric patients with subacute to chronic presentations of disease at inception. The use of an inception cohort is an important distinction of this study, and increases the applicability of these findings relative to studies that have searched for disease biomarkers in patients undergoing treatment. Applying these protein panels as biomarkers could enable differential diagnosis of IBD subtypes and thereby permit appropriate therapy.


The authors would like to thank A. Mack for the contributions that made this study feasible.




  • Contributors AES, C-KC, WM, DRM, AS and DF were involved in study design. RS, EIB and DRM performed patient enrolment, patient diagnosis, sample isolation and collection of clinical data; AES, SAD and C-KC performed the experimental work; AES, SAD, DRM, EIB, ZN, XZ, MW and DF were all involved in interpretation of results. AES and DF drafted the manuscript. All authors have read and were involved in critical revisions of the final paper.

  • Funding This study was supported in part by the Government of Canada through Genome Canada and the Ontario Genomics Institute (OGI-067), CIHR grant number GPH-129340, CIHR grant number MOP-114872 and the Ontario Ministry of Economic Development and Innovation (REG1–4450). The authors also acknowledge funding from the IBD Foundation, Crohn's and Colitis Canada (CCC), the Children's Hospital of Eastern Ontario Research Institute and the Faculty of Medicine of the University of Ottawa. EIB was supported by a Career Development Award from the Canadian Child Health Clinician Scientist Program, and a New Investigator Award from CIHR, CCC and the Canadian Association of Gastroenterology. DF acknowledges a Canada Research Chair 1 in proteomics and systems biology. REDCap is supported by Clinical and Translational Science Award (CTSA) Grant (UL1 TR000448) and Siteman Comprehensive Cancer Center and NCI Cancer Center Support Grant P30 CA091842.

  • Competing interests None declared.

  • Patient consent Obtained.

  • Ethics approval Ethics approval was provided by the Research Ethics Board of the Children's Hospital of Eastern Ontario.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Proteomics data will be made available through ProteomeXchange.
