
Original research
Efficacy of biological therapies and small molecules in induction and maintenance of remission in luminal Crohn’s disease: systematic review and network meta-analysis
Brigida Barberio (1), David J Gracie (2), Christopher J Black (2), Alexander C Ford (2,3)

  1. Department of Surgery, Oncology and Gastroenterology, University of Padua, Padova, Italy
  2. Leeds Gastroenterology Institute, Leeds Teaching Hospitals NHS Trust, Leeds, UK
  3. Leeds Institute of Medical Research at St. James's, University of Leeds, Leeds, UK

Correspondence to Professor Alexander C Ford, Leeds Teaching Hospitals NHS Trust, Leeds, UK; alexf12399{at}yahoo.com

Abstract

Objective There are numerous biological therapies and small molecules licensed for luminal Crohn’s disease (CD), but these are often studied in placebo-controlled trials, meaning relative efficacy is uncertain. We examined this in a network meta-analysis.

Design We searched the literature to 1 July 2022, judging efficacy according to induction of clinical remission, clinical response and maintenance of clinical remission, and according to previous exposure or non-exposure to biologics. We used a random effects model and reported data as pooled relative risks (RRs) with 95% CIs, ranking drugs according to p-score.

Results We identified 25 induction of remission trials (8720 patients). Based on failure to achieve clinical remission, infliximab 5 mg/kg ranked first versus placebo (RR=0.67, 95% CI 0.56 to 0.79, p-score 0.95), with risankizumab 600 mg second and upadacitinib 45 mg once daily third. However, risankizumab 600 mg ranked first for clinical remission in biologic-naïve (RR=0.66, 95% CI 0.52 to 0.85, p-score 0.78) and in biologic-exposed patients (RR=0.74, 95% CI 0.67 to 0.82, p-score 0.92). In 15 maintenance of remission trials (4016 patients), based on relapse of disease activity, upadacitinib 30 mg once daily ranked first (RR=0.61, 95% CI 0.52 to 0.72, p-score 0.93) with adalimumab 40 mg weekly second, and infliximab 10 mg/kg 8-weekly third. Adalimumab 40 mg weekly ranked first in biologic-naïve patients (RR=0.59, 95% CI 0.48 to 0.73, p-score 0.86), and vedolizumab 108 mg 2-weekly first in biologic-exposed (RR=0.70, 95% CI 0.57 to 0.86, p-score 0.82).

Conclusion In a network meta-analysis, infliximab 5 mg/kg ranked first for induction of clinical remission in all patients with luminal CD, but risankizumab 600 mg was first in biologic-naïve and biologic-exposed patients. Upadacitinib 30 mg once daily ranked first for maintenance of remission.

  • crohn's disease
  • meta-analysis

Data availability statement

No data are available.

What is already known on this subject?

  • Crohn’s disease (CD) is a chronic inflammatory bowel disease leading to progressive intestinal damage and a high likelihood of requiring surgery within 10 years of first diagnosis.

  • Although immunomodulators are often used to reduce need for treatment with corticosteroids, evidence for their efficacy is limited.

  • In the last 20 years, multiple biological therapies and small molecules have been developed and licensed for luminal CD.

  • Although previous network meta-analyses have compared their efficacy and safety, trials of newer drugs have been published.

What are the new findings?

  • In terms of induction of clinical remission, although infliximab 5 mg/kg ranked first in all patients, risankizumab 600 mg ranked first in both biologic-naïve patients and biologic-exposed patients.

  • In terms of clinical response, infliximab 5 mg/kg ranked first in all patients, but risankizumab 1200 mg was first in both biologic-naïve and biologic-exposed patients.

  • In terms of maintenance of clinical remission, upadacitinib 30 mg once daily ranked first, with adalimumab 40 mg weekly ranked first in biologic-naïve patients and vedolizumab 108 mg 2-weekly ranked first in biologic-exposed patients.

  • None of the drugs studied were more likely to lead to adverse events, serious adverse events or infections than placebo in either induction of remission or maintenance of remission trials.

How might it impact on clinical practice in the foreseeable future?

  • These data can be used to facilitate treatment decisions for patients with moderate to severe luminal CD to induce remission, and for patients with luminal CD that has responded to therapy to reduce likelihood of relapse.

  • They can be used to update future evidence-based management guidelines and could inform health economic analyses to help guide cost-effective treatment selection.

  • Note that the trials of upadacitinib are yet to be published in full.

Introduction

Crohn’s disease (CD) is a chronic disorder involving any region of the intestine, although most commonly the ileocaecum, and causing transmural inflammation.1 Contemporaneous prevalence data demonstrate that CD affects almost 300 per 100 000 people in Europe.2 Because incidence of CD exceeds mortality,3 prevalence is likely to rise in the foreseeable future. Although some patients will have a mild disease course, for the majority, the condition follows a relapsing and remitting course, with progressive intestinal damage.4 Up to 50% of patients will require surgery within 10 years from first diagnosis.5 Unlike in ulcerative colitis (UC), where 5-aminosalicylates are the mainstay of treatment,6 there is little evidence for use of these drugs in luminal CD.7 Many patients, therefore, require immunomodulator drugs, such as azathioprine or methotrexate, to avoid the need for repeated courses of corticosteroids. However, evidence for efficacy of these drugs is not strong, with few randomised controlled trials (RCTs).8

Since the advent of infliximab, a drug targeting the proinflammatory cytokine tumour necrosis factor-α (TNF-α), which demonstrated efficacy in clinical trials for induction of remission of active luminal CD,9 prevention of relapse of quiescent luminal CD,10 and healing and prevention of recrudescence of fistulising CD,11 12 multiple novel drugs have been developed. These include other drugs targeting TNF-α, such as adalimumab or certolizumab, and drugs acting on integrins or other proinflammatory cytokines implicated in the pathogenesis of CD, such as vedolizumab, ustekinumab or risankizumab. More recently, small molecules, which can be administered orally, and on a daily basis, have been evaluated in inflammatory bowel disease.13 These include janus kinase inhibitors, such as tofacitinib, filgotinib and upadacitinib.

The relative efficacy and safety of some of these drugs have been assessed previously using network meta-analysis.14 15 These have, for the most part, demonstrated that anti-TNF-α drugs, such as infliximab and adalimumab, are the most efficacious, in all patients, and in those with previous anti-TNF-α exposure. However, even since the most recent of these network meta-analyses, there have been new trials published of risankizumab,16 an interleukin-23 p-19 inhibitor, and upadacitinib,17 a preferential janus kinase-1 inhibitor, in luminal CD. In addition, there are other studies that were not included in previous network meta-analyses. We, therefore, performed a contemporaneous network meta-analysis to evaluate the efficacy of all biological therapies and small molecules that have progressed on to phase III trials in luminal CD, compared with each other or placebo, in terms of induction of clinical remission, clinical response and maintenance of clinical remission, as well as safety.

Methods

Search strategy and selection criteria

We searched MEDLINE (1946 to 1 July 2022), Embase and Embase Classic (1947 to 1 July 2022), and the Cochrane Central Register of Controlled Trials. We also searched clinicaltrials.gov for recently completed trials or supplementary data for potentially eligible RCTs. In addition, we hand-searched conference proceedings (Digestive Diseases Week, American College of Gastroenterology, United European Gastroenterology Week and the Asian Pacific Digestive Week) between 2001 and 2022 to identify trials published only in abstract form. Finally, we performed a recursive search of the bibliographies of all eligible articles.

To be eligible, RCTs had to examine efficacy of biological therapies (anti-TNF-α antibodies (infliximab, adalimumab or certolizumab), anti-integrin antibodies (vedolizumab or etrolizumab), anti-interleukin-12/23 antibodies (ustekinumab) or anti-interleukin-23 antibodies (risankizumab)) or janus kinase inhibitors (tofacitinib, filgotinib or upadacitinib) at the doses taken through into phase III clinical trials. Studies needed to recruit adults (≥18 years) with luminal CD (online supplemental table 1) and compare biological therapies or small molecules with placebo, or each other. Trials conducted only in patients with perianal CD were ineligible. We required a minimum follow-up duration of 4 weeks for induction of remission trials in moderately to severely active luminal CD and 20 weeks for maintenance of remission in luminal CD. Maintenance of remission trials had to rerandomise patients at baseline; run-through trials of active drug or placebo from baseline that reported both induction of remission and maintenance of remission were ineligible.

Two investigators (BB and ACF) conducted independent literature searches. We identified studies on CD with the terms inflammatory bowel disease, or Crohn’s disease (both as Medical Subject Headings and free-text terms). We used the set operator AND to combine these with studies identified with the following terms: infliximab, remicade, adalimumab, humira, certolizumab, cimzia, vedolizumab, entyvio, etrolizumab, ustekinumab, stelara, risankizumab, tofacitinib, xeljanz, filgotinib or upadacitinib, applying a clinical trials filter. There were no language restrictions. Two investigators (BB and ACF) evaluated all abstracts identified independently. We obtained potentially relevant papers and evaluated them in more detail, using predesigned forms, to assess eligibility independently according to the predefined criteria. We translated foreign language papers, where required. We resolved disagreements between investigators by discussion.

Outcome assessment

In induction of remission trials, we assessed efficacy of biological therapies or small molecules, compared with placebo or each other, in terms of failure to achieve clinical remission (Crohn’s Disease Activity Index (CDAI) <150) or failure to achieve clinical response (fall in CDAI of ≥70), at the point the primary endpoint was assessed in each trial. In maintenance of remission trials, we assessed efficacy in terms of occurrence of relapse of disease activity (CDAI ≥150) at the last point of follow-up. Other outcomes assessed included adverse events (total numbers of adverse events, as well as serious adverse events, infections and adverse events leading to study withdrawal), if reported.
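The endpoint definitions above can be written down as a minimal sketch. The record type and function names are hypothetical and are not taken from any trial dataset; the thresholds are the ones stated in this section, and individual trials may have defined response differently.

```python
from dataclasses import dataclass

REMISSION_CDAI = 150  # clinical remission: CDAI < 150
RESPONSE_FALL = 70    # clinical response: fall in CDAI of >= 70 from baseline


@dataclass
class CdaiRecord:
    """Hypothetical per-patient record with CDAI at baseline and at the endpoint."""
    baseline: float
    endpoint: float


def failed_remission(rec: CdaiRecord) -> bool:
    """Induction trials: True if the patient did not reach CDAI < 150."""
    return rec.endpoint >= REMISSION_CDAI


def failed_response(rec: CdaiRecord, min_fall: float = RESPONSE_FALL) -> bool:
    """Induction trials: True if CDAI fell by less than the required amount."""
    return (rec.baseline - rec.endpoint) < min_fall


def relapsed(rec: CdaiRecord) -> bool:
    """Maintenance trials: True if CDAI is >= 150 at last follow-up."""
    return rec.endpoint >= REMISSION_CDAI
```

Framing every outcome as a failure (no remission, no response, relapse) means that a relative risk below 1 favours the active drug throughout.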

Data extraction

Two investigators (BB and ACF) extracted data from all eligible studies independently onto a Microsoft Excel spreadsheet (XP Professional Edition; Microsoft Corp, Redmond, Washington, USA) as dichotomous outcomes (clinical remission or no clinical remission, clinical response or no clinical response, and relapse of disease activity or no relapse of disease activity). We assessed efficacy according to the proportion of patients failing to achieve clinical remission or clinical response in induction of remission trials, and the proportion of patients having relapse of disease activity in maintenance of remission trials. We also extracted the following data for each trial, where available: country of origin, number of centres, disease distribution, proportion of patients naïve to biological therapy, dose and dosing schedule of active therapy and placebo and source of patients in maintenance of remission trials. We extracted all data as intention-to-treat analyses, assuming all dropouts to be treatment failures (ie, no remission or response to biological therapy, small molecule or placebo, or relapse of disease activity with biological therapy, small molecule or placebo), wherever trial reporting allowed. If this was not clear from the original article, we performed an analysis on all evaluable patients. When judging safety, we used the number of patients receiving at least one dose of the study drug, wherever possible. We compared results of the two investigators’ data extraction with all discrepancies resolved by discussion.
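As an illustration of the intention-to-treat convention described above, the sketch below counts every randomised patient who is not a documented responder as a treatment failure and then computes a relative risk of failure for a single two-arm trial. The function names and the example numbers are hypothetical, and the 95% CI uses the standard large-sample log-scale approximation, which is an assumption here rather than a method stated by the authors.

```python
import math
from typing import Tuple


def itt_failures(randomised: int, responders: int) -> int:
    """Failures under intention-to-treat: all randomised patients who are not
    documented responders, so dropouts count as failures."""
    if not 0 <= responders <= randomised:
        raise ValueError("responders must lie between 0 and randomised")
    return randomised - responders


def relative_risk(fail_trt: int, n_trt: int,
                  fail_ctl: int, n_ctl: int) -> Tuple[float, float, float]:
    """Relative risk of failure (active vs control) with an approximate 95% CI.

    This sketch does not handle zero cells; a real analysis would need a
    continuity correction or another method in that case.
    """
    rr = (fail_trt / n_trt) / (fail_ctl / n_ctl)
    se_log = math.sqrt(1 / fail_trt - 1 / n_trt + 1 / fail_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper


# Hypothetical trial: 100 patients per arm, 40 vs 20 achieving remission.
rr, lower, upper = relative_risk(itt_failures(100, 40), 100,
                                 itt_failures(100, 20), 100)
```

With these made-up counts the relative risk of failing to achieve remission is 0.75, that is, failure is 25% less likely with active drug than with placebo.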

Quality assessment and risk of bias

We used the Cochrane risk of bias tool to assess this at the study level.18 Two investigators (BB and ACF) performed this independently. We resolved disagreements by discussion. We recorded the method used to generate the randomisation schedule and conceal treatment allocation, as well as whether blinding was implemented for participants, personnel and outcome assessment, whether there was evidence of incomplete outcomes data and whether there was evidence of selective reporting of outcomes.

Data synthesis and statistical analysis

We performed a network meta-analysis using a frequentist model, with the statistical package ‘netmeta’ V.0.9–0 (https://cran.r-project.org/web/packages/netmeta/index.html) in R V.4.0.2. We explored direct and indirect treatment comparisons of efficacy and safety of each drug, with reporting according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension statement for network meta-analyses.19 Compared with standard pairwise analyses, network meta-analysis can give more precise estimates20 21 and allow ranking of drugs to inform clinical decisions.22

We produced a network plot with node size corresponding to the number of study subjects and connection size corresponding to number of studies to examine the symmetry and geometry of the evidence. We assessed for publication bias or other small study effects, for all available comparisons, via comparison adjusted funnel plots, using Stata V.16 (StataCorp). This is a scatterplot of effect size versus precision, measured via the inverse of the SE. Absence of publication bias, or small study effects, is indicated by symmetry around the effect estimate line.23 We judged efficacy of each comparison tested with a pooled relative risk (RR) with 95% CIs, using a random effects model as a conservative estimate. We used an RR of failure to achieve clinical remission or clinical response, and an RR of relapse of disease activity. We checked the correlation between direct and indirect pieces of evidence across the network via consistency modelling, where there were direct comparisons between some active drugs,24 generating network heat plots. These have grey squares representing the size of the contribution of the direct estimate of one study design in columns, compared with the network estimate in rows.25 The coloured squares around these represent the change in inconsistency between direct and indirect evidence in a network estimate in the row after relaxing the consistency assumption for the effect of one design in the column. Blue squares indicate that the direct evidence of the design in the column supports the indirect evidence in the row, whereas red squares are ‘hotspots’ of inconsistency between direct and indirect evidence.
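To make the idea of a random effects pooled RR concrete, the following sketch pools a single pairwise comparison using the DerSimonian-Laird estimator. This is deliberately simpler than the analysis described here: ‘netmeta’ fits the whole network at once, whereas this code handles only one drug-versus-placebo comparison, and the function name and inputs are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pool_random_effects(log_rr: Sequence[float],
                        se: Sequence[float]) -> Tuple[float, float, float, float]:
    """DerSimonian-Laird random effects pooling of per-trial log relative risks.

    Returns (pooled RR, lower 95% limit, upper 95% limit, tau squared).
    """
    w = [1 / s ** 2 for s in se]                      # fixed effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))  # Cochran's Q
    df = len(log_rr) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-trial variance
    w_re = [1 / (s ** 2 + tau2) for s in se]          # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled),
            tau2)
```

When the trials disagree, the estimated between-trial variance is added to each trial's own variance, which widens the pooled CI; this is why a random effects model is described as the conservative choice.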

Meta-analyses often use the I2 statistic to measure heterogeneity, which ranges between 0% and 100%.26 This statistic is easily interpretable and does not vary with the number of studies. However, its value can increase with the number of patients included in the meta-analysis.27 We, therefore, used the τ2 measure from the ‘netmeta’ statistical package to assess global statistical heterogeneity across all comparisons. We used estimates of τ2 of approximately 0.04, 0.16 and 0.36 to represent low, moderate, and high levels of heterogeneity, respectively.28
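For orientation, the usual pairwise definitions of these quantities are sketched below. They are the textbook forms for a single comparison of k trials with estimates y_i and standard errors s_i, not the network-wide estimator that ‘netmeta’ reports, so the symbols should be read as assumptions made for illustration.

```latex
% Inverse-variance weights and fixed effect estimate
w_i = \frac{1}{s_i^{2}}, \qquad
\hat{\theta}_F = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}

% Cochran's Q and the I^2 statistic
Q = \sum_{i=1}^{k} w_i \left( y_i - \hat{\theta}_F \right)^{2}, \qquad
I^{2} = \max\!\left( 0, \; \frac{Q - (k - 1)}{Q} \right) \times 100\%

% DerSimonian-Laird estimate of the between-trial variance
\hat{\tau}^{2} = \max\!\left( 0, \;
  \frac{Q - (k - 1)}{\sum_{i} w_i - \dfrac{\sum_{i} w_i^{2}}{\sum_{i} w_i}} \right)
```

The thresholds of 0.04, 0.16 and 0.36 for τ2 correspond to between-trial SDs (τ) of 0.2, 0.4 and 0.6 on the log RR scale.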

We used the p-score, which is a value between 0 and 1, for all biological therapies and small molecules, versus placebo or each other, to rank them. P-scores are based solely on point estimates and SEs from the network estimates and measure the mean extent of certainty that one intervention is better than another, averaged over all competing interventions.29 Higher scores indicate a greater probability of the intervention being ranked as best,29 but the magnitude of the p-score should be considered, as well as the rank. The mean value of the p-score is always 0.5. Therefore, if individual interventions cluster around this value, they are likely to be similarly efficacious. However, it is also important to take the RR and corresponding 95% CI for each comparison into account when interpreting the results, rather than relying on rankings alone.30 In our primary analyses, we pooled data for all patients, but we also performed a priori subgroup analyses for each efficacy endpoint according to whether or not patients had been exposed to biologics previously. For our analysis of achievement of clinical remission, we used the Confidence in Network Meta-Analysis (CINeMA) framework to evaluate confidence in the indirect and direct treatment estimates from the network,31 32 which is endorsed by the Cochrane Collaboration. This includes the Risk of Bias from Missing Evidence in Network Meta-Analysis tool for evaluation of reporting bias.33
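The ranking described above can be sketched as follows, assuming the published P-score definition: for each pair of treatments, compute the one-sided certainty that one is better than the other from the difference in network estimates and its SE, then average over all competitors. The function name and the input layout (a vector of log RRs versus a common reference and a symmetric matrix of SEs for every pairwise difference) are hypothetical, and the direction is set so that a smaller RR of failure is better.

```python
import math
from typing import List, Sequence


def _phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_scores(effects: Sequence[float],
             se_diff: Sequence[Sequence[float]]) -> List[float]:
    """P-score for each treatment when smaller effects (log RR of failure) are better.

    effects[i]    : network estimate for treatment i versus a common reference
    se_diff[i][j] : SE of the difference between treatments i and j (symmetric)
    """
    n = len(effects)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            # Certainty that treatment i is better (lower) than treatment j.
            total += _phi((effects[j] - effects[i]) / se_diff[i][j])
        scores.append(total / (n - 1))
    return scores


# Hypothetical three-node network: placebo, drug A (RR 0.7), drug B (RR 0.9).
example = p_scores([0.0, math.log(0.7), math.log(0.9)],
                   [[0.0, 0.1, 0.1], [0.1, 0.0, 0.1], [0.1, 0.1, 0.0]])
```

Because the certainty that i beats j and the certainty that j beats i sum to 1 for every pair, the scores always average 0.5, which is why drugs clustering near that value are hard to separate.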

Results

The search strategy generated 4095 citations. In total, 100 appeared relevant, and we retrieved these for further assessment. We excluded 67 that did not fulfil eligibility criteria, with reasons provided in online supplemental figure 1, leaving 33 separate eligible articles. Agreement between investigators for study eligibility was excellent (kappa statistic=0.85). Eighteen articles reported on 19 induction of remission RCTs only9 16 17 34–47 (NCT00291668); five articles reported six induction of remission trials together with rerandomisation of patients in these RCTs into five subsequent maintenance of remission trials48–52; and 10 articles reported maintenance of remission RCTs only.10 53–61 In total, therefore, there were 23 articles reporting on 25 separate induction of remission RCTs9 16 17 34–52 (NCT00291668) and 15 articles reporting on 15 separate maintenance of remission trials.10 48–61 All studies were funded by pharmaceutical companies.

Of the induction of remission RCTs, one was published in abstract form17; one was reported online46; and one was available on clinicaltrials.gov (NCT00291668). These 25 RCTs included 8720 patients, randomised to active drug or placebo (online supplemental table 2). Characteristics of individual studies are provided in online supplemental table 3, and risk of bias of all trials is provided in online supplemental table 4. Fifteen induction of remission RCTs, reported in 14 articles,34–36 38–40 42 43 45 47–50 52 were low risk of bias across all domains. The 15 maintenance of remission trials included 4016 patients, randomised to active drug or placebo (online supplemental table 5), one of which was reported online.61 Characteristics of individual studies are provided in online supplemental table 6. Risk of bias for maintenance of remission trials is reported in online supplemental table 7, with five at low risk of bias across all domains.48–50 53 55

Achievement of clinical remission

All 25 induction of remission trials reported data for this endpoint at between 4 weeks and 16 weeks9 16 17 34–52 (NCT00291668). The network plot is provided in figure 1. When data were pooled, there was low heterogeneity (τ2=0.0013), and the funnel plot appeared symmetrical (online supplemental figure 2). All drugs, other than infliximab 10 mg/kg, adalimumab 80/40 mg and certolizumab 400 mg, were superior to placebo. Infliximab 5 mg/kg ranked first for efficacy (RR of failure to achieve clinical remission=0.67, 95% CI 0.56 to 0.79, p-score 0.95) (figure 2A), meaning that the probability of infliximab 5 mg/kg being most efficacious was 95%. Risankizumab 600 mg (RR=0.73, 95% CI 0.66 to 0.80, p-score 0.85) and upadacitinib 45 mg o.d. (RR=0.75, 95% CI 0.68 to 0.83, p-score 0.77) ranked second and third, respectively. The network heat plot had no red hotspots of inconsistency (online supplemental figure 3). After direct and indirect comparisons, infliximab 5 mg/kg was superior to ustekinumab 6 mg/kg and 130 mg, infliximab 10 mg/kg, adalimumab 80/40 mg, vedolizumab 300 mg and certolizumab 400 mg (table 1). Risankizumab 600 mg was superior to ustekinumab 6 mg/kg and 130 mg, adalimumab 80/40 mg, vedolizumab 300 mg and certolizumab 400 mg; upadacitinib 45 mg o.d. was superior to adalimumab 80/40 mg, ustekinumab 130 mg, vedolizumab 300 mg, and certolizumab 400 mg; and risankizumab 1200 mg and adalimumab 160/80 mg were both superior to ustekinumab 130 mg, vedolizumab 300 mg and certolizumab 400 mg. One trial of infliximab used a single infusion of drug or placebo at week 0.9 Excluding this trial in a sensitivity analysis, we found that infliximab 5 mg/kg still ranked first (RR=0.61, 95% CI 0.49 to 0.76, p-score 0.98), followed by risankizumab 600 mg (p-score 0.83) and upadacitinib 45 mg o.d. (p-score 0.76) (online supplemental figure 4). 
Using the CINeMA framework to evaluate confidence in the results of this endpoint, all direct and indirect comparisons across the network were rated as either high or moderate confidence (online supplemental table 8).

Table 1

League table for failure to achieve clinical remission: all patients with moderate to severe luminal Crohn’s disease

Figure 1

Network plot for failure to achieve clinical remission: all patients with moderate to severe luminal CD. Note: circle (node) size is proportional to the number of study participants assigned to receive each intervention. The line width (connection size) corresponds to the number of studies comparing the individual interventions.

Figure 2

(A) Forest plot for failure to achieve clinical remission: all patients with moderate to severe luminal CD. (B) Forest plot for failure to achieve clinical remission: patients with moderate to severe luminal CD naïve to biological therapies. (C) Forest plot for failure to achieve clinical remission: patients with moderate to severe luminal CD exposed to biological therapies previously. Note: The p-score is the probability of each intervention being ranked as best in the network. CD, Crohn’s disease; RR, relative risk.

Seven trials reported clinical remission in a subset of patients naïve to biological therapies,16 39 43 44 48 50 51 and another seven trials only recruited patients naïve to these drugs.9 34 35 37 38 42 47 Therefore, in total, there were 14 separate RCTs recruiting 2911 patients. When data were pooled, there was low heterogeneity (τ2=0.0053). In patients naïve to biologics, all drugs, other than infliximab 10 mg/kg and certolizumab 400 mg, were superior to placebo. Risankizumab 600 mg ranked first for clinical remission (RR of failure to achieve clinical remission=0.66, 95% CI 0.52 to 0.85, p-score 0.78) (figure 2B), with infliximab 5 mg/kg performing similarly in second (RR=0.67, 95% CI 0.55 to 0.82, p-score 0.78), risankizumab 1200 mg third (RR=0.69, 95% CI 0.54 to 0.88, p-score 0.72) and adalimumab 160/80 mg fourth (RR=0.70, 95% CI 0.61 to 0.81, p-score 0.70). On direct and indirect comparison, risankizumab 600 mg, infliximab 5 mg/kg and adalimumab 160/80 mg were superior to certolizumab 400 mg, but there were no other significant differences (online supplemental table 9). After excluding the trial of infliximab that only used a single infusion of drug or placebo at week 0,9 infliximab 5 mg/kg ranked first (RR=0.61, 95% CI 0.48 to 0.78, p-score 0.86) and risankizumab 600 mg ranked second (p-score 0.74) (online supplemental figure 5).

Seven RCTs reported on clinical remission in a subset of patients exposed to biological therapies previously,16 39 43 44 48 50 51 and six trials recruited only patients with previous exposure to these drugs.16 17 36 45 49 52 There were 3785 patients included in these 13 trials, and low heterogeneity between them (τ2=0). In this analysis, all drugs other than adalimumab 160/160 mg, vedolizumab 300 mg and adalimumab 80/40 mg were superior to placebo, with risankizumab 600 mg ranked first (RR of failure to achieve clinical remission=0.74, 95% CI 0.67 to 0.82, p-score 0.92) (figure 2C). On direct and indirect comparisons, risankizumab 600 mg was superior to ustekinumab 6 mg/kg and 130 mg, vedolizumab 300 mg and adalimumab 80/40 mg; upadacitinib 45 mg and risankizumab 1200 mg were superior to ustekinumab 130 mg, vedolizumab 300 mg and adalimumab 80/40 mg; and adalimumab 160/160 mg and ustekinumab 6 mg/kg were superior to vedolizumab 300 mg (online supplemental table 10).

Achievement of clinical response

Clinical response was reported by 24 induction of remission trials at 6–12 weeks (online supplemental figure 6)9 16 17 34–45 47–52 (NCT00291668). There was low heterogeneity between studies (τ2=0.0109), and the funnel plot appeared symmetrical (online supplemental figure 7). All drugs, other than infliximab 10 mg/kg and certolizumab 400 mg, were superior to placebo, but infliximab 5 mg/kg ranked first (RR of no clinical response=0.54, 95% CI 0.41 to 0.70, p-score 0.91), followed by risankizumab 1200 mg (RR=0.57, 95% CI 0.47 to 0.69, p-score 0.87) and adalimumab 160/160 mg (RR=0.59, 95% CI 0.41 to 0.87, p-score 0.76) (figure 3A). The network heat plot had no red hotspots of inconsistency (online supplemental figure 8). Infliximab 5 mg/kg and risankizumab 1200 mg were superior to ustekinumab 130 mg, vedolizumab 300 mg and certolizumab 400 mg (table 2). Risankizumab 600 mg and adalimumab 160/80 mg were superior to vedolizumab 300 mg and certolizumab 400 mg, and ustekinumab 6 mg/kg to certolizumab 400 mg. All but four of these trials used a decrease in CDAI score of ≥100 to define clinical response.9 38–40 Excluding these four studies in a sensitivity analysis, we found that infliximab 5 mg/kg remained first (RR=0.51, 95% CI 0.37 to 0.69, p-score 0.95), followed by risankizumab 1200 mg (RR=0.56, 95% CI 0.47 to 0.67, p-score 0.90) and risankizumab 600 mg (RR=0.63, 95% CI 0.54 to 0.74, p-score 0.76) (online supplemental figure 9).

Figure 3

(A) Forest plot for failure to achieve clinical response: all patients with moderate to severe luminal CD. (B) Forest plot for failure to achieve clinical response: patients with moderate to severe luminal CD naïve to biological therapies. (C) Forest plot for failure to achieve clinical response: patients with moderate to severe luminal CD exposed to biological therapies previously. Note: The p-score is the probability of each intervention being ranked as best in the network. CD, Crohn’s disease; RR, relative risk.

Table 2

League table for failure to achieve clinical response: all patients with moderate to severe luminal Crohn’s disease

Eight trials reported on clinical response in a subset of patients naïve to biologics,16 41 43 44 48 50–52 and seven trials recruited only patients naïve to these drugs.9 34 35 37 38 42 47 Therefore, data from 15 separate RCTs, recruiting 3392 patients, were pooled. There was low heterogeneity between studies (τ2=0.0028), and overall risankizumab 1200 mg ranked first (RR=0.51, 95% CI 0.37 to 0.71, p-score 0.88), followed by infliximab 5 mg/kg (RR=0.53, 95% CI 0.42 to 0.67, p-score 0.85) and adalimumab 160/80 mg (RR=0.57, 95% CI 0.48 to 0.69, p-score 0.76) (figure 3B). Risankizumab 1200 mg and infliximab 5 mg/kg were superior to infliximab 10 mg/kg, vedolizumab 300 mg and certolizumab 400 mg; adalimumab 160/80 mg and ustekinumab 6 mg/kg were superior to vedolizumab 300 mg and certolizumab 400 mg; and ustekinumab 130 mg was superior to certolizumab 400 mg (online supplemental table 11).

Eight RCTs reported on clinical response in a subset of patients exposed to biological therapy previously,16 41 43 44 48 50–52 and six trials recruited only patients previously exposed to these drugs.16 17 36 45 49 52 There were 4077 patients randomised in these 14 RCTs. Overall, there was low heterogeneity between studies (τ2=0.0056). All drugs, other than adalimumab 80/40 mg, vedolizumab 300 mg and certolizumab 400 mg, were superior to placebo. Risankizumab 1200 mg was again ranked first (RR=0.58, 95% CI 0.48 to 0.69, p-score 0.93), with risankizumab 600 mg ranked second (RR=0.63, 95% CI 0.54 to 0.74, p-score 0.83) and upadacitinib 45 mg o.d. ranked third (RR=0.68, 95% CI 0.56 to 0.84, p-score 0.70) (figure 3C). The league ranking is provided in online supplemental table 12. Risankizumab 1200 mg was superior to all drugs, other than risankizumab 600 mg, upadacitinib 45 mg o.d. and adalimumab 80/40 mg. Risankizumab 600 mg and upadacitinib 45 mg o.d. were superior to vedolizumab 300 mg and certolizumab 400 mg.

Maintenance of clinical remission

The 15 maintenance of remission trials reported data between 22 weeks and 60 weeks.10 48–61 The network plot is provided in online supplemental figure 10. When data were pooled, there was low heterogeneity (τ2=0), and the funnel plot appeared symmetrical (online supplemental figure 11), although there were several small studies around the line of no effect. Running the pairwise data confirmed there was no funnel plot asymmetry (Egger test, p=0.85). All drugs, other than infliximab 120–240 mg 2-weekly, risankizumab 360 mg 8-weekly, and ustekinumab 90 mg 12-weekly, were superior to placebo. Upadacitinib 30 mg o.d. ranked first for efficacy (RR of relapse of disease activity=0.61, 95% CI 0.52 to 0.72, p-score 0.93) (figure 4A), with adalimumab 40 mg weekly (RR=0.66, 95% CI 0.57 to 0.76, p-score 0.84) and infliximab 10 mg/kg 8-weekly (RR=0.69, 95% CI 0.59 to 0.80, p-score 0.74) second and third, respectively. The network heat plot had no red hotspots of inconsistency (online supplemental figure 12). After direct and indirect comparison, upadacitinib 30 mg o.d. was superior to all drugs other than adalimumab 40 mg weekly, infliximab 10 mg/kg 8-weekly, adalimumab 40 mg 2-weekly, infliximab 120–240 mg 2-weekly, certolizumab 400 mg 4-weekly, and risankizumab 180 mg 8-weekly (table 3). Adalimumab 40 mg weekly was superior to vedolizumab 300 mg 4-weekly and infliximab 5 mg/kg 8-weekly. Two of these trials recruited patients irrespective of response to open-label treatment,54 57 so we excluded these in a sensitivity analysis. In this analysis, the top three ranked drugs were unchanged (online supplemental figure 13).

Figure 4

(A) Forest plot for failure to maintain clinical remission: all rerandomised patients with luminal CD. (B) Forest plot for failure to maintain clinical remission: rerandomised patients with luminal CD naïve to biological therapies. (C) Forest plot for failure to maintain clinical remission: rerandomised patients with luminal CD exposed to biological therapies previously. Note: the p-score is the probability of each intervention being ranked as best in the network. CD, Crohn’s disease; RR, relative risk.

Table 3

League table for failure to maintain clinical remission: all rerandomised patients with luminal Crohn’s disease

Six trials reported on maintenance of clinical remission in a subset of patients naïve to biologics,48 50 52 55 59 60 and another four trials only recruited patients naïve to these drugs.10 53 54 56 Therefore, in total, there were 10 separate RCTs recruiting 1523 patients. There was low heterogeneity between studies (τ2=0), with adalimumab 40 mg weekly ranked first (RR=0.59, 95% CI 0.48 to 0.73, p-score 0.86), adalimumab 40 mg 2-weekly ranked second (RR=0.66, 95% CI 0.55 to 0.80, p-score 0.70), and ustekinumab 90 mg 8-weekly ranked third (RR=0.67, 95% CI 0.47 to 0.97, p-score 0.65) (figure 4B). Infliximab 10 mg/kg and 5 mg/kg 8-weekly and vedolizumab 300 mg 8-weekly were also superior to placebo. After direct and indirect comparison, adalimumab 40 mg weekly was superior to infliximab 5 mg/kg 8-weekly and vedolizumab 108 mg 2-weekly, but there were no other significant differences (online supplemental table 13).

Finally, six RCTs reported on maintenance of clinical remission in a subset of patients exposed to biological therapy previously,48 50 52 55 59 60 and one trial recruited only patients previously exposed to these drugs.49 There were 1382 patients randomised in these seven RCTs, with low heterogeneity between studies (τ²=0). All drugs, other than risankizumab 360 mg 8-weekly and ustekinumab 90 mg 12-weekly, were superior to placebo, but vedolizumab 108 mg 2-weekly subcutaneously ranked first (RR=0.70, 95% CI 0.57 to 0.86, p-score 0.82), with adalimumab 40 mg weekly ranked second (RR=0.73, 95% CI 0.61 to 0.88, p-score 0.73) and adalimumab 40 mg 2-weekly ranked third (RR=0.77, 95% CI 0.66 to 0.90, p-score 0.61) (figure 4C). The league table is provided in online supplemental table 14. There were no significant differences between any active drugs.

Adverse events

Complete adverse events data for both induction and maintenance of remission trials are provided in the online supplemental materials.

Discussion

We report a contemporaneous systematic review and network meta-analysis of biological therapies and small molecules in luminal CD. We included data from more than 8700 patients in 25 induction of remission trials. Our analysis suggested that infliximab 5 mg/kg was the most efficacious drug when data from all patients were pooled. All comparisons across this network were rated as either high or moderate confidence. However, when we analysed the data for biologic-naïve or exposed patients separately, risankizumab 600 mg ranked first for both groups, suggesting that the ranking of infliximab 5 mg/kg was driven by its use in biologic-naïve patients in all trials in which it was studied. In all trials, upadacitinib 45 mg o.d. performed similarly to risankizumab 600 mg and was ranked third. In biologic-naïve patients, infliximab 5 mg/kg ranked second and risankizumab 1200 mg third. In biologic-exposed patients, upadacitinib 45 mg o.d. ranked second followed by risankizumab 1200 mg. In terms of clinical response, infliximab 5 mg/kg ranked first when all patients were considered, but risankizumab 1200 mg was first in both biologic-naïve and biologic-exposed patients. Analysing data from 15 maintenance of remission trials, recruiting over 4000 patients, upadacitinib 30 mg o.d. ranked first for efficacy, followed by adalimumab 40 mg weekly and infliximab 10 mg/kg 8-weekly. When data were pooled according to previous biologic exposure, adalimumab 40 mg weekly ranked first in biologic-naïve patients, and vedolizumab 108 mg 2-weekly ranked first in biologic-exposed patients. Finally, there was no significant increase in total numbers of adverse events, serious adverse events or infections with any drug over placebo, although withdrawals due to adverse events were significantly more likely with infliximab 10 mg/kg 8-weekly in maintenance of remission trials.

Limitations of this network meta-analysis include the fact that only 15 of 25 induction of remission trials and 5 of 15 maintenance of remission RCTs were at low risk of bias across all domains. We identified no phase III trials of etrolizumab, tofacitinib or filgotinib in luminal CD, although the BERGAMOT trial of etrolizumab has now completed. However, the results of this study are reported as having led to the pharmaceutical company ceasing future development of the drug,62 so incorporating the results from this trial is unlikely to have changed the ranking of the top therapies. The three trials of upadacitinib have yet to be published in full, and data for efficacy according to previous biologic exposure were unavailable for one of the induction of remission trials and for the maintenance of remission RCT, and there were no safety data available for the maintenance of remission trial. It may be, therefore, that when these data become available in full, the ranking of upadacitinib will change in both biologic-naïve and exposed patients and safety signals may emerge. Although trials of newer drugs will have included patients with more refractory luminal CD who had failed multiple biologics, many of these recent trials also restricted their recruitment to biologic-naïve patients entirely or recruited a subset of biologic-naïve patients. This allowed subgroup analysis according to previous biologic exposure, although in the latter group of trials, comparisons between subsets of biologic-naïve or biologic-exposed patients according to treatment allocation may not be protected by randomisation. Endpoints were identical between all trials for both induction and maintenance of remission but differed slightly for clinical response. However, we performed a sensitivity analysis including only trials using a fall in CDAI of ≥100 to define this.
Another issue is that there was a difference in duration of treatment between induction of remission trials, as well as the timepoints at which endpoints were assessed. It may not be fair to compare the efficacy of adalimumab at 4 weeks, which is when most trials reported data, with drugs like ustekinumab, risankizumab and upadacitinib, where efficacy was assessed at 12 weeks of treatment. This is probably less of an issue in maintenance of remission trials, where most studies reported data at between 50 and 60 weeks of follow-up. Finally, efficacy of these drugs, in terms of endoscopic response and remission, which may be associated with improved prognosis and reductions in disability,63 64 cannot be judged as few trials reported rates of endoscopic improvement or healing. Despite these limitations, the results of our study may still be useful to inform treatment decisions for patients with luminal CD and can be used to update national and international evidence-based management guidelines.

A network meta-analysis by Singh et al, published in 2018,15 reported that infliximab and adalimumab were the most efficacious drugs for induction of remission in patients naïve to biological therapies and for maintenance of remission after response to therapy. In this analysis, adalimumab and ustekinumab ranked highest for induction of remission in patients with previous anti-TNF-α exposure. In a recent update of this meta-analysis,14 the ranking favoured infliximab combined with azathioprine, infliximab monotherapy and adalimumab for induction of remission in 15 trials in biologic-naïve patients, and adalimumab and risankizumab in 10 trials in biologic-exposed patients. In 15 maintenance of remission trials, irrespective of previous biologic exposure and including treat-through studies without rerandomisation of patients, infliximab combined with azathioprine, and adalimumab, were the highest ranked. In contrast to these previous network meta-analyses, we identified studies from the ‘grey’ literature46 61 (NCT00291668), incorporated the more recent trials of newer drugs, conducted maintenance of remission analyses according to previous biologic exposure and only in trials rerandomising patients, and performed all our analyses according to dose, and dosing schedule, of each of the drugs of interest, rather than pooling individual drugs together irrespective of these issues. Our analyses, therefore, allow the selection of the optimal dose and treatment interval, as well as providing evidence that some novel drugs, which are likely to come to market soon, are potentially more efficacious than existing licensed therapies for both biologic-naïve and exposed individuals with luminal CD.

One of the core assumptions in any network meta-analysis relates to transitivity, where indirect comparisons between treatments assume that any patient included in the network could, theoretically, have been recruited to any of the trials and assigned to any of the treatments. Confounding due to underlying differences between RCTs, including previous failed therapies, disease duration or concomitant medication use over the 25-year range these trials were conducted, is possible. We had identified these issues a priori. Hence, our analyses including only biologic-naïve or biologic-exposed patients should address, to some extent, the concern that more refractory patients have been included in more recent trials. In fact, several of these studies restricted their recruitment to biologic-naïve patients. Disease duration was between 7 and 13 years in most studies. In addition, immunosuppressant use was similar between active drug and placebo arms across all induction of remission trials, although this was less well balanced in maintenance of remission trials. It is also important to consider that there is a lack of real-world clinical experience for the use of newer drugs or routes of administration, such as risankizumab or subcutaneous vedolizumab. Although these were ranked highly in the network, this was based on indirect evidence, and head-to-head trials versus other biologics should be considered. Finally, conclusions relating to infliximab 10 mg/kg for induction of remission were based on a single small study.9 This may, therefore, be underpowered to detect significant differences, although there are also data to support this approach from clinical practice.65

These results confirm that all available drugs, other than adalimumab 80/40 mg and certolizumab 400 mg, were more efficacious than placebo for induction of remission of moderate to severe luminal CD between 4 weeks and 16 weeks, and all drugs other than infliximab 10 mg/kg and certolizumab 400 mg were superior to placebo in terms of clinical response. For maintenance of remission, all drugs other than infliximab 120–240 mg 2-weekly, risankizumab 360 mg 8-weekly and ustekinumab 90 mg 12-weekly were superior to placebo. All drugs were safe and well tolerated. Nevertheless, blanket application of the findings of this meta-analysis should be avoided. Selection of treatment should be informed by these results together with patient choice, which may be influenced by other considerations, including route of administration and convenience, as well as likelihood of adherence, and costs to the health service. In some healthcare systems, the substantial reduction in costs seen with the advent of biosimilars will over-ride the possible superior efficacy of some of these newer, but more expensive, drugs. Where anti-TNF-α drugs are used preferentially, our results suggest that higher doses of adalimumab are more likely to induce remission successfully and that weekly scheduling of adalimumab or a 10 mg/kg dose of infliximab are more likely to maintain remission. However, these are suggestions based on our results and do not consider the results of proactive therapeutic drug monitoring, which are likely to guide decision-making in clinical practice.

This systematic review and network meta-analysis synthesises evidence from a large number of patients included in multiple induction and maintenance of remission trials, with similar patient numbers assigned to each of the active drugs of interest. Confidence in the results of all direct and indirect comparisons across the network for achievement of clinical remission was rated as either high or moderate. Although the results can be used to help select treatments for luminal CD, more head-to-head RCTs will better inform future network meta-analyses and, ultimately, clinical practice.

Data availability statement

No data are available.

Footnotes

  • CJB and ACF are joint senior authors.

  • Twitter @bribarberio, @DrCJBlack

  • Contributors ACF is guarantor, accepts full responsibility for the work and the conduct of the study, had access to the data and controlled the decision to publish. The corresponding author attests that all listed authors meet the authorship criteria and that no others meeting the criteria have been omitted. Study concept and design: BB, DJG, CB and ACF conceived and drafted the study. BB, CB and ACF analysed and interpreted the data. BB and ACF drafted the manuscript. All authors approved the final draft of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.