Multi-centre derivation and validation of a colitis-associated cancer risk prediction web-tool

Background and Aims: Ulcerative colitis (UC) patients diagnosed with low-grade dysplasia (LGD) have an increased risk of developing advanced neoplasia (AN; high-grade dysplasia or colorectal cancer). We aimed to develop and validate a predictor of AN risk in UC patients with LGD and create a visual web-tool to effectively communicate the risk.

Methods: In our retrospective multi-centre validated cohort study, adult UC patients with an index diagnosis of LGD, identified from four UK centres between 2001 and 2019, were followed until progression to AN. In the discovery cohort (n=248), a multivariate risk prediction model was derived from clinicopathological features using Cox regression. Validation used data from 3 external centres (n=201). The validated model was embedded in a web-based tool to calculate and illustrate patient-specific risk.

Results: Four endoscopic variables were significantly associated with future AN progression in the discovery cohort: endoscopically visible LGD > 1 cm (HR = 2.8; 95% CI 1.3-6.0), incomplete endoscopic resection (HR = 2.9; 95% CI 1.3-6.5), moderate/severe histological inflammation within 5 years of LGD diagnosis (HR = 3.0; 95% CI 1.3-6.7), and multifocality (HR = 2.8; 95% CI 1.3-6.1). In the validation cohort, this 4-variable model accurately predicted future AN cases with overall calibration Observed/Expected = 1 (95% CI 0.63-1.5), and achieved perfect specificity for the lowest predicted risk group over 13 years of follow-up.

Conclusion: Multi-cohort validation confirms that patients with large, unresected, and multifocal LGD and recent moderate/severe inflammation are at the highest risk of developing AN. Personalised risk prediction provided via the Ulcerative Colitis-Cancer Risk Estimator web-tool (www.UC-CaRE.uk) can be used to support treatment decision-making.


Introduction
Patients with Ulcerative Colitis (UC) have an increased lifetime risk of developing colorectal cancer (CRC) and of CRC-related death [1][2][3]. Consequently, UC patients are advised to engage in a colonoscopic surveillance programme 8-10 years after diagnosis [4][5][6][7]. In the last decade, advances in colonoscopic surveillance imaging with the use of high definition and chromoendoscopy have increased detection rates of pre-cancerous dysplastic lesions that may have the potential to progress to adenocarcinomas [8]. While high-grade dysplasia (HGD) warrants preventive surgery (or endoscopic resection with intensive surveillance) due to imminent CRC risk [4][5][6][7], the natural history of low-grade dysplasia (LGD) progression is less well defined. Consequently, managing CRC risk in patients with LGD is extremely challenging. Reported rates of progression of LGD lesions to advanced neoplasia (HGD or CRC) have varied widely, being as low as 0% despite a median follow-up duration of 17.8 years [9], or as high as 53% after 15 months median follow-up [10]. In a meta-analysis of cohort studies of UC patients with LGD, the pooled incidences of CRC and advanced neoplasia (AN) were 0.8 per 100 patient-years (95% confidence interval [CI] 0.4-1.3) and 1.8 per 100 patient-years (95% CI 0.9-2.7) respectively [11]. In a Dutch population-based cohort study the cumulative incidence of subsequent AN was found to be 3.6, 8.5, 14.4 and 21.7%, after 1, 5, 10 and 15 years respectively [12].
A number of clinicopathological variables are reported to be associated with AN progression after LGD diagnosis: patient-specific characteristics such as age ≥ 55 years, male sex, follow-up at an academic (vs non-academic) medical centre and concomitant Primary Sclerosing Cholangitis (PSC), and endoscopic characteristics of the index LGD such as non-polypoid morphology, invisibility (i.e. detected on random mucosal biopsy with no associated visible lesions), size greater than 10 mm, multifocality, presence of a stricture, and distal location [8][11][12]. However, these associations were based on historical data pre-dating the year 2000, which do not necessarily reflect the endoscopic advances adopted into practice in the last two decades, including high definition chromoendoscopy and endoscopic resection techniques such as endoscopic submucosal dissection. These advances have been linked with lower rates of invisible dysplasia detection and lower AN progression rates [13][14][15].
Patients are reluctant to consider surgical management even when the risks of CRC are high due to concerns about the negative impact that complications, stoma or ileoanal pouch function may have on their quality of life, given that they are often in clinical remission at the time of dysplasia detection [16][17][18][19]. Shared clinician-patient decision-making is particularly important when the evidence and best management option are unclear and there are potentially harmful consequences associated with the choice that is eventually made. This is unfortunately the case for management of LGD in UC: the risks and consequences of developing CRC despite surveillance must be balanced against having a life-changing surgical operation that may not be warranted. Communication of uncertainty or ambiguity in individualised CRC risk estimates can lead to increased cancer-related worry and "ambiguity aversion", i.e. patients avoiding decision-making [20]. Providing evidence-based and individualised numerical CRC risk estimates has been reported by patients to facilitate shared decision-making [19]. Visual decision aids that allow patients to view their individualised CRC risk in a graphical or pictorial form also promote patient engagement with decision-making [20].
Here, we aimed to identify the factors that can predict AN progression in UC patients diagnosed with LGD in the 21st century, and to create a simple, visual online multivariate risk prediction model to communicate the patient-specific risk. We created the Ulcerative Colitis-Cancer Risk Estimator (UC-CaRE) web-based application that is publicly accessible at www.uc-care.uk and can be used by clinicians to aid dysplasia/CRC risk communication, patient education and shared decision-making.

Study design and patient cohort identification
A retrospective cohort study of Ulcerative Colitis (UC) patients diagnosed with an index case of low-grade dysplasia (LGD) at four tertiary Inflammatory Bowel Diseases (IBD) centres in the UK was undertaken. The four centres were St Mark's Hospital (London North West University Healthcare NHS Trust), Royal London Hospital (Barts Health NHS Trust), the John Radcliffe Hospital (Oxford University Hospitals NHS Trust) and University College London Hospital NHS Trust. Hospital pathology databases were searched using the following terms to identify patients with UC who had been diagnosed with LGD: 'ulcerative colitis' or 'inflammatory bowel disease' and 'dysplasia', 'low-grade dysplasia', 'adenocarcinoma' or 'dysplasia associated mass lesion (DALM)'. The searched time periods were marginally different between each site and are detailed in Figure 1.

Inclusion criteria:
- The index LGD was:
  - confirmed by a second gastrointestinal histopathologist;
  - located within the known histological extent of colitis based on historical pathology reports;
- The patient had at least one follow-up examination of the whole colon after the index LGD diagnosis, either by colonoscopy or pathological analysis of a surgical colectomy specimen.

Exclusion criteria:
- The patient had a diagnosis of Crohn's disease, IBD-unclassified or indeterminate colitis;
- The index LGD was:
  - located proximal to the known extent of historical microscopic inflammation, as these lesions were classed as sporadic adenomas;
  - diagnosed after a panproctocolectomy, i.e. was first noted incidentally within the surgical colonic specimen;
  - diagnosed at the same time as or after another more advanced neoplastic lesion (either high-grade dysplasia or adenocarcinoma);
  - diagnosed at an institution external to the study centres, and the exact date of onset was unclear;
- There was no adequate follow-up examination of the colon after the index LGD diagnosis.

Data collection
The clinical notes, endoscopy and histology reporting systems at each centre were interrogated to collect data for the following variables: patient age and duration of UC at time of index LGD diagnosis; patient gender; concomitant Primary Sclerosing Cholangitis (which had been radiologically or histologically confirmed); patient exposure to 5-aminosalicylate, immunomodulator (thiopurines and methotrexate) and biological medications (anti-tumour necrosis factor, anti-interleukin and anti-integrin agents); macroscopic morphology of the index LGD as per the Paris classification 21 (polypoid, non-polypoid or invisible); size and location of the largest visible index LGD; multifocality; completion of any endoscopic resection undertaken; presence of any histological active inflammation in the colon at the time of or within the previous 5 years of the index LGD; any chronic features of inflammation (colonic stricture, post-inflammatory polyps, scarred colon or a tubular and shortened colon); a previous diagnosis of indefinite for dysplasia; and use of chromoendoscopy during any surveillance colonoscopy performed before, at time of or after the index LGD diagnosis.
Dysplasia was categorised as invisible if it was detected on random mucosal biopsy with absence of a corresponding visible lesion. If the lesion was found to be visible on targeted colonoscopy re-examination within 3 months, the lesion categorisation was changed from invisible to visible polypoid or non-polypoid. When multiple LGD lesions were found, categorisation of the morphology was based on the lesion considered to have more carcinogenic potential (from an earlier St Mark's cohort [8]) in descending order of non-polypoid, invisible and polypoid morphology. The index LGD was categorised as multifocal if more than one discrete visible LGD lesion was detected on index colonoscopy, regardless of the colonic segment, or if foci of invisible LGD were detected in more than one colonic segment. Completeness of endoscopic resection was, where possible, based on histological confirmation, but often this could not be confirmed due to piecemeal resection or diathermy artefact, in which case completion of resection was based on endoscopic criteria. Invisible LGD was also categorised as 'incomplete resection'. If there was multifocal LGD whereby a visible lesion was successfully endoscopically resected but there was another focus of invisible LGD, this was categorised as 'incomplete resection'. A lesion was considered a post-inflammatory polyp (PIP) if only inflammatory and/or granulation tissue and no neoplastic tissue was detected histologically within the lesion. Patients were recorded as having multiple PIPs if the endoscopist reported there being 'a few', 'several', 'many' or 'multiple' PIPs within the colon.
The cases that had missing data from one or more of the variables required for the multivariate analysis were excluded from the final analysis.

Follow-up outcomes
End of surveillance follow-up was determined by the date of the first incidence of advanced neoplasia (either HGD or CRC) or censoring at the last surveillance colonoscopy or surgical colectomy date.

Statistical analysis of patient cohorts
The St Mark's cohort of patients was used as the discovery set, and the patient cohorts from the three other centres were pooled together to form a validation set. Differences between the patient cohorts' clinical characteristics were assessed using Chi-squared tests for categorical variables and Mann-Whitney U tests for non-parametric continuous variables (significance required p < 0.002 after Bonferroni multiple testing correction). Data analysis was performed using SPSS (IBM SPSS Statistics for Macintosh, Version 25.0. Armonk, NY). Incidence rates of advanced neoplasia with 95% confidence interval [CI] were determined using OpenEpi software [22].

Statistical model selection and validation
In the discovery set, 16 clinical variables (Table 2) were tested for association with AN risk using univariate Cox proportional hazard (PH) models (significance required p < 0.003 after Bonferroni multiple testing correction). Significantly associated variables were included in a multivariate Cox PH model, and individual patient risk scores were computed. Kaplan-Meier (KM) estimation and log-rank tests were used to compare survival between dichotomised risk groups in discovery and validation sets. Positive and negative predictive values (PPV/NPV respectively) were assessed from KM curves to evaluate predictive power in the validation set. Survival analysis was carried out using the survival and survminer packages for R version 3.6.1. Estimation of cumulative incidence functions from the competing risk scenario of AN progression and colectomy during follow-up was performed using R package cmprsk.
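The univariate significance threshold quoted above is consistent with a Bonferroni correction across the 16 candidate variables. The short sketch below reproduces that arithmetic; the family-wise error rate of 0.05 is an assumption, since the text states only the corrected threshold, and the function name is illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# alpha = 0.05 is an ASSUMED family-wise error rate (not stated in the text);
# 16 is the number of clinical variables tested in the discovery set.
threshold = bonferroni_threshold(alpha=0.05, n_tests=16)
print(f"Per-variable threshold: {threshold:.4f}")
```

Rounded to three decimal places, this value matches the p < 0.003 criterion applied to the univariate Cox models.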

UC-CaRE risk prediction model development using discovery data
The UC-CaRE web-tool was created to make patient-specific AN risk predictions. The multivariate model above was embedded in a web-tool that takes patient-specific features as user input, produces a cumulative AN risk curve into future years of follow-up, and displays a Paling chart illustrating the patient's individual risk (see Supplementary material for prognostic risk function derivation).
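To illustrate how a Cox model yields a patient-specific risk, the sketch below uses the standard proportional hazards relationship Risk(t) = 1 - exp(-H0(t) * exp(sum of beta_i * x_i)), with beta_i = ln(HR_i). The hazard ratios are the adjusted values reported for the four-variable model; the baseline cumulative hazard H0(t) is a hypothetical placeholder, and the exact prognostic function used by UC-CaRE is derived in the Supplementary material and may differ in detail. All function and variable names are illustrative.

```python
import math

# Adjusted hazard ratios reported for the four-variable multivariate model.
HAZARD_RATIOS = {
    "lgd_size_1cm_or_more": 2.8,
    "incomplete_resection": 2.9,
    "multifocal": 2.8,
    "moderate_severe_inflammation": 3.0,
}


def predicted_risk(baseline_cum_hazard: float, factors: dict) -> float:
    """Cox-model probability of AN by time t, given H0(t) and risk factors.

    factors maps a risk factor name to True (present) or False (absent);
    factors that are omitted are treated as absent.
    """
    unknown = set(factors) - set(HAZARD_RATIOS)
    if unknown:
        raise ValueError(f"unknown risk factors: {sorted(unknown)}")
    linear_predictor = sum(
        math.log(HAZARD_RATIOS[name]) for name, present in factors.items() if present
    )
    return 1.0 - math.exp(-baseline_cum_hazard * math.exp(linear_predictor))


# HYPOTHETICAL baseline cumulative hazard at 5 years, for illustration only;
# the fitted baseline hazard is not given in the main text.
H0_5Y = 0.02

no_factors = predicted_risk(H0_5Y, {})
all_factors = predicted_risk(H0_5Y, {name: True for name in HAZARD_RATIOS})
print(f"0 risk factors: {no_factors:.3f}; 4 risk factors: {all_factors:.3f}")
```

Evaluating the function over a grid of follow-up times, each with its own baseline cumulative hazard, would trace out a cumulative risk curve of the kind the tool displays.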

Evaluation of UC-CaRE risk predictions in validation data
We evaluated the risk predictions produced by the UC-CaRE tool in the independent validation dataset by computing the observed versus expected cumulative number of progressors to AN at 13 years post-baseline index LGD. The methods for assessing predictive model calibration are based on cumulative AN progression-specific hazards and may be used to assess calibration both overall and in risk score subgroups of the validation data (see Supplementary material for detailed model calibration methods).
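As a rough illustration of an observed versus expected (O/E) comparison, the sketch below divides an observed event count by an expected count and attaches an approximate 95% confidence interval using Byar's approximation to the Poisson interval. This is a generic technique chosen for illustration and is not necessarily the calibration method described in the Supplementary material; the counts used are hypothetical.

```python
import math


def observed_expected_ratio(observed: int, expected: float, z: float = 1.96):
    """Return (O/E, lower, upper) with Byar's approximate Poisson interval."""
    if observed < 1:
        raise ValueError("observed must be at least 1 for this approximation")
    if expected <= 0:
        raise ValueError("expected must be positive")
    lower_count = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3
    shifted = observed + 1
    upper_count = shifted * (
        1 - 1 / (9 * shifted) + z / (3 * math.sqrt(shifted))
    ) ** 3
    return observed / expected, lower_count / expected, upper_count / expected


# HYPOTHETICAL counts: in practice the expected count would be the sum of the
# model-predicted cumulative hazards (or risks) over the validation patients.
ratio, lower, upper = observed_expected_ratio(observed=20, expected=20.0)
print(f"O/E = {ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A ratio near 1 with an interval spanning 1 indicates that the model neither systematically over-predicts nor under-predicts the number of progressors.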

Ethical Considerations
The study was approved by the UK Research Ethics Committee (REC reference: 17/EM/0289; IRAS project ID: 227613).

Study patient clinical characteristics
A total of 460 patients were followed for 2,200 patient-years. There were 249 patients from St Mark's Hospital (discovery cohort) and 211 patients in the multi-centre validation cohort (Figure 1), and detailed clinical characteristics were collected on each patient (Table S1, Figure S1). In the discovery cohort, 7% (n=18/247) of the index LGD was invisible. Eighty-four percent (n=209/249) of the LGD was resected endoscopically. After LGD diagnosis, patients had a median of 4 follow-up colonoscopies (IQR 2.0-7.0) and a median follow-up period of 5.1 years (IQR 2.3-8.5). Twenty percent (n=51/249) eventually had a colectomy performed due to dysplasia or symptomatic disease after the index LGD diagnosis. Five percent (n=12/249) developed HGD during the follow-up period and 7% (n=18/249) developed CRC.
There was significant heterogeneity in the clinical and endoscopic characteristics between the discovery and validation cohorts, as detailed in Table 1. However, there were no significant differences in the incidence rates of AN and CRC between the two cohorts. The incidence rates of AN per 100 patient-years for the discovery cohort (n=249), validation cohort (n=211) and the total cohort (n=460) were 2.2 (95% CI 1.5-3.1), 3.1 (95% CI 2.0-4.5) and 2.5 (95% CI 1.9-3.2) respectively. The incidence rates of CRC per 100 patient-years for the discovery cohort, validation cohort and the total cohort were 1.2 (95% CI 0.8-2.0), 1.8 (95% CI 1.0-2.9) and 1.5 (95% CI 1.0-2.1) respectively. The incidence rate of AN was 1.4 per 100 patient-years (95% CI 0.9-2.0) after endoscopic resection of LGD and 8.9 per 100 patient-years (95% CI 6.0-12.6) if the LGD could not be completely endoscopically resected. The incidence rate of CRC was 0.8 per 100 patient-years (95% CI 0.5-1.3) after endoscopic resection of LGD and 5.4 per 100 patient-years (95% CI 3.2-8.4) if the LGD could not be completely endoscopically resected.
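For readers less familiar with person-time rates, the sketch below shows how an incidence rate per 100 patient-years is obtained from an event count and the total follow-up time. The follow-up total of 2,200 patient-years is taken from the text; the event count of 55 is an assumption, back-calculated from the reported total-cohort AN rate rather than stated in the paper.

```python
def incidence_rate_per_100_py(events: int, patient_years: float) -> float:
    """Events per 100 patient-years of follow-up."""
    if events < 0:
        raise ValueError("events must be non-negative")
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return 100.0 * events / patient_years


# 2,200 patient-years is the total follow-up reported for the 460 patients.
# 55 events is an ASSUMED count implied by the reported rate, not a figure
# given in the text.
total_an_rate = incidence_rate_per_100_py(events=55, patient_years=2200)
print(f"AN incidence: {total_an_rate:.1f} per 100 patient-years")
```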

Predictors of progression of low-grade dysplasia to advanced neoplasia
Four variables were found to be significantly predictive of progression to AN on univariate analysis of the discovery set (Table 2) and were entered into a multivariate model (Table 3, one patient removed due to missing LGD size data). All four variables remained significant predictors of AN progression: size of any visible index LGD being 1 cm or greater [adjusted HR 2.8 (95% CI 1.3-6.0); p=0.008]; incomplete resection of the index LGD by endoscopic criteria [adjusted HR 2.9 (95% CI 1.3-6.5); p=0.009]; multifocal index LGD [adjusted HR 2.8 (95% CI 1.3-6.1); p=0.007]; and presence of moderate or severe active histological inflammation at the time of or within the previous 5 years of the index LGD diagnosis [adjusted HR 3.0 (95% CI 1.3-6.7); p=0.009].
To validate the multivariate model's predictions, we turned to the validation set. Figure S2 depicts similar estimated baseline hazard functions for both cohorts. Individual patient risk scores for the validation set were calculated and the predicted AN risk curves are illustrated in Figure S3. We computed the observed (O) vs expected (E) standardized incidence ratio, O/E, on the validation set, finding O/E = 1 (95% CI 0.63-1.5), which indicates good overall calibration of the model.

Risk stratification with simple risk score in discovery and validation sets
We assigned a risk score to each patient based on the number of risk factors present (0-4 possible in total), combining patients with 3 or 4 risk factors due to low numbers. Kaplan-Meier (KM) curves for the risk tiers (Figure 2) in discovery vs validation sets confirmed very similar risk profiles in both cohorts (log-rank p < 0.0001 in both cohorts). Similar results were found when stratification was performed for 5 risk tiers (0-4) or 3 risk tiers (0, 1-2, 3+) (Figure S4).
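The simple risk score described above amounts to counting the risk factors present and merging the two highest counts into a single tier. A minimal sketch follows, with illustrative function and argument names.

```python
def risk_score(
    size_1cm_or_more: bool,
    incomplete_resection: bool,
    multifocal: bool,
    moderate_severe_inflammation: bool,
) -> int:
    """Number of the four model risk factors that are present (0 to 4)."""
    return sum(
        bool(flag)
        for flag in (
            size_1cm_or_more,
            incomplete_resection,
            multifocal,
            moderate_severe_inflammation,
        )
    )


def risk_tier(score: int) -> str:
    """Map a risk score to one of four tiers: '0', '1', '2' or '3+'."""
    if not 0 <= score <= 4:
        raise ValueError("score must be between 0 and 4")
    return "3+" if score >= 3 else str(score)


# Example: a large, multifocal lesion that was completely resected, with no
# recent moderate/severe inflammation.
example_tier = risk_tier(risk_score(True, False, True, False))
print(example_tier)
```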
We computed predictive values for the highest and lowest risk groups in the discovery set and then found similar results for predictive power in the validation set (Table S2). Reassuringly, the group with the lowest risk score = 0 (n = 54) in the validation set had a negative predictive value of 1 through all years of follow-up, i.e., no patient in this group progressed to AN; thus, our model identified the lowest-risk group with perfect specificity in this validation cohort. For the highest risk group, risk score = 3+ (n=28), in the validation set we found positive predictive values (PPV) of 12% by 6 months of follow-up, 16% by year 1, 33% by year 3, and 44% by year 5.
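One way to read predictive values off Kaplan-Meier curves is to take PPV(t) as 1 - S(t) in the highest-risk group and NPV(t) as S(t) in the lowest-risk group, where S(t) is the KM estimate of remaining free of AN. This interpretation is an assumption made for illustration; the sketch below implements a plain KM estimator on hypothetical toy data and is not the authors' R code.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t).

    times:  follow-up time for each patient (years)
    events: 1 if the patient progressed to AN at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for event_time in event_times:
        at_risk = sum(1 for ti in times if ti >= event_time)
        progressed = sum(
            1 for ti, ei in zip(times, events) if ei and ti == event_time
        )
        survival *= 1.0 - progressed / at_risk
    return survival


# HYPOTHETICAL toy data, not taken from the study cohorts.
high_times, high_events = [0.5, 1.0, 2.0, 3.0, 4.0, 6.0], [1, 0, 1, 0, 1, 0]
low_times, low_events = [2.0, 5.0, 8.0, 13.0], [0, 0, 0, 0]

ppv_5y = 1.0 - km_survival(high_times, high_events, t=5.0)
npv_5y = km_survival(low_times, low_events, t=5.0)
print(f"PPV at 5 years: {ppv_5y:.4f}; NPV at 5 years: {npv_5y:.4f}")
```

Because censored patients leave the risk set without counting as events, these values can differ from the crude proportion of progressors in each group.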

UC-CaRE risk prediction web-tool development
We built a web-tool named UC-CaRE to be used by a clinician to predict and display the risk of AN for a UC patient with LGD. The tool takes the 4 patient-specific variables included in the final multivariate model as user input, and computes the function Risk(t) for probability of AN progression at time t based on those variables and the baseline hazard (see Materials and Methods and Supplementary material). Risk estimates are displayed as risk prediction curves (Figure 3), and also demonstrated with the aid of a diagram of 100 patients with the same risk, coloured according to how many of the total will likely develop an advanced neoplasm in 1, 5, and 10 years (Figure 4). This latter type of visual aid (also known as a Paling chart) can help patients to understand the meaning of a probability of cancer occurrence by viewing a simple diagram of predicted outcomes for 100 similarly at-risk UC patients. The 'risk report' summarizing the UC-CaRE output can be downloaded as a pdf file for ease of display and recording purposes.
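The idea behind the 100-patient diagram can be sketched in a few lines: a predicted probability is converted into a count out of 100 and laid out as a grid of icons. The text rendering below is only an illustration of the concept, with a hypothetical risk value; it does not reproduce the graphics produced by the UC-CaRE tool.

```python
def paling_chart(risk: float, total: int = 100, per_row: int = 10) -> str:
    """Text grid of `total` icons: 'X' = predicted to develop AN, 'o' = not."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be between 0 and 1")
    if total < 1 or per_row < 1:
        raise ValueError("total and per_row must be positive")
    affected = round(risk * total)
    icons = "X" * affected + "o" * (total - affected)
    return "\n".join(icons[i:i + per_row] for i in range(0, total, per_row))


# HYPOTHETICAL predicted risk of 12% at a chosen follow-up time.
chart = paling_chart(0.12)
print(chart)
```

Producing one such grid for each of the 1, 5 and 10 year predictions gives the patient a side-by-side view of how the expected number of affected individuals grows over time.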
Finally, we conducted a competing risk analysis for time to colectomy versus risk of developing AN. The hazard ratios, based on the four risk factors above, were similar for both events (Figure S5). Thus, our findings suggest that our risk score predicts colectomy risk equivalently to predicting AN risk (Table S4 and Supplementary material).

Discussion
We designed and validated a cancer risk prediction tool, UC-CaRE (Ulcerative Colitis-Cancer Risk Estimator), using multi-centre data from UC patients diagnosed with low-grade dysplasia. We intend this tool to enable patients to make a more informed choice to either accept colectomy or continue endoscopic surveillance based upon their personalised risk of developing advanced neoplasia (AN). Clinicians currently lack an at-hand way to calculate and communicate an individual's AN risk; UC-CaRE addresses this area of clinical need so that clinicians can make quantitative predictions of a UC patient's risk of developing AN at point of care. The estimated absolute risk of AN at future years of surveillance can be easily demonstrated to a patient with a user-friendly visual aid to facilitate shared decision-making.
The multivariate model we developed to embed in our tool was trained using current St Mark's Hospital data, and then tested and validated in a dataset taken from three independent UK tertiary care centres specialising in inflammatory bowel disease. The model remained highly accurate in the validation set: we could predict progression for the first year of follow-up with a positive predictive value of 16% in the highest risk score group and a negative predictive value of 100% in the lowest risk score group; this information about risk will be very useful for patients faced with an imminent choice for management after diagnosis. We observed increasing accuracy at longer times in this cohort, with the positive predictive value reaching 33% by year 3 and 44% by year 5, while perfect specificity of the lowest risk score group was sustained past 10 years of follow-up. This confirms that baseline findings are highly predictive in UC patients with LGD, and this should be considered when determining their personalised treatment and surveillance scheduling.
We recognised that censoring patients at colectomy, before they have had time to progress to AN, was a competing risk for patients in our study. Our results confirm that the risk of both events (colectomy or AN progression) was similar, even when stratified by AN risk group. Thus, colectomy decisions made in the absence of an AN diagnosis are likely preventing AN development, with minimal over-treatment. It is therefore reasonable to suggest that progression to AN could have been prevented by earlier colectomy in patients identified as high-risk by the UC-CaRE tool.
The St Mark's patient dataset used in the discovery set overlaps with a previously reported cohort study of 172 UC patients with LGD diagnosed between 1993 and 2012 by Choi et al. [8]. In this older study, lesion size greater than 10 mm (HR 10.0; 95% CI 4.3-23.4) and multifocality (HR 5.0; 95% CI 1.9-7.8) were also found to be significant predictive factors for AN progression. In Fumery et al.'s meta-analysis [11] of LGD outcomes, multifocality (OR 3.5; 95% CI 1.5-8.5) was also a significant predictive factor for AN development. In these two studies LGD morphology was found to be an additional risk factor on multivariate analysis. Choi et al. [8] reported non-polypoid morphology (HR 16.5; 95% CI 6.8-39.8) and Fumery et al. [11] reported invisibility (OR 1.87; 95% CI 1.04-3.36) as being predictive of AN progression. Strengths of our more recent study are that we have additionally evaluated the impact of endoscopic resectability on LGD prognosis and have only included LGD cases diagnosed within the extent of colitis and in the 21st century. We note that endoscopic unresectability has not been assessed in previous studies [8][11][12] but is an indication for colectomy surgery to prevent CRC progression [4][5][6][7]. Chromoendoscopy was adopted into routine surveillance practice from 2003 onwards at St Mark's Hospital and true high-definition imaging processors have been available from 2012 onwards. The development of advanced endoscopic resection techniques for non-polypoid dysplasia, such as endoscopic submucosal dissection (ESD) and hybrid ESD/endoscopic mucosal resection (EMR), has allowed a greater number of these lesions to be endoscopically resected, when previously they would have been consigned to colectomy surgery [15].
More recent case series and smaller cohort studies from centres where high definition chromoendoscopy surveillance and advanced endoscopic resection techniques have been used have demonstrated lower rates of truly invisible dysplasia detection and lower AN progression rates after dysplasia has been endoscopically resected [13][14][15]. Our study findings suggest that whether or not an LGD lesion can be endoscopically resected is a more prominent risk factor than its morphology. After inclusion of endoscopic unresectability (which included invisible LGD) as a variable in our multivariate model, the morphology of the LGD lesion was no longer a significant predictive factor.
We have reported incidence rates of LGD progression to AN and CRC in the total cohort (n=460) as being 2.5 and 1.5 per 100 patient-years of follow-up respectively. These rates are higher than reported in Fumery et al.'s meta-analysis [11], where the pooled AN and CRC incidence rates per 100 patient-years were 1.8 and 0.8 respectively, but there was substantial calculated between-study heterogeneity (I² statistic > 60%) and studies that included LGD proximal to the colitis extent were included. The inclusion of these latter cases may also explain why distal location of the LGD was found to be a predictive risk factor for progression to AN in the meta-analysis [11] but not in our study cohort. The incidence rate of CRC (1.4 per 100 patient-years) found in a Dutch population-based cohort study of 4284 IBD patients with LGD diagnosed at both academic and non-academic centres is very similar to ours [12]. There is a paucity of cohort studies reporting on CRC incidence rates after endoscopic resection of non-polypoid dysplasia. Our incidence rate of CRC progression after endoscopic resection of both polypoid and non-polypoid dysplasia (0.8 per 100 patient-years) was very similar to the pooled incidence calculated in a meta-analysis of endoscopically resected polypoid-only dysplasia (0.5 per 100 patient-years) [23].
It is important to note the limitations of our study. This was a retrospective study relying on the accuracy of the available medical documentation, and incomplete medical records meant that other important risk factors for CRC development, such as family history of CRC, could not be included in the risk prediction model. Our study had only a modest number of PSC patients (n = 13) available to use in the discovery cohort, and we note that Fumery et al.'s meta-analysis found that concomitant PSC (OR 3.4; 95% CI 1.5-7.8) was a significant risk factor for AN progression [11]; it is therefore likely that UC-CaRE will underestimate AN risk in PSC patients. Lastly, by only including tertiary IBD centres in both discovery and validation cohorts, our results may be limited by selection bias. However, we demonstrated significant heterogeneity between the two cohorts and still found that the UC-CaRE model can accurately predict risk groups. We have also discussed above that the CRC incidence rates found in our study cohort were similar to those found in De Jong et al.'s population-based cohort study [12]. Further validation of the tool using data from non-tertiary centres will be needed to test its applicability outside of the tertiary care setting.
In summary, we have derived and created a simple-to-use web-tool, UC-CaRE, for the calculation of patient-specific high-grade dysplasia and/or CRC risk in individuals with UC and LGD. We hope the tool will be useful as an adjunct for clinicians when managing CRC risk together with their patients.

medRxiv preprint doi: https://doi.org/10.1101/2020.04.10.20057869. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.