Original article
Combining risk factors with faecal immunochemical test outcome for selecting CRC screenees for colonoscopy
  1. Inge Stegeman1,2,
  2. Thomas R de Wijkerslooth3,
  3. Esther M Stoop4,
  4. Monique E van Leerdam4,
  5. Evelien Dekker3,
  6. Marjolein van Ballegooijen5,
  7. Ernst J Kuipers4,6,
  8. Paul Fockens3,
  9. Roderik A Kraaijenhagen2,
  10. Patrick M Bossuyt1
  1. 1Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
  2. 2Department of Research, NDDO Institute for Prevention and Early Diagnostics, Amsterdam, The Netherlands
  3. 3Department of Gastroenterology and Hepatology, Academic Medical Center, Amsterdam, The Netherlands
  4. 4Department of Gastroenterology and Hepatology, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
  5. 5Department of Public Health, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
  6. 6Department of Internal Medicine, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
  1. Correspondence to Inge Stegeman, Clinical Epidemiology, Biostatistics and Bioinformatics (room J1b-210-1), Academic Medical Center, Meibergdreef 9, Amsterdam 1105 AZ, The Netherlands; i.stegeman{at}


Objective Faecal immunochemical testing (FIT) is increasingly used in colorectal cancer (CRC) screening but has a less than perfect sensitivity. Combining risk stratification, based on established risk factors for advanced neoplasia, with the FIT result for allocating screenees to colonoscopy could increase the sensitivity and diagnostic yield of FIT-based screening. We explored the use of a risk prediction model in CRC screening.

Design We collected data in the colonoscopy arm of the Colonoscopy or Colonography for Screening study, a multicentre screening trial. For this study 6600 randomly selected, asymptomatic men and women between 50 years and 75 years of age were invited to undergo colonoscopy. Screening participants were asked for one sample FIT (OC-sensor) and to complete a risk questionnaire prior to colonoscopy. Based on the questionnaire data and the FIT results, we developed a multivariable risk model with the following factors: total calcium intake, family history, age and FIT result. We evaluated goodness-of-fit, calibration and discrimination, and compared it with a model based on primary screening with FIT only.

Results Of the 1426 screening participants, 1112 (78%) completed the questionnaire and FIT. Of these, 101 (9.1%) had advanced neoplasia. The risk based model significantly increased the goodness-of-fit compared with a model based on FIT only (p<0.001). Discrimination improved significantly with the risk-based model (area under the receiver operating characteristic (ROC) curve: from 0.69 to 0.76, (p=0.02)). Calibration was good (Hosmer-Lemeshow test; p=0.94).

By offering colonoscopy to the 102 patients at highest risk, rather than to the 102 cases with a FIT result >50 ng/mL, 5 more cases of advanced neoplasia would be detected (net reclassification improvement 0.054, p=0.073).

Conclusions Adding risk-based stratification increases the accuracy of FIT-based CRC screening and could be used in preselection for colonoscopy in CRC screening programmes.

  • Screening
  • Cancer Epidemiology
  • Cancer Prevention
  • Colorectal Cancer

Significance of this study

What is already known about this subject?

  • Faecal immunochemical testing (FIT) is increasingly used in colorectal cancer (CRC) screening.

  • FIT has a low sensitivity rate for detecting advanced neoplasia.

What are the new findings?

  • A multivariable risk model, including the FIT result and risk factors for CRC, has a better sensitivity than using FIT only in CRC screening.

  • Information on the risk factors in this model can be easily obtained with a questionnaire that is distributed with the FIT test.

How might it impact on clinical practice in the foreseeable future?

  • Combining risk stratification with the FIT result has better accuracy than FIT-only screening, with better discrimination, better sensitivity at similar specificity levels, and more cases of advanced adenoma detected with a similar number of colonoscopies. The algorithm provides the means to have an enriched sample of higher risk patients undergoing colonoscopy, and thereby increases the yield for advanced neoplasia.


Colorectal cancer (CRC) is one of the leading causes of cancer related death.1 ,2 Detecting cancer or one of its precursors at an early stage can prevent premature death and may reduce cancer morbidity, since treatment for earlier-stage cancers is often less aggressive than that for more advanced-stage cancers.3 The high incidence of CRC, the high burden of disease, the availability of screening tests and of effective preclinical treatment of adenomas and early-stage cancers are reasons why population screening for CRC is deemed appropriate.

Colonoscopy serves as the reference standard for the detection of advanced adenoma and cancer in CRC screening. Colonoscopy is a burdensome and costly procedure, and colonoscopy capacity is limited, so it should only be considered in those at increased risk for CRC and adenomas.4 ,5 In most countries the invitation to colorectal population screening is therefore based on age criteria. In the Netherlands a CRC screening programme will start in 2013 to which men and women between 55 years and 75 years of age will be invited. Screening is not offered to younger or older participants, because the benefits in these age groups are considered not to outweigh the harms. The benefits are considered too small in the young, and too limited in the elderly.

One could argue that it is not so much age which should guide the criteria for invitation, but a more comprehensive assessment of risk, based on known determinants of the benefits and harms. Such a preselection can be used to improve the benefits-harms balance in screening programmes, by identifying more adequately the people most likely to benefit from screening.

In CRC screening programmes preselection for colonoscopy is most often done by faecal immunochemical tests (FITs). Individuals with a positive FIT test are invited to undergo a colonoscopy; those with a negative test are not. It is known that FIT has non-optimal sensitivity rates.6

Over the years, several CRC risk factors have been identified in epidemiological studies. These include physical activity, smoking, Body Mass Index (BMI) and nutritional habits.7–9 Preselection based on these established risk factors for advanced adenomas and CRC, complementing FIT, could increase the sensitivity of a screening programme. This way, the diagnostic yield from screening could be increased with a similar number of colonoscopies.

We explored the potential gains of using a risk prediction model in CRC screening. Data were collected in a randomised CRC screening pilot, in which primary colonoscopy was used as a screening method. All individuals that underwent colonoscopy were asked to complete a risk questionnaire and a FIT.


Study population

Data were collected in the Colonoscopy or Colonography for Screening (COCOS) study, a multicentre population-based CRC randomised screening trial in the Netherlands.10 In this trial participation and yield in a population based CRC screening programme were compared between colonoscopy or CT colonography as primary screening methods.11 At the time of the study, the Netherlands did not have a nationwide CRC screening programme. The COCOS study is described in detail elsewhere.10 In summary, 6600 asymptomatic men and women, between 50 years and 75 years of age, were randomly selected and invited for colonoscopy. Invitees who had a full colonic exam in the previous 5 years were excluded, as well as subjects who were in a colonoscopy surveillance programme and those with a life expectancy less than 5 years. In this analysis we only include invitees for the primary colonoscopy arm of the trial. All participants gave written informed consent.

Risk factor information

Invitations were sent by mail by the Regional Comprehensive Cancer Centre in Amsterdam and Rotterdam. Two weeks before the invitation all invitees received a preannouncement. Invitees had three options to respond: using the reply card, calling the Comprehensive Cancer Centre or sending an email. The Comprehensive Cancer Centre made an appointment for a prior consultation. During this consultation family history and medical history were discussed, and the patient was informed about the bowel preparation and the colonoscopy procedure. Non-responders received a reminder 4 weeks after the invitation.

Risk factors were selected based on the results of a previous analysis of variables associated with advanced neoplasia.12 Eligible risk factors were those that could be assessed without additional testing. Risk factor information was collected via a questionnaire, which was handed out to the participants in the waiting room before the colonoscopy.

We evaluated the following risk factors: age, CRC family history (first degree), alcohol intake, current smoking, history of smoking, BMI, regular aspirin or non-steroidal anti-inflammatory drug (NSAID) use, total calcium intake and physical activity. Regular NSAID intake was defined as the use of NSAIDs three or more times a week during the last month. Calcium intake was estimated by questions about food and supplement intake. More detailed information about the assessment and definition of risk factors can be found elsewhere.10–12


Colonoscopies were performed using the standard quality aspects defined by the American Society for Gastrointestinal Endoscopy.13 Participants were prepared for colonoscopy by a low fibre diet and 2 L of hypertonic polyethylene glycol solution (Moviprep; Norgine bv, Amsterdam, The Netherlands). Histology was defined according to the Vienna criteria.14

Statistical analysis

We developed a risk-based preselection model for colonoscopy that was based on the associations between each of the putative risk factors and the presence of advanced neoplasia during colonoscopy, on a per patient basis. In these analyses the most advanced lesion per patient was used. Advanced neoplasia was defined as at least one CRC or advanced adenoma: adenoma of 10 mm or larger, ≥25% villous histology or high grade dysplasia. CRC and advanced adenoma were reported separately.

Missing data in the questionnaires were handled by multiple imputation.15 In multiple imputation, missing values are estimated from other related variables in the dataset. In this way several complete datasets are created, each with imputations based on a random draw from the estimated underlying distributions.16

We first built a logistic regression model with the FIT result as predictor and advanced neoplasia (vs no advanced neoplasia) as the dependent variable, adjusting for age. We then developed a risk preselection model adding all risk factors from the questionnaire in a multivariable logistic regression model, in addition to the FIT result, again using advanced neoplasia as the dependent variable. We used restricted cubic splines to evaluate deviations of linearity for the continuous variables. Backward elimination was used to develop a parsimonious model, using a significance level of 0.20 as the removal criterion.

We corrected for optimism by penalised shrinkage. In penalised shrinkage, regression coefficients are estimated with penalised maximum likelihood, and the optimal penalty factor is determined with the Akaike information criterion.17

Model performance was assessed in terms of model fit, discrimination, calibration, the distribution of risk and as net reclassification improvement (NRI). We compared goodness-of-fit of the risk prediction model with that of the FIT-only model using the generalised likelihood ratio test statistic and tested for statistical significance. Discrimination refers to the ability of the model to assign higher risk to patients that have advanced neoplasia than to those who do not.18 We used the concordance or c-statistic, the most commonly used statistic to express discriminatory power, which is also known as area under the receiver operating characteristic (ROC) curve (AUC). We also compared the sensitivity of the risk-model at the specificity that would correspond to using a fixed FIT positivity threshold of 50 ng/ml. We used a threshold of 50 ng/ml because this was the anticipated cut-off for the Dutch screening programme at the time of the study.
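As an illustration of the c-statistic described above, the following minimal sketch counts, over all pairs of one participant with and one without advanced neoplasia, how often the participant with advanced neoplasia has the higher calculated risk. The function name `c_statistic` and the toy values are assumptions made for illustration; they are not the study data or the software used in the analysis.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance: share of case/non-case pairs in which the case has the higher risk.

    Tied risks count as half a concordant pair.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_risk, non_case_risk in product(cases, non_cases):
        if case_risk > non_case_risk:
            concordant += 1.0
        elif case_risk == non_case_risk:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy values for illustration only (not study data).
toy_risks = [0.02, 0.05, 0.10, 0.30, 0.60]
toy_outcomes = [0, 0, 1, 0, 1]
print(c_statistic(toy_risks, toy_outcomes))
```

In the toy data, five of the six case/non-case pairs are ranked correctly. A value of 0.5 would correspond to no discrimination and 1.0 to perfect discrimination.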

We evaluated calibration. Calibration refers to the level of agreement between the observed and predicted proportion of patients with advanced adenoma.18 The distribution of risk in the target population is relevant for evaluating the potential usefulness of the model. Ideally, risk modelling should result in a relatively wide risk distribution.19

The NRI compares the classifications from two models in terms of the net proportion of participants whose classification changes in the right direction: upward for those with advanced neoplasia and downward for those without.18 ,20 All authors had access to the study data and have reviewed and approved the final manuscript.


Study population

In total 6600 persons were invited for primary colonoscopy screening, of whom 1426 (22%) agreed to undergo colonoscopy. In this group, 1236 (87%) individuals completed the questionnaire and 1112 (90%) of them also completed the FIT. Their mean age was 60.6 years (SD 6.2); 543 (49%) were female. The mean BMI in participants was 26.6 kg/m2 (SD 4.1). Fifteen percent of them were former or current smokers. Reported mean alcohol consumption was 7.9 glasses a week (SD 9.2). Of the 1112 respondents, 101 (9.1%) had advanced neoplasia: 7 had CRC (4 men, 3 women) and 94 had one or more advanced adenomas (52 men, 42 women).

Risk model

Using restricted cubic splines we observed that a square root transformation of the FIT result better fitted linearity. After backwards elimination, the following risk factors were found to be significantly associated with advanced neoplasia, as detected during a screening colonoscopy: FIT result, age, calcium intake, number of family members with CRC and past or current smoking. BMI, menopausal status, fibre intake, aspirin/NSAID use and red meat intake could be removed from the model. Table 1 shows the ORs for the variables in the final multivariable risk model. Given the range of FIT results (from 0 ng/mL to 3351 ng/mL), these results show that FIT, compared with the other variables, is the most influential element in this multivariable model. If we compare the risk associated with the highest FIT result against that for the lowest FIT result, the OR is 71.2.
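The reported OR of 71.2 can be related to the square root transformation with a short back-calculation. The sketch below assumes that this OR compares the highest FIT result (3351 ng/mL) with the lowest (0 ng/mL) on the square-root scale, holding the other variables fixed. The resulting `implied_beta` is derived under that assumption only; it is not the coefficient reported in table 1, and `fit_odds_ratio` is a hypothetical helper.

```python
import math

REPORTED_OR_HIGH_VS_LOW = 71.2   # OR for highest versus lowest FIT result, from the text
FIT_MIN, FIT_MAX = 0.0, 3351.0   # observed FIT range in ng/mL, from the text

# Assumption: the logit is linear in sqrt(FIT), so
# OR = exp(beta * (sqrt(FIT_MAX) - sqrt(FIT_MIN))).
implied_beta = math.log(REPORTED_OR_HIGH_VS_LOW) / (
    math.sqrt(FIT_MAX) - math.sqrt(FIT_MIN)
)


def fit_odds_ratio(fit_a, fit_b, beta=implied_beta):
    """Odds ratio for FIT result fit_a versus fit_b under a sqrt-linear logit."""
    return math.exp(beta * (math.sqrt(fit_a) - math.sqrt(fit_b)))


print(round(implied_beta, 4))
print(round(fit_odds_ratio(50.0, 0.0), 2))
```

Under this assumption the OR between two FIT results depends on the difference of their square roots, so a given absolute increase in ng/mL carries more weight at low FIT values than at high ones.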

Table 1

Risk factors in the study group

Risk profiles: five examples

With our risk model we calculated the probability of advanced neoplasia at colonoscopy for each participant in the study group. Figure 1 shows the distribution of the calculated probabilities in the 1112 screening participants. The mean probability is 0.09, which corresponds to the observed proportion of advanced neoplasia cases. The probabilities range from a minimum of 0.004 to a maximum of 0.843, with 50% of the probabilities in the 0.05 to 0.843 interval.

Figure 1

Distribution of calculated probability.

As an illustration we provide examples in table 2. This table shows five risk profiles for participants in the study group and the corresponding probability, as calculated with the model. The first profile, with the highest risk, describes a 67-year-old person, with a FIT result of 439 ng/mL, who smokes and has two first-degree family members with CRC. This individual has a low calcium intake of 860 mg and the risk of having advanced neoplasia is 0.843—the highest risk in the model. Profile five corresponds to a person who is 64 years of age, had a FIT result of 0, did not smoke, and has no family members with CRC and a high calcium intake. The corresponding probability of having advanced neoplasia is 0.004.

Table 2

Five patient profiles


Discrimination expresses how well the risk model distinguishes between cases and non-cases. The AUC of the FIT-only model was 0.69. The AUC of the risk model was 0.76. Figure 2 shows both AUCs. These AUCs are significantly different (p=0.02), indicating that the risk based model discriminates better between those with and without advanced neoplasia than using FIT only. At the specificity that corresponds to a positivity threshold of 50 ng/mL (93%), the sensitivity of using the risk based model would be 40%. FIT sensitivity with a threshold of 50 ng/mL is 32%.


Calibration refers to the level of agreement between calculated risk and observed outcomes. Calibration was good (Hosmer-Lemeshow test; p=0.94). Figure 3 shows the calibration plot of the risk model, where the predicted and observed proportions of participants with advanced adenoma are shown in five disjoint subgroups, defined by the quintiles of calculated risk.

Figure 3

Calibration plot of the model.
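The grouping behind such a calibration plot can be sketched as follows: participants are ordered by calculated risk, split into five groups of equal size, and the mean calculated risk in each group is compared with the observed proportion with the outcome. The function name `calibration_by_quantile` and the toy values are assumptions for illustration and do not reproduce figure 3.

```python
def calibration_by_quantile(risks, outcomes, groups=5):
    """Return (mean predicted risk, observed proportion, group size) per risk-ordered group."""
    if len(risks) != len(outcomes):
        raise ValueError("risks and outcomes must have the same length")
    if len(risks) < groups:
        raise ValueError("need at least one participant per group")
    ordered = sorted(zip(risks, outcomes))
    n = len(ordered)
    rows = []
    for g in range(groups):
        # Integer boundaries keep the groups disjoint and as equal in size as possible.
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        predicted = sum(risk for risk, _ in chunk) / len(chunk)
        observed = sum(outcome for _, outcome in chunk) / len(chunk)
        rows.append((predicted, observed, len(chunk)))
    return rows


# Toy values for illustration only (not study data).
toy_risks = [0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.15, 0.30, 0.60]
toy_outcomes = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
for predicted, observed, size in calibration_by_quantile(toy_risks, toy_outcomes):
    print(f"predicted={predicted:.3f} observed={observed:.3f} n={size}")
```

In a well calibrated model the predicted and observed values lie close together in every group.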

Net reclassification improvement

In an actual application, we could offer colonoscopy to those at highest risk, rather than to those whose FIT result exceeds a prespecified threshold. We considered a FIT result as positive when it was equal to or higher than 50 ng/mL. At that threshold, 102 participants (10%) would be considered FIT positives. We then identified the risk threshold at which an identical number of participants (102) would be classified as risk positives, that is, with a calculated risk exceeding the risk positivity threshold. By ranking the participants according to their risk, as calculated with the multivariable model, and identifying the 102 with the highest risk, we could define a risk positivity threshold of 0.19; 102 participants (10%) had a calculated risk that was higher than 0.19. These are called risk positives.
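The matching of the risk positivity threshold to the number of FIT positives can be sketched as below. This is a minimal illustration under stated assumptions: the function name `match_risk_threshold`, its return values and the toy data are hypothetical, and ties in calculated risk at the threshold would select slightly more participants than the number of FIT positives.

```python
def match_risk_threshold(risks, fit_results, fit_cutoff=50.0):
    """Find the risk cut-off that selects as many participants as FIT >= fit_cutoff.

    Returns the number of FIT positives, the lowest calculated risk among the
    same number of highest-risk participants, and a risk-positive flag per participant.
    """
    if len(risks) != len(fit_results):
        raise ValueError("risks and fit_results must have the same length")
    n_fit_positive = sum(1 for value in fit_results if value >= fit_cutoff)
    if n_fit_positive == 0:
        raise ValueError("no FIT positives at this cut-off")
    ranked = sorted(risks, reverse=True)
    threshold = ranked[n_fit_positive - 1]
    risk_positive = [risk >= threshold for risk in risks]
    return n_fit_positive, threshold, risk_positive


# Toy values for illustration only (not study data).
toy_fit = [0.0, 10.0, 60.0, 200.0, 0.0, 55.0]     # ng/mL
toy_risk = [0.03, 0.05, 0.25, 0.70, 0.20, 0.08]   # calculated probabilities
n_positive, threshold, flags = match_risk_threshold(toy_risk, toy_fit)
print(n_positive, threshold, flags)
```

In the toy data three participants are FIT positive, so the three with the highest calculated risk become risk positives. One of them is FIT negative, and one FIT positive is no longer selected, which mirrors the kind of exchange described in the text.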

Table 3 shows the reclassification of participants. By offering colonoscopy to the 102 patients at highest risk, rather than to the 102 cases with a FIT result >50 ng/mL, 25 different screening participants would be invited for colonoscopy; 25 would be risk positive without being FIT positive, and for 25 the reverse holds. A total of 77 participants would be FIT-positive and risk-positive. With risk-based screening, the same number of colonoscopies would lead to the detection of five more cases of advanced neoplasia.

Table 3

Reclassification in participants with and without advanced neoplasia

We calculated the NRI for risk-based screening, compared with FIT-only screening. Of the individuals with advanced neoplasia 34% would be correctly classified with FIT-only screening and in risk-based screening. In the group of individuals with advanced neoplasia 7% of FIT negatives were correctly reclassified as having advanced neoplasia with risk-based screening, while 2% were FIT positive and risk negative. In individuals without advanced neoplasia 92% were correctly identified in FIT-only screening and in risk-based screening. In the group of individuals without advanced neoplasia, 2.3% of FIT positives were correctly reclassified as having no advanced neoplasia in risk stratification; 1.8% were correctly classified in FIT-only screening and incorrectly classified in risk-based screening. Classification improved with risk-based screening, but the improvement was not significant; NRI was 0.054 (p=0.073).
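The NRI arithmetic can be retraced with a short sketch. The counts used below (7, 2, 23 and 18) are not quoted from table 3; they are back-calculated from the percentages reported above and from the 25 participants exchanged in each direction, and should be read as an assumption. The function name is likewise hypothetical.

```python
def net_reclassification_improvement(events_up, events_down, n_events,
                                     nonevents_down, nonevents_up, n_nonevents):
    """NRI: net correct reclassification among events plus that among non-events."""
    event_gain = (events_up - events_down) / n_events
    nonevent_gain = (nonevents_down - nonevents_up) / n_nonevents
    return event_gain + nonevent_gain


# Assumed counts, back-calculated from the reported percentages (not quoted from table 3):
# 101 with advanced neoplasia: 7 moved to positive (about 7%), 2 moved to negative (about 2%).
# 1011 without advanced neoplasia: 23 moved to negative (about 2.3%), 18 moved to positive (about 1.8%).
nri = net_reclassification_improvement(
    events_up=7, events_down=2, n_events=101,
    nonevents_down=23, nonevents_up=18, n_nonevents=1112 - 101,
)
print(round(nri, 3))  # 0.054
```

With these assumed counts, the net gain of five detected cases among the 101 participants with advanced neoplasia accounts for most of the NRI. The contribution from participants without advanced neoplasia is small.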


In this study we explored the use of a risk prediction model in CRC screening. We developed a model that calculates the probability of advanced neoplasia for men and women aged 50 years to 75 years. Our study shows that combining risk stratification with the FIT result has better accuracy than FIT-only screening, with better discrimination, better sensitivity at similar specificity levels, and more cases of advanced adenoma detected with a similar number of colonoscopies.

Our study was embedded in the colonoscopy arm of the COCOS trial, a randomised controlled trial in which CT colonography was compared with colonoscopy in population screening for CRC. Advanced neoplasia was chosen as an endpoint of this study because the presence of advanced neoplasia represents patients that have CRC or are likely to develop malignancy, which is useful in screening strategies. An important strength of the study is that the algorithm was developed in a cross-sectional study in which all participants underwent a colonoscopic examination. Screening participants were randomly selected from the general Dutch population. We chose to include variables in the model that are easy to assess using only a simple questionnaire, to make the model readily usable in a population screening setting. A limitation of our study is the low response rate for colonoscopy in our group; participation rates in primary colonoscopy screening are generally suboptimal.21 Participation rates in the trial did not differ between men and women, but participation was lower in advanced age groups.11

Two other prediction models for the prediction of CRC have been developed previously. Freedman et al22 developed separate risk models for men and women. Their model was validated by Park et al.23 The risk factors used in our model are different from those in the Freedman22 model. We did not include the status of previous sigmoidoscopies and colonoscopies, nor did we include aspirin use or oestrogen status. In contrast with the study of Freedman et al, we also assessed dietary characteristics, and calcium was included in the model. Another risk model for the prediction of CRC was developed by Driver et al,24 who evaluated risk in men only.

We included family history in our risk model. A positive family history for CRC is an indication for surveillance colonoscopy,25 but regrettably the general population is not always aware of their familial risk and might participate in an organised screening programme using, for example, FIT. Ideally, these individuals are detected within screening programmes through risk stratification and offered colonoscopy and/or genetic counselling, irrespective of the result of their screening test.

In most countries, screening for CRC is offered to participants in the age group between 50 years and 75 years of age. Screening is not offered to younger or older participants, because the benefits in these age groups do not outweigh the harms. One could argue that it is not so much age that determines the benefits but the risk of developing preclinical and treatable cancer. Ferlitsch et al26 showed that prevalence of advanced adenomas is comparable between men aged 45–49 years and women aged 55–59 years. Indeed age is an important risk factor for developing advanced neoplasia, but a broader assessment of risks should be taken into consideration for deciding which individuals should be screened for CRC. The risk of advanced neoplasia varies with age, but it is also affected by other factors, as is shown in our study. A recent study suggested that men should be screened earlier in life than women.27 In our study more men than women had advanced neoplasia, but the difference was not significant, despite the fact that we adjusted for age. For this reason sex was not included in the multivariable model.

In our analyses, we used a threshold of 50 ng/mL, since this was the anticipated cut-off for the Dutch screening programme. The national Dutch population screening programme will use a different threshold. Selecting this threshold is based on considerations on the number of participants with a positive FIT result who will be offered colonoscopy, and the relative proportion with advanced adenoma in this subgroup.

Future studies should evaluate the practical implications of preselection with a risk algorithm, with a focus on costs and participation rate. With the risk algorithm we need to gather additional information from screening participants, compared with what is currently used in the simply designed programmes based on age. The extra costs of adding a small questionnaire are probably low, but the additional questions could affect participation. Possibly the use of electronic records could give extra information.

In conclusion, risk stratification can be used as a tool to improve the effectiveness of screening by replacing age with risk as the threshold for inviting individuals for screening. Before we do that, our results need to be confirmed in additional validation studies, and, if successful, the implications of actually using this promising prediction model in preselection for colonoscopy should be addressed.




  • Contributors PF, EJK, ED, MEvL, MvB and PMB developed the idea for the study, designed the study and protocol, were responsible for project organisation, supervision and obtained funding. RAK was involved in the study design and obtained funding. TRdW, EMS and IS were involved in further developing the idea, collection of the data and writing the manuscript. IS was responsible for the statistical analyses of the data and was principally responsible for drafting this manuscript. All authors contributed to the final manuscript through critical revision and correction of draft versions, and they approved the final manuscript.

  • Funding The study was funded by The Netherlands Organization for Health Research and Development of the Dutch Ministry of Health (ZonMW 120720012 and ZonMW 121010007) and by the Center for Translational Molecular Medicine (CTMM DeCoDe-project). The sponsor was not involved in the study.

  • Competing interests None.

  • Ethics approval Medical Ethical Committee AMC.

  • Provenance and peer review Not commissioned; externally peer reviewed.
