Original article
A validated tool to predict colorectal neoplasia and inform screening choice for asymptomatic subjects
  1. Martin C S Wong1,2,
  2. Thomas Y T Lam1,
  3. Kelvin K F Tsoi1,
  4. Hoyee W Hirai1,
  5. Victor C W Chan1,
  6. Jessica Y L Ching1,
  7. Francis K L Chan1,
  8. Joseph J Y Sung1

  1. Institute of Digestive Disease, Chinese University of Hong Kong, Shatin, Hong Kong
  2. School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, Hong Kong

Correspondence to Professor Joseph J Y Sung, Institute of Digestive Disease, Chinese University of Hong Kong, Shatin, NT, Hong Kong; jjysung{at}


Objective We aim to develop and validate a clinical scoring system to predict the risks of colorectal neoplasia to better inform screening participants and facilitate their screening test choice.

Design We recruited 5220 Chinese asymptomatic screening participants who underwent colonoscopy in Hong Kong during 2008–2012. From random sampling of 2000 participants, independent risk factors were evaluated for colorectal neoplasia, defined as adenoma, advanced neoplasia, colorectal cancer or any combination thereof using binary regression analysis. The ORs for significant risk factors were used to develop a scoring system ranging from 0 to 6: 0–2 ‘average risk’ (AR) and 3–6 ‘high risk’ (HR). The other 3220 screening participants prospectively enrolled between 2008 and 2012 for screening colonoscopy formed an independent validation cohort. The performance of the scoring system for predicting colorectal neoplasia was evaluated.

Results The prevalence of colorectal neoplasia in the derivation and validation cohorts was 31.4% and 30.8%, respectively. Using the scoring system developed, 78.9% and 21.1% of the validation cohort were classified as AR and HR, respectively. The prevalence of colorectal neoplasia in the AR and HR groups was 27.1% and 44.6%, respectively. Subjects in the HR group had a 1.65-fold (95% CI 1.49 to 1.83) higher prevalence of colorectal neoplasia than those in the AR group.

Conclusions The scoring system based on age, gender, smoking, family history, Body Mass Index and self-reported diabetes is useful in predicting the risk of colorectal neoplasia.

Significance of study

What is already known about this subject

  • Colorectal Cancer screening is recommended for asymptomatic subjects aged 50 years or above.

  • Faecal occult blood testing and colonoscopy are two commonly used screening modalities; and failure to provide screening participants with their preferred tests may contribute to non-compliance and programme failure.

  • The choice of screening test is influenced by information on the risk of having adenoma or cancer, and the perception of there being a high risk is strongly associated with choosing a colonoscopy.

  • Presently, there is no validated instrument which provides screening participants with their risk of having colorectal neoplasia, including adenoma.

What are the new findings

  • A risk stratification score may be used to prioritise high risk individuals for colonoscopy and polypectomy.

  • Risk stratification based on age, gender, smoking status, family history, Body Mass Index, and self-reported diabetes predicts the risk of colorectal neoplasia in asymptomatic persons.

  • Based on a scoring system using age, gender, smoking status, family history, body mass index and diabetes, the high risk individuals have a 2.37-fold increased chance of finding advanced neoplasia compared to average risk subjects.

How might it impact on clinical practice in the foreseeable future?

  • The prediction tool provides information on the individual risk of colorectal neoplasia on top of advanced neoplasia, as predicted by other scoring systems like the Asia Pacific Colorectal Screening scores.

  • The scoring instrument is easy to use in clinical practice, and could facilitate informed choices on screening modalities.

  • Higher-risk subjects could be recommended to choose a colonoscopy which could detect and remove adenomas, whereas lower-risk subjects could consider faecal tests.


Faecal occult blood tests (FOBT) and colonoscopy are two common screening tests for colorectal cancer (CRC). Both have been shown to be effective in reducing CRC mortality,1–4 either by detection of cancers at an early, curable stage or by removal of adenomas.5 In the National Polyp Study (NPS), colonoscopic polypectomy reduced the incidence of CRC,6 and a recent prospective study from the same research group showed that it also reduced CRC mortality.4 A guideline jointly produced by the American Cancer Society, the US Multi-Society Task Force on CRC, and the American College of Radiology recommended that CRC prevention by polyp detection and removal should be the primary goal of screening.5 The guideline also suggested that clinicians should make patients aware of the full range of screening options so that patients can make an informed choice.5

We have previously studied the preference of receiving FOBT versus colonoscopy among asymptomatic subjects in a community screening centre7 and the factors associated with their change in choice between these two screening tests.8 It was found that participant demography and health beliefs had a strong influence on their choice of CRC screening method. In addition, the self-perception of the risks for CRC was demonstrated to be a significant factor influencing a change in their choice between FOBT and colonoscopy, when a change was allowed.7 ,8 Screening participants who changed their choice of screening tool, or who regretted not having chosen the most preferred tests, were less likely to be compliant with the screening programme over time. Compliance with tests is one of the most crucial components for the success of population-based CRC screening programmes,9 and failure to provide screening participants with their preferred tests may contribute to non-compliance and potential programme failure.9 ,10

However, it is widely recognised that an individual's risk of having adenoma or cancer could significantly influence their test choice, and their perception of the risks is strongly associated with choosing colonoscopy. Currently, prediction rules exist only for estimating the risk of colorectal advanced neoplasia;11–14 there are no instruments which could accurately predict the risk of premalignant lesions, including adenoma. There are a few important justifications for developing a scoring system that also predicts non-advanced colorectal neoplasia. For instance, people who have adenomas are at an increased risk of developing metachronous adenomas or cancer compared with those without adenoma, and there is evidence that detection and removal of adenomas can prevent cancers and reduce mortality.4 This gives colonoscopy an advantage over faecal tests; but unless screening participants are made aware of their risk of adenoma, this additional benefit of colonoscopy cannot be quantified. A risk stratification system is therefore required in clinical practice. Also, having a colorectal adenoma, as a cancer precursor lesion, might not be tolerable to some screening participants, especially ethnic Chinese.

The objective of this study was to develop and validate a clinical risk stratification score predicting the risk of colorectal neoplasia among asymptomatic subjects aged 50–70 years. We aimed to construct a simple tool for clinicians so that a screening participant's risk of having colorectal neoplasia could be easily computed to inform shared decision making and thus facilitate the choice of screening method.



A community CRC screening centre was established in 2008 in Hong Kong. It provided free CRC screening for all Hong Kong residents aged 50–70 years who were asymptomatic of CRC via media invitation. A detailed description of this centre has been published elsewhere.7 ,8 ,15 This study was approved by the Clinical Research Ethics Committee of the Chinese University of Hong Kong.

Study participants

The screening participants consist of self-referred subjects who registered for the programme via online application, telephone, e-mail, fax or walk-in. The eligibility criteria for screening include: (1) the participants’ age being 50–70 years; (2) the absence of existing or previous symptoms suggestive of CRC, such as haematochezia, melaena, anorexia or a change in bowel habit in the past 4 weeks, or a weight loss of greater than 5 kg in the past 6 months; and (3) not having received any CRC screening tests in the past 5 years. Subjects with personal history of CRC, colonic adenoma, diverticular disease, inflammatory bowel disease, prosthetic heart valve or vascular graft surgery were excluded. Participants with medical conditions which were contraindications for colonoscopy, like cardiopulmonary insufficiency and the use of double antiplatelets, were also excluded. The eligibility of each participant and the exclusion criteria were checked by trained staff in the centre.

Registered participants were invited to fill in a self-administered questionnaire, which included information on their age, gender, family history of CRC, smoking status, drinking habits, past medical history and long-term medication usage. Meanwhile, centre staff checked for the completeness of questionnaires while trained volunteers assisted with survey completion for illiterate participants.

Each participant subsequently joined a health seminar which described the benefits and risks of faecal immunochemical tests (FIT) and colonoscopy in a non-preferential manner. They were offered a choice between receiving FIT yearly for up to 5 years, or one direct colonoscopy. The present study included all screening participants who chose colonoscopy, those who chose FIT but underwent colonoscopy due to a positive stool specimen, and participants who received a colonoscopy after three consecutive years of negative FIT results.

The derivation and validation cohort

A total of 5220 screening participants received colonoscopy in the study period. Among them, we performed a simple random sampling to select 2000 subjects as our derivation cohort. Each study participant represents one unit of randomisation and has an equal probability of being selected. The prevalence of colorectal neoplasia was 31.4% in the derivation set, and we assumed a point prevalence of individual risk factors being 25%, as in the Asia Pacific Colorectal Screening (APCS) study.11 Based on these assumptions, a minimum of 2800 subjects from the validation cohort were required to achieve a power of 80% to detect a risk factor with an OR of 2 at a significance level of p<0.05. Therefore, all the other 3220 subjects formed our validation cohort.
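The split described above can be sketched in a few lines of Python. This is only an illustration of simple random sampling without replacement; the participant identifiers, the function name `split_cohorts` and the random seed are assumptions made for the example and are not taken from the study.

```python
import random


def split_cohorts(participant_ids, n_derivation, seed=0):
    """Draw a simple random sample (without replacement) as the derivation
    cohort; everyone not drawn forms the validation cohort."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    derivation = set(rng.sample(participant_ids, n_derivation))
    validation = [pid for pid in participant_ids if pid not in derivation]
    return sorted(derivation), validation


# Hypothetical identifiers 1..5220 standing in for the colonoscopy recipients
ids = list(range(1, 5221))
derivation, validation = split_cohorts(ids, 2000)
print(len(derivation), len(validation))  # 2000 3220
```

Because every identifier has the same chance of being drawn, the two cohorts should differ only by chance, which is what makes the second cohort usable for validation.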

Development of the risk scores

The association between the colonoscopic finding of colorectal neoplasia and each risk factor was examined by Pearson χ2 tests in the derivation cohort. The risk factors examined included age, gender, family history of CRC, smoking, drinking (current drinkers of alcohol more than twice per week vs those drinking less or non-drinkers), Body Mass Index (BMI), self-reported medical conditions, and use of non-steroidal anti-inflammatory agents (NSAIDs) and aspirin. Any variable with p<0.15 in univariate analysis was included in a binary logistic regression model with colorectal neoplasia as the outcome. Each risk factor was assigned a weighting in the risk score using the respective adjusted OR (AOR) halved and rounded to the nearest integer. This was done for the sake of simplicity, with the aim of keeping the total score under 10. The risk score for each individual is the sum of the weightings of all risk factors present. A receiver operating characteristic (ROC) curve was constructed and the area under the curve was used to evaluate the validity of the scores. A subgroup analysis was performed where those who had positive FIT (n=346) and successive negative FITs (n=286) were excluded, and an identical analysis was conducted.
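As a rough illustration of the weighting rule (AOR halved and rounded to the nearest integer), the sketch below applies it to the AORs reported for the derivation cohort in the Results. The helper name `weight_from_aor` and the choice of round-half-up are assumptions; the paper does not state how exact halves would be handled.

```python
import math


def weight_from_aor(aor):
    """Halve the adjusted OR and round to the nearest integer.
    Round-half-up is an assumption; the source does not specify tie handling."""
    return math.floor(aor / 2 + 0.5)


# AORs as reported for the derivation cohort (see the Results section)
reported_aors = {
    "male gender": 1.6,
    "family history in a first-degree relative": 1.4,
    "smoking": 1.6,
    "BMI >= 25": 1.5,
    "diabetes": 1.6,
}

weights = {name: weight_from_aor(aor) for name, aor in reported_aors.items()}
for name, weight in weights.items():
    print(f"{name}: {weight}")
```

Under this rule every listed factor receives one point, and both ends of the reported AOR range for the older age strata (1.4 and 2.4) also map to one point.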

Statistical analysis

All data were entered and analysed using IBM SPSS Statistics 19.0 (IBM, Armonk, New York). The prevalence of colorectal neoplasia according to each score in the derivation cohort was evaluated. The score at which the prevalence of colorectal neoplasia was closest to the overall prevalence, together with all lower scores, was categorised as ‘average risk’ (AR), while higher scores were categorised as ‘high risk’ (HR). Using the validation cohort, a separate binary logistic regression model was constructed by entering all the significant risk factors identified by analysis of the derivation cohort to evaluate the AOR. The AORs of each risk factor were compared between both cohorts. We adopted the Hosmer–Lemeshow goodness-of-fit statistic to assess the reliability of the final model, with p>0.05 indicating a good match between predicted and observed risk. C-statistics and the area under the ROC curve were used to evaluate the ability of the scoring system to predict the risk of colorectal neoplasia. All two-sided p values <0.05 were regarded as statistically significant.
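For readers unfamiliar with the c-statistic, the following minimal sketch shows one common way to compute it for an integer score: as the proportion of case/non-case pairs in which the case has the higher score, with ties counted as one half. The analysis in this study was run in SPSS; the function name and the toy data below are made up purely for illustration and are not study data.

```python
def c_statistic(scores, outcomes):
    """Probability that a randomly chosen case scores higher than a randomly
    chosen non-case, counting tied scores as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Made-up toy data (NOT study data): risk scores and neoplasia status (1 = present)
toy_scores = [0, 1, 2, 3, 4, 2]
toy_outcomes = [0, 0, 1, 1, 1, 0]
toy_c = c_statistic(toy_scores, toy_outcomes)
print(round(toy_c, 3))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from non-cases.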


Participant characteristics

In the derivation cohort, the average age of the participants was 57.9 years (SD 5.0) with 52.3% being male subjects (table 1). A total of 627 (31.4%) cases of colorectal neoplasia were detected, including 11 (0.6%) cancers and 108 (5.4%) cases of advanced neoplasia. The characteristics of the validation cohort were similar to those of the derivation set, except for age (p=0.028), BMI (p=0.003) and self-reported hypertension (p=0.031). The prevalence of colorectal neoplasia according to the risk factors is shown in table 2.

Table 1

Characteristics of patients in the derivation and validation populations

Table 2

Prevalence of colorectal neoplasia* and advanced neoplasia* in the derivation cohort by risk factors

Independent predictors of colorectal neoplasia in the derivation cohort

In binary logistic regression analysis, age (by 5-year stratum from 50 years onwards, AOR 1.4–2.4), male gender (AOR 1.6, 95% CI 1.3 to 2.0), a positive family history in a first-degree relative (AOR 1.4, 95% CI 1.1 to 1.9), smoking (AOR 1.6, 95% CI 1.1 to 2.2), BMI ≥25 (AOR 1.5, 95% CI 1.2 to 1.8) and diabetes (AOR 1.6, 95% CI 1.1 to 2.2) were significantly associated with colorectal neoplasia (table 3). Alcohol use and hypertension were statistically significant in univariate analysis but not in the multivariate regression model.

Table 3

Univariate and multivariate predictors of colorectal neoplasia* in the derivation cohort

Development of the risk score

According to the AORs from the derivation cohort, the following variables were used to assign scores to each screening participant (table 4): age 50–55 years (0), 56–70 years (1), male gender (1), female gender (0), family history of CRC in a first-degree relative present (1) or absent (0), current or ex-smoker (1), non-smoker (0), BMI <25 kg/m2 (0), BMI ≥25 kg/m2 (1), diabetes (1), no diabetes (0). The scoring system ranges from 0 to 6, and a subject's score is the sum of the points allocated to each individual risk factor. The number of subjects having different scores is shown in table 5. Since a score of 2 has a prevalence of colorectal neoplasia closest to the overall prevalence in the derivation cohort (32.6% vs 31.4%), a score of ≤2 was designated as ‘AR’. Scores of 3 or above had a prevalence higher than the overall prevalence, and hence were designated as ‘HR’. From this stratification, 78.4% of the derivation cohort was AR and 21.6% HR (table 6). In the derivation cohort, there were 11 CRCs, of which 4 were categorised as AR and 7 as HR; in the validation cohort there were 13 CRCs, with 8 patients and 5 patients being classified as AR and HR, respectively.
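The point allocation and risk tiers described above translate directly into a short calculation. The sketch below is an illustrative implementation only: the function and parameter names are hypothetical, age is assumed to be given in completed years, and ages outside 50–70 are rejected because the score was developed for that range.

```python
def neoplasia_risk_score(age, male, family_history, ever_smoker, bmi, diabetes):
    """Sum one point per risk factor present (total 0 to 6).

    age: completed years (assumed integer), must be 50-70
    family_history: CRC in a first-degree relative
    ever_smoker: current or ex-smoker
    bmi: Body Mass Index in kg/m2
    diabetes: self-reported diabetes
    """
    if not 50 <= age <= 70:
        raise ValueError("score applies only to ages 50-70 years")
    points = [
        age >= 56,        # 56-70 years scores 1, 50-55 years scores 0
        male,
        family_history,
        ever_smoker,
        bmi >= 25,
        diabetes,
    ]
    return sum(1 for present in points if present)


def risk_tier(score):
    """Scores 0-2 are 'average risk' (AR); scores 3-6 are 'high risk' (HR)."""
    return "HR" if score >= 3 else "AR"


# Example: a 62-year-old male ex-smoker, BMI 26.3, no family history, no diabetes
example_score = neoplasia_risk_score(62, True, False, True, 26.3, False)
print(example_score, risk_tier(example_score))  # 4 HR
```

In this example the points come from age, male gender, smoking and BMI, giving a score of 4 and an HR classification.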

Table 4

Colorectal screening score for prediction of risk for colorectal neoplasia*

Table 5

Distribution of number of subjects for each score category in the derivation cohort

Table 6

Prevalence of colorectal neoplasia and colorectal advanced neoplasia by risk tier

Validity and reliability of the model

In the validation cohort, 78.9% of subjects were in the AR tier and 21.1% in the HR tier. These proportions were similar to those in the derivation cohort (table 6). The prevalence of colorectal neoplasia in the AR and HR groups of the validation cohort was 27.1% (95% CI 25.35% to 28.84%) and 44.6% (95% CI 40.87% to 48.87%), respectively (table 6). The c-statistic for the risk score in the derivation and validation cohort was 0.62 ± 0.01 and 0.62 ± 0.01, respectively. When compared with participants in the AR group, subjects in the HR group had a significantly higher risk of colorectal neoplasia (AOR 1.65, 95% CI 1.49 to 1.83) and advanced neoplasia (AOR 2.37, 95% CI 1.74 to 3.23). The Hosmer–Lemeshow goodness-of-fit statistic evaluating the reliability of the validation set had a p value >0.05, implying a close match between predicted and observed risk. A subgroup analysis excluding those who had positive FITs or three successive negative FITs showed that the scoring system remained the same, and the distribution of the risk tiers was similar to the original one.
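As a quick arithmetic check, the crude ratio of the two prevalences reported for the validation cohort can be computed directly. This is only a back-of-envelope sketch using the rounded percentages quoted above; it does not reproduce the regression model or the confidence interval, which would require the underlying counts.

```python
# Prevalence of colorectal neoplasia in the validation cohort, as reported (%)
ar_prevalence = 27.1
hr_prevalence = 44.6

# Crude prevalence ratio of the HR tier relative to the AR tier
prevalence_ratio = hr_prevalence / ar_prevalence
print(f"{prevalence_ratio:.2f}")  # 1.65
```

The crude ratio agrees with the 1.65 figure reported for the HR group relative to the AR group.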


Major findings and implications in clinical practice

This study has developed and validated a simple clinical risk scoring system for predicting colorectal neoplasia in asymptomatic subjects. In response to recommendations from the Institute of Medicine,16 and the US Preventive Services Task Force,17 primary care practices are expected to promote informed decision making, so patients are informed of the risks and benefits of screening.18 This can facilitate shared decision making, and patients can participate in making decisions to the extent that they so desire.17 The Institute of Medicine considers these practices to be part of patient-centred care, that is, providing care that is respectful of, and responsive to, individual patient preferences, needs and values, and ensuring that patient values guide all clinical decisions.16 Also, this risk prediction approach developed in clinics and the healthcare system addresses how best the Government and policy makers may offer and implement CRC screening when endoscopic resources are limited.

This risk stratification index is easy to use by clinicians, nurse educators, other primary care providers and prospective screening participants. The information required to estimate risk in this system is also user-friendly in community settings. The unique contribution of this scoring system includes the provision of information on individual risks for patients, which allows for an informed choice on the selection of screening modalities. For instance, higher-risk subjects could be recommended to choose screening tools which primarily identify and remove neoplastic lesions (eg, colonoscopy), whereas lower-risk subjects could consider screening modalities which aim for diagnosis (eg, faecal tests). This risk stratification may improve colonoscopic yield and optimise the cost effectiveness of screening. The knowledge of one's individual risks for CRC has been shown to affect one's screening behaviour over time.19 Hence, the use of this tool in clinical consultations could also facilitate discussion between physicians and screening participants, potentially enhancing the awareness of risks, screening uptake and compliance.8 ,9 These findings could also contribute to policy making at the macro level; when the characteristics of residents who are eligible to join population-based screening programmes are made known, resources to equip colonoscopy capacity could be more accurately estimated.

Relationship with literature—prevalence of colorectal neoplasia and scoring systems

A prevalence study in Israel conducted among 1177 asymptomatic subjects without a family history of CRC found that the prevalence of colorectal neoplasia was 20.9%.20 A large-scale study in Indiana performed among 1994 asymptomatic screenees aged 50 years or older showed that the prevalence of overall adenoma was 17.7%.21 Another study including 3121 asymptomatic individuals in 13 Veterans Affairs medical centres situated in the major regions of the USA reported a prevalence of 36.5%.22 It should be noted that up to 96.8% of these US patients were men. A prospective multinational, multicentre colonoscopy survey among 860 asymptomatic individuals in 11 Asian cities reported a prevalence of 18.5%,23 and the APCS study conducted in 1892 asymptomatic persons found a prevalence of 18.7%.11 Hence, the prevalence of colorectal neoplasia found in our derivation (31.4%) and validation (30.8%) cohorts was relatively high. This may be because our patients were older (58 years vs 51–54 years in the Asian studies 11 ,23) and a relatively high proportion of them had diabetes. Additionally, the prevalence of advanced neoplasia was 4.4% and 7.9% in the moderate and HR groups, respectively, in the APCS study.11 In this study, the prevalence was 4.4% in the AR group and was slightly higher in the HR group (9.0%) than that in the APCS study.

The present study has constructed a scoring system based on age,12 ,13 ,23 gender,12 ,13 family history,12 smoking,24–28 BMI29 and self-reported diabetes.30 These have been widely recognised as risk factors for colorectal neoplasia. Nevertheless, unlike previous scoring systems which used advanced neoplasia as the outcome,12 ,13 ,14 ,24 ,31 we have also included adenoma in our analysis. Risk-scoring systems focusing on advanced neoplasia as the outcome have an impact at the policy level by prioritising HR subjects for colonoscopy and AR subjects for faecal tests. Our scoring system extends this by including adenoma and is also relevant to clinical practice. Informing screening participants of their combined risk of adenoma, advanced neoplasia and CRC could be of significant interest to some patients. The NPS showed that a patient maintained ‘adenoma-free’ could be kept ‘cancer free’,6 and some patients might wish to have their adenomas removed even before advanced neoplasia develops. This relates to individual tolerance of different colorectal pathologies, which merits further study. It should be highlighted that there is still a modest risk of advanced adenoma among average-risk individuals (2.8–4.9%), so patients can choose the screening modality according to their risk threshold. Additionally, according to the present system, 21% of screening participants will be classified as HR. We therefore recommend that this scoring system should be used with care, bearing in mind the importance of efficient use of limited colonoscopy services; we also suggest that the tool should be used to inform participants' screening choice mostly among those who wish to have an estimate of their risk of colorectal neoplasia in addition to advanced neoplasia.

Study strengths and limitations

This study has included a large number of asymptomatic subjects and used a standardised methodology to devise a simple, easy-to-use instrument to stratify risks of colorectal neoplasia. However, a few limitations should be addressed. First, we included a relatively homogeneous population of self-referred screening participants in the derivation and validation cohorts, and their age range was 50–70 years. This might limit the generalisability of the findings to other population groups with different ethnicities, and it ought to be noted that the scoring system is not applicable to subjects outside this age range. Additionally, we have not included all the potential risk factors suspected to be associated with colorectal neoplasia, like dietary intake of red meat, saturated fat, fibre,32 ,33 physical activities,34 and waist circumference, which has recently been found to be a more accurate predictor than BMI.35 Abdominal obesity was found to be associated with an increased risk of adenoma even after adjusting for BMI, but not vice versa. Last, in some busy clinical practices, BMI might not be easy to obtain due to limited consultation time, and ascertainment of family history might be less reliable than other independent predictors. We are of the view, however, that when a patient has agreed to undergo CRC screening but is uncertain which screening tool to use, the scoring system will substantially assist patient choice. This tool is, therefore, particularly suited for patients who are keen to obtain more comprehensive information about their risks.

In summary, we have developed and validated a clinical score for the prediction of colorectal neoplasia in a Chinese population. Future studies should evaluate the scoring system in other countries with a different prevalence of colorectal neoplasia, and assess its acceptability, feasibility and the cost effectiveness of its use in clinical practice and community settings.


We would like to thank all the screening participants who joined the study.




  • Contributors MCSW participated in design of the study, analysis of the results and writing of the first draft of the manuscript; TYLL and KKFT participated in the design of the study and analysis of the results; HWH, VCWC and JYLC conducted the study and analysis of the results; FKLC and JJYS participated in design and performance of the study, analysis and discussion. All authors read and approved the final manuscript. All authors included in the paper fulfil the criteria of authorship. There is no one else who fulfils the criteria but has not been included as an author.

  • Funding The study was supported by a grant from the Hong Kong Jockey Club Charities Trust for the Chief Executive Project ‘Bowel Cancer in Hong Kong: Education, Promotion and Screening’.

  • Competing interests None.

  • Ethics approval Clinical Research Ethics Committee, Chinese University of Hong Kong.

  • Provenance and peer review Not commissioned; externally peer reviewed.
