Cost-effectiveness of one- versus two-sample faecal immunochemical testing for colorectal cancer screening
 S Lucas Goede1,
 Aafke H C van Roon2,
 Jacqueline C I Y Reijerink3,
 Anneke J van Vuuren2,
 Iris Lansdorp-Vogelaar1,
 J Dik F Habbema1,
 Ernst J Kuipers2,4,
 Monique E van Leerdam2,
 Marjolein van Ballegooijen1
 ^{1}Department of Public Health, Erasmus University Medical Centre, Rotterdam, The Netherlands
 ^{2}Department of Gastroenterology and Hepatology, Rotterdam, The Netherlands
 ^{3}Association of Nationwide Screening Southwestern Netherlands, Vlaardingen, The Netherlands
 ^{4}Internal Medicine, Erasmus MC University Medical Centre, Rotterdam, The Netherlands
 Correspondence to S L Goede, Department of Public Health, Erasmus MC, University Medical Centre Rotterdam, PO Box 2040, Rotterdam 3000 CA, The Netherlands; s.goede{at}erasmusmc.nl

Contributors EJK, JDFH, MvB, and MEvL conceived the idea for the study; MvB and ILV supervised the model simulations and data analysis; SLG drafted the report; AHCvR provided critical review of the report. All coauthors listed above were given an opportunity to comment on the paper.
 Revised 17 February 2012
 Accepted 18 February 2012
 Published Online First 5 April 2012
Abstract
Objective The sensitivity and specificity of a single faecal immunochemical test (FIT) are limited. The performance of FIT screening can be improved by increasing the screening frequency or by providing more than one sample in each screening round. This study aimed to evaluate whether two-sample FIT screening is cost-effective compared with one-sample FIT.
Design The MISCAN–colon microsimulation model was used to estimate costs and benefits of strategies with either one- or two-sample FIT screening. The FIT cut-off level varied between 50 and 200 ng haemoglobin/ml, and the screening schedule was varied with respect to age range and interval. In addition, different definitions for positivity of the two-sample FIT were considered: at least one positive sample, two positive samples, or the mean of both samples being positive.
Results Within an exemplary screening strategy, biennial FIT from the age of 55–75 years, one-sample FIT provided 76.0–97.0 life-years gained (LYG) per 1000 individuals, at a cost of €259 000–264 000 (range reflects different FIT cut-off levels). Two-sample FIT screening with at least one sample being positive provided 7.3–12.4 additional LYG compared with one-sample FIT at an extra cost of €50 000–59 000. However, when all screening intervals and age ranges were considered, intensifying screening with one-sample FIT provided equal or more LYG at lower costs compared with two-sample FIT.
Conclusion If attendance at screening does not differ between strategies, it is recommended to increase the number of screening rounds with one-sample FIT screening before considering increasing the number of FIT samples provided per screening round.
 Colorectal cancer screening
 cost-effectiveness
 cost-effectiveness analysis
 decision analysis
 faecal immunochemical test
 gastric cancer
 gastritis
 Helicobacter pylori
 multiple samples
 population-based colorectal cancer screening
 screening
Significance of this study
What is already known on this subject?

Two-sample FIT screening with referral for colonoscopy if at least one sample is positive provides a higher detection rate for advanced neoplasia than one-sample FIT screening.

However, this is at the expense of higher positivity rates and thus the need for more colonoscopies.
What are the new findings?

Within a given screening age range and interval, two-sample FIT screening provides additional LYG compared with one-sample FIT screening at acceptable costs.

Intensifying screening with one-sample FIT provides equal or more LYG at lower costs, compared with screening by means of two-sample FIT.
How might it impact on clinical practice in the foreseeable future?

In order to improve the effectiveness of their CRC screening programme, decision-makers are recommended to increase the number of screening rounds with one-sample FIT screening, before considering increasing the number of FIT samples provided per screening round.
In industrialised countries colorectal cancer (CRC) is the third most commonly diagnosed malignancy in men and ranks second in women.1 The majority of CRC cases are diagnosed later in life. Because life expectancy increases in many countries and the costs of CRC treatment rise rapidly, it is expected that CRC will place an increasing burden on national healthcare systems.
Screening for CRC and its premalignant lesions (ie, adenomatous polyps) can detect the disease at an earlier and more curable stage. Faecal occult blood tests (FOBT) have been developed to detect microscopic bleeding from colorectal neoplasms before there are any clinical signs or symptoms. At least three randomised controlled trials have proved the effectiveness of FOBT screening, demonstrating a mortality reduction of 15–33%.2–4 Subsequently, several screening trials have confirmed the superiority of faecal immunochemical test (FIT) screening over the more traditionally used guaiac-based FOBT (ie, non-rehydrated Hemoccult-II test) with respect to both attendance and the detection rate of advanced neoplasia.5–11 Most of these trials used screening strategies with a single FIT sample.
As not all advanced neoplasia will be detected by means of one-sample FIT screening, providing two FIT samples collected on consecutive days could increase the effectiveness of a screening programme. On the one hand, referring a screenee for a diagnostic colonoscopy when at least one sample is positive increases sensitivity because some colorectal neoplasms bleed intermittently and can therefore be missed with one-sample FIT screening.12 On the other hand, referring a screenee when both samples are positive can increase specificity because only colonic lesions with a more consistent bleeding pattern will be detected, which will lead to fewer false-positive test results. However, either way, providing two FIT samples within one screening round will also increase screening costs because twice the number of samples need to be analysed.
The aim of this study was to evaluate the cost-effectiveness of one-sample and two-sample FIT screening strategies with variable intervals, age ranges and cut-off levels in order to assess whether the increased performance of a second FIT sample outweighs the increased costs compared with one-sample FIT screening.
Materials and methods
We used the MISCAN–colon microsimulation model to estimate the additional life-years gained (LYG) and costs of two-sample FIT screening over one-sample FIT for the screening strategy of biennial FIT from age 55 to 75 years. This screening strategy has intermediate screening intensity and was previously found to be cost-effective.13 Additional LYG can also be achieved by increasing the intensity of one-sample FIT screening instead of adding a second sample. We therefore also compared the costs and LYG of one-sample FIT screening with those of two-sample FIT for a range of screening strategies.
MISCAN–colon microsimulation model
The MISCAN–colon model and the data sources that inform the quantifications of the model are described in detail in supplementary appendix 1, in previous publications,14–18 and in a standardised model profile available online only.19 In brief, the MISCAN–colon model simulates the relevant life histories of a large population of individuals from birth to death. CRC arises in this population according to the adenoma–carcinoma sequence.20,21 More than one adenoma can occur in an individual and each adenoma can independently develop into a CRC. Adenomas progress in size from small (≤5 mm) to medium (6–9 mm) to large (≥10 mm). Although most adenomas will never turn into cancer, some will eventually become malignant, transforming to stage I CRC, and some may even progress to stage IV. In every stage, there is a probability that the CRC is diagnosed because symptoms develop; otherwise it progresses without symptoms into the next stage. If CRC has developed, the survival rate after clinical diagnosis depends on the stage in which the cancer was detected. The 5-year survival rate is on average 90% if the disease is diagnosed while still localised, 68% for regional disease, and less than 10% for disseminated disease. At any time during the development of the disease, the process may be interrupted because a person dies of other causes.
With FIT screening, lesions can be detected before clinical diagnosis; a screened individual with a positive test result will be referred for a colonoscopy for the detection and removal of adenomas and early-stage cancers. In this way, CRC incidence and/or CRC-related mortality can be reduced. The LYG by screening are calculated as the difference in model-predicted life years lived in the population with and without CRC screening.
Study population
In this study we modelled the age distribution of the Dutch population in 2005,22 and all individuals were followed until death. The CRC incidence rate was based on the observed incidence rate in The Netherlands in 1999–2003, which was before the onset of opportunistic screening.23 The observed CRC incidence in the population included cases from higher risk groups. Survival rates after clinical diagnosis of CRC were based on relative survival data from 1985 to 2004 from the south of The Netherlands,24 since nationwide data were not available. The survival for individuals aged 75 years and older was adjusted to fit the observed age-increasing mortality/incidence ratio.23
Screening strategies
CRC screening was simulated in the population starting in 2010. Individuals were offered FIT screening according to different screening schedules varying by:

Age to start screening at, respectively, 45, 50, 55 and 60 years

Age to stop screening at, respectively, 70, 75 and 80 years

Screening interval with, respectively, 1, 1.5, 2 and 3 years
Separate simulations were performed in which individuals were invited for: one-sample FIT screening; two-sample FIT screening with referral if at least one sample tested positive; two-sample FIT screening with referral only if both samples tested positive; or two-sample FIT screening with referral if the mean of both samples was positive. The cut-off level for a positive test result varied between 50, 75, 100, 150 and 200 ng haemoglobin/ml. These different screening schedules with varying start and stop ages, intervals, cut-off levels and samples resulted in a total of 960 different screening strategies.
After a positive test result, individuals were referred for colonoscopy. If no adenomas were found during the procedure, the individual was assumed to be at low risk of CRC and did not return to the screening programme until after 10 years. If one or more adenomas were found, they were removed and the individual entered a surveillance programme according to the Dutch guidelines for follow-up after polypectomy,25 ie, a colonoscopy after 6 years in the case of one or two adenomas and after 3 years in the case of three or more adenomas. We assumed that surveillance colonoscopies would be performed until the stop age for screening.
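The strategy space described above can be sketched in a few lines of Python. This is only an illustration of the counting and of the four referral rules, not the MISCAN–colon implementation; the names (`enumerate_strategies`, `is_positive`, the rule labels) are hypothetical, and treating a sample as positive when its concentration is at or above the cut-off is an assumption, since the text does not state whether the threshold is inclusive.

```python
from itertools import product

# Schedule dimensions as listed in the text.
START_AGES = (45, 50, 55, 60)
STOP_AGES = (70, 75, 80)
INTERVALS_YEARS = (1, 1.5, 2, 3)
CUTOFFS_NG_ML = (50, 75, 100, 150, 200)
# Hypothetical labels for the four sampling/referral definitions.
SAMPLE_RULES = (
    "one_sample",
    "two_sample_at_least_one",
    "two_sample_both",
    "two_sample_mean",
)


def enumerate_strategies():
    """Return every combination of start age, stop age, interval, cut-off and rule."""
    return list(
        product(START_AGES, STOP_AGES, INTERVALS_YEARS, CUTOFFS_NG_ML, SAMPLE_RULES)
    )


def is_positive(rule, cutoff, samples):
    """Decide referral from haemoglobin concentrations (ng/ml) under a given rule.

    Assumption: a sample counts as positive when it is >= the cut-off.
    """
    if rule == "one_sample":
        return samples[0] >= cutoff
    first, second = samples
    if rule == "two_sample_at_least_one":
        return first >= cutoff or second >= cutoff
    if rule == "two_sample_both":
        return first >= cutoff and second >= cutoff
    if rule == "two_sample_mean":
        return (first + second) / 2 >= cutoff
    raise ValueError(f"unknown rule: {rule}")


strategies = enumerate_strategies()
print(len(strategies))  # 4 * 3 * 4 * 5 * 4 combinations
```

The product 4 × 3 × 4 × 5 × 4 reproduces the 960 strategies reported in the text. The three two-sample rules differ only in how the two measurements are combined, which is why the same pair of measurements can lead to referral under one rule and not under another.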
Attendance rates
We modelled attendance rates in the first screening round as observed in two Dutch population-based CRC screening trials;9,11,12 60% for both one- and two-sample FIT screening, and we assumed these rates would remain stable over time. For subsequent screening rounds, we assumed that 80% of the individuals who attended the previous screening round would attend again.26,27 Furthermore, we assumed that 10% of the individuals never attended FIT screening28 and that these never-attenders had a higher risk of CRC than the general population (RR 1.15).2 Attendance at diagnostic colonoscopies following a positive FIT and subsequent surveillance colonoscopies were assumed to be 85% and 80%, respectively.29
Test characteristics
Test characteristics of the one-sample and two-sample FIT tests were fitted to the positivity rates and detection rates of advanced neoplasia observed in the first screening round of two Dutch randomised trials (table 1).9–12 Advanced neoplasia included CRC and advanced adenomas, of which the latter were defined as adenomas of 10 mm or greater in size, with 25% or greater villous component, and/or high-grade dysplasia.
To estimate the two-sample FIT test characteristics, we applied the following approach. We used the average positivity rates and detection rates of the first and second tests performed in the two-sample FIT group as the reference and calculated the relative difference in performance when both samples were evaluated. Subsequently, we added this relative difference to the positivity rates and detection rates derived from the original one-sample FIT trials. An example of this method of calculation is presented in figure 1. The main reasons for this approach were: (1) the larger sample size of the one-sample FIT group provides more statistical power for the estimates of test sensitivity and specificity; (2) it avoids possible bias caused by the fact that the positivity rates and detection rates of the one-sample and two-sample FIT groups were calculated from different cohorts that were not 1:1 randomly assigned before invitation;10,12 (3) it uses paired observations, which give a better estimate of the additional performance of a second FIT sample.
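A minimal sketch of this adjustment, under stated assumptions, is given below. The function names are hypothetical, the input rates are placeholder values chosen for easy arithmetic (they are not trial data), and applying the relative difference multiplicatively to the one-sample rate is one reading of "added this relative difference"; figure 1 of the paper is the authoritative example.

```python
def relative_difference(two_sample_rate, first_sample_rate, second_sample_rate):
    """Relative change of a two-sample rule versus the mean single-sample rate.

    All three rates come from the same two-sample cohort (paired observations).
    """
    reference = (first_sample_rate + second_sample_rate) / 2
    return (two_sample_rate - reference) / reference


def adjusted_two_sample_rate(one_sample_trial_rate, rel_diff):
    """Apply the relative difference to a rate from the one-sample trials.

    Assumption: the relative difference is applied multiplicatively.
    """
    return one_sample_trial_rate * (1 + rel_diff)


# Placeholder percentages, NOT trial results.
rel = relative_difference(
    two_sample_rate=12.3,      # e.g. positivity with at least one positive sample
    first_sample_rate=8.0,     # positivity of the first sample alone
    second_sample_rate=8.4,    # positivity of the second sample alone
)
estimate = adjusted_two_sample_rate(one_sample_trial_rate=8.0, rel_diff=rel)
print(round(rel, 3), round(estimate, 2))
```

The same two functions would apply unchanged to detection rates, and to each two-sample positivity definition and cut-off level in turn.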
The sensitivity of diagnostic colonoscopies was assumed to be 75% for adenomas 1–5 mm, 85% for adenomas 6–9 mm, and 95% for adenomas 10 mm or greater and CRC.30
Costs
The analysis was conducted from a healthcare system perspective. In the base case analyses, we included screening and treatment costs as presented in table 2. Base case organisational costs for one-sample FIT screening were based on the Dutch cervical cancer screening programme, adjusted for differences with FIT screening. Costs for the test kits were based on prices from the manufacturer. Costs for analysis of the tests included material and personnel needed during the process of registration, analysis and authorisation of returned tests.34 The additional costs associated with two-sample FIT screening included double costs for FIT test kits and packaging material, and double costs for materials needed during the analysis of returned samples. Although double the number of FIT samples would need to be analysed, the costs of personnel needed for the analysis only increased by a factor of 1.5 because some tasks (eg, patient registration) do not require double the amount of work compared with analysing samples with one-sample FIT screening. Colonoscopy costs were based on an internal 6-month study at the Erasmus MC (data not shown). Costs for complications after colonoscopy were based on diagnosis treatment combination (DTC) rates derived from the Dutch Health Care Authority.35
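The scaling rules for two-sample FIT costs stated above can be written as a small sketch. The class and function names are hypothetical and the euro amounts in the example are placeholders, not the values from table 2; only the multipliers (2 for kits, packaging and analysis materials, 1.5 for analysis personnel) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitUnitCosts:
    """Per-person FIT cost components in euros (illustrative structure only)."""
    kit_and_packaging: float
    analysis_materials: float
    analysis_personnel: float

    @property
    def total(self):
        return self.kit_and_packaging + self.analysis_materials + self.analysis_personnel


def two_sample_costs(one_sample):
    """Scale one-sample cost components to two-sample FIT as described in the text."""
    return FitUnitCosts(
        kit_and_packaging=2 * one_sample.kit_and_packaging,     # double kits and packaging
        analysis_materials=2 * one_sample.analysis_materials,   # double analysis materials
        analysis_personnel=1.5 * one_sample.analysis_personnel, # shared tasks, e.g. registration
    )


# Placeholder amounts, NOT the table 2 values.
base = FitUnitCosts(kit_and_packaging=2.0, analysis_materials=1.0, analysis_personnel=4.0)
doubled = two_sample_costs(base)
print(base.total, doubled.total)
```

Because personnel costs scale by 1.5 rather than 2, the total for two samples is always less than twice the one-sample total whenever personnel costs are non-zero.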
Costs for treatment of CRC were divided into three clinically relevant phases of care: initial treatment, continuous care and terminal care. Initial treatment costs were based on DTC rates, except for oxaliplatin. The costs for oxaliplatin were derived from the Dutch Health Care Insurance Board.36 We assumed that during the continuous care phase, individuals would follow the Dutch CRC treatment guidelines,37 and costs for periodic control were based on DTC rates. Terminal care costs were based on a Dutch last year of life cost analysis. These were estimated at €19 700 for patients who ultimately died from CRC.38 We assumed that these costs increased with stage at diagnosis, at a rate observed for US patients.39 ,40 Dutch terminal care costs for individuals who died from CRC were approximately 40% of the US costs. We assumed that terminal care costs of CRC patients who die from other causes were also 40% of the US costs.
Cost-effectiveness analyses
For all screening strategies we used the MISCAN–colon model to estimate costs and compare the number of LYG due to screening with the situation without screening. Costs and LYG were discounted by 3% per year.41 Strategies that were more costly and less effective than other strategies were ruled out by simple dominance. Strategies that were more costly and less effective than a mix of other strategies were ruled out by extended dominance. The remaining strategies are not dominated and are known as ‘efficient’. On a plot of LYG versus costs, the line that connects the efficient strategies is called the efficient frontier, which implies that all dominated strategies lie below this line. The incremental cost-effectiveness ratio (ICER) of an efficient strategy was determined by comparing its additional costs and effects with those of the next less costly and less effective efficient strategy.
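The dominance rules and the ICER definition above can be sketched as follows. This is a generic efficient-frontier routine, not code from the study; the `Strategy` type, the function names and the example figures are hypothetical, and the inputs are assumed to be already discounted costs and LYG.

```python
from typing import List, NamedTuple, Optional, Tuple


class Strategy(NamedTuple):
    name: str
    cost: float  # discounted cost, e.g. euros per 1000 individuals
    lyg: float   # discounted life-years gained per 1000 individuals


def icer(cheaper: Strategy, dearer: Strategy) -> float:
    """Incremental cost per additional LYG between two strategies."""
    return (dearer.cost - cheaper.cost) / (dearer.lyg - cheaper.lyg)


def efficient_frontier(strategies: List[Strategy]) -> List[Tuple[Strategy, Optional[float]]]:
    """Return the efficient strategies with their ICERs, ordered by cost."""
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.lyg))

    # Simple dominance: drop anything no more effective than a cheaper option.
    undominated: List[Strategy] = []
    best_lyg = float("-inf")
    for s in ordered:
        if s.lyg > best_lyg:
            undominated.append(s)
            best_lyg = s.lyg

    # Extended dominance: ICERs along the frontier must strictly increase,
    # otherwise a mix of the neighbouring strategies does better.
    frontier: List[Strategy] = []
    for s in undominated:
        while len(frontier) >= 2 and icer(frontier[-2], frontier[-1]) >= icer(frontier[-1], s):
            frontier.pop()
        frontier.append(s)

    result: List[Tuple[Strategy, Optional[float]]] = [(frontier[0], None)]
    for previous, current in zip(frontier, frontier[1:]):
        result.append((current, icer(previous, current)))
    return result


# Hypothetical figures for illustration only.
example = [
    Strategy("no screening", 0, 0),
    Strategy("B", 100, 10),
    Strategy("C", 200, 12),  # ruled out by extended dominance (mix of B and D)
    Strategy("D", 250, 20),
    Strategy("E", 300, 15),  # ruled out by simple dominance (D is cheaper and better)
]
for strategy, ratio in efficient_frontier(example):
    print(strategy.name, ratio)
```

Including the comparator (here "no screening") in the input lets the first screening strategy on the frontier receive an ICER relative to it; the comparator itself has no ICER and is reported with `None`.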
Sensitivity analyses
We performed several sensitivity analyses on different parameters, which are summarised in table 2. We started with sensitivity analyses with respect to the additional performance and costs of two-sample FIT over one-sample FIT. Furthermore, we adjusted for reduced quality of life due to screening as well as CRC treatment. Correlated FIT test results were assumed because individuals with a false-negative test result are likely to have a higher than average probability to have another false-negative test result at a successive screening round. We used the results of a population-based CRC screening programme in Italy to estimate the correlation between false-negative FIT results for cancers and advanced adenomas in subsequent screening rounds.33 Effects of limited colonoscopy capacity were evaluated by only considering strategies in which colonoscopy demand did not exceed 40, 20, 10, or five colonoscopies per 1000 individuals per year. In order to assess the cost-effectiveness of the different strategies for individuals who adhere to the CRC screening guidelines, we simulated all screening strategies with 100% attendance to screening, diagnostic and surveillance colonoscopies. In addition, we performed sensitivity analyses on lower and higher values than the base case analysis for fatal complication rates with colonoscopy and for unit costs of FIT, colonoscopy, complications and treatment. We decided not to perform a probabilistic sensitivity analysis after having weighed the limited added value against the computational effort required (see Discussion).
Results
The strategy of biennial one-sample FIT screening from age 55 to 75 years yielded 76.0–97.0 LYG per 1000 individuals aged 45 years and older, compared with no screening (the range in LYG reflects different FIT cut-off levels). The associated costs ranged from €259 000 to €264 000 per 1000 individuals, corresponding with €2690–3473 per LYG compared with no screening (figure 2). The two-sample FIT screening strategies with the mean of both test results being positive and with at least one test result being positive provided, respectively, −0.3 to 2.6 and 7.3–12.4 more LYG than one-sample FIT screening at additional costs of, respectively, €43 000–50 000 and €50 000–59 000 per 1000 individuals. The corresponding ICERs were, respectively, €16 818–31 930 and €4024–8041 per additional LYG. The two-sample FIT screening strategies with two positive outcomes were less effective (ie, fewer LYG per 1000 individuals) and more costly than one-sample FIT screening, and were therefore dominated from a cost-effectiveness standpoint (see supplementary appendix 2, available online only, for detailed results on effects and costs for the different biennial FIT screening strategies with the age range of 55–75 years).
When all simulated screening strategies were considered (ie, by varying not only the cut-off level, but also the screening age range and interval), the number of LYG compared with no screening ranged between 17.5 and 153.4 per 1000 individuals, and costs ranged between €105 000 and €889 000 per 1000 individuals (figure 3). The LYG and costs of the strategies on the efficient frontier are presented in table 3. Although the ICER of biennial two-sample FIT screening between ages 55 and 75 years (mean of both samples being positive, or at least one sample being positive) compared with one-sample FIT seemed reasonable, table 3 shows that most two-sample FIT strategies are not cost-effective. When comparing the additional effect of providing two samples per screening round to the effect of providing one-sample FIT more frequently (ie, with a larger age range and/or shorter interval), the latter provided more LYG at equal or lower costs than the two-sample FIT strategies. This effect is also demonstrated in figure 2, because the strategies of biennial two-sample FIT are located below the efficient frontier. The two-sample FIT screening strategies with the mean from both test results being positive or at least one positive test outcome were therefore ruled out by extended dominance and were considered not to be cost-effective compared with one-sample FIT screening. Although figure 2 demonstrates this effect for biennial FIT screening, the principle applies to all screening intervals, including annual screening.
Sensitivity analyses
The higher cost-effectiveness of more frequent one-sample FIT screening compared with two-sample FIT strategies was robust to alterations in our model assumptions. However, decreasing the cost difference between one-sample and two-sample FIT by 50% resulted in multiple two-sample FIT strategies becoming efficient next to one-sample FIT. In addition, limited colonoscopy capacity did not affect the preference of one-sample FIT over two-sample FIT strategies, with the exception of the most stringent scenario. When colonoscopy demand was not allowed to exceed five colonoscopies per 1000 individuals per year, two-sample FIT strategies with both samples being positive were preferred over one-sample FIT.
Discussion
Our analysis demonstrates that, given a screening schedule (ie, age range and screening interval), two-sample FIT strategies with the mean from both test results being positive or at least one positive test outcome provide more LYG at acceptable costs than one-sample FIT screening. However, when all simulated screening strategies are considered (ie, including varying age ranges and screening intervals), increasing the screening intensity of one-sample FIT testing (ie, greater age range and/or shorter screening interval) is more cost-effective than providing two FIT samples within one screening round.
This study was based on data from a randomised trial in which the attendance and diagnostic yield of one- and two-sample FIT were compared.12 Considering only the relation between the positivity rate and the detection rate of advanced adenomas, it would seem advisable to choose between one-sample and two-sample FIT screening on the basis of the available colonoscopy capacity. However, the current analysis demonstrates that including the costs for the screening and treatment of CRC over multiple screening rounds affects the relation between one- and two-sample FIT. Although a number of two-sample FIT screening strategies (eg, with at least one sample, or the mean of both samples being positive) are close to the cost-efficiency frontier, increasing the number of one-sample FIT screening rounds was found to be a more cost-effective way of gaining health benefits.
Other cost-effectiveness analyses determining the optimal number of FIT samples are limited. Two Japanese studies compared the costs per cancer detected of FIT screening with one, two or three FIT samples in a single screening round.42,43 In all three sampling strategies individuals were referred for diagnostic colonoscopy if at least one sample was positive. In both studies it was concluded that two-sample FIT screening with at least one test being positive would be the most desirable strategy from a diagnostic accuracy and cost-effectiveness standpoint. A more recent French study did include multiple screening rounds in its cost-effectiveness model and also evaluated the effect of different cut-off levels.44 The authors concluded that three-sample FIT screening with a cut-off level of 50 ng haemoglobin/ml was the most cost-effective strategy. The results of our current analysis agree with these studies about the added value of multiple FIT sampling within a given screening schedule. More than one FIT sample can provide additional health benefits at acceptable costs. Unfortunately, these studies do not provide information comparing the added effect of multiple FIT samples per screening round with the effect of increasing screening intensity with one-sample FIT.
Several limitations need to be acknowledged. First, we based our analysis on data from one screening round. Therefore, we could not estimate the correlation of test outcomes between successive screening rounds. Individuals with a false-negative test result (eg, because the lesion did not bleed) in one screening round may have a higher than average probability to have another false-negative test result at a successive screening round. Therefore, we performed a sensitivity analysis based on Italian results,31 in which correlation of systematic false-negative test outcomes was assumed for advanced adenomas and CRC. The analysis showed that the cost-effectiveness of two-sample FIT decreased less than the cost-effectiveness of one-sample FIT strategies, but one-sample FIT screening remained dominant. Nevertheless, we need further data from repeat screening rounds in The Netherlands to get a good estimate of systematic false-negative rates in the population we modelled. Second, we assumed the screening attendance rate to be independent of screening intensity and the number of FIT samples performed. In the first screening round of one of the Dutch trials,10–12 the screening attendance rate was not significantly different between the two-sample FIT and one-sample FIT study arm (61.3% vs 61.5%; p=0.837). However, it could be hypothesised that, for example, adherence in the case of a more intense screening schedule with one-sample FIT would decrease compared with less intense screening schedules with two-sample screening. This would negatively affect the cost-effectiveness of more intensive screening strategies relative to two-sample testing and might alter our conclusions. Third, we based our analyses on a screening-naive population. Depending on the amount of previous screening, CRC incidence in the population and the resulting cost-effectiveness could be lower. However, this would affect the strategies we compared in a similar way. If anything, the effect of previous screening would make one-sample FIT screening more preferable, because a lower CRC incidence would reduce the added value of a second FIT sample. Finally, we did not perform a probabilistic sensitivity analysis. Given the large number of strategies that have to be evaluated for each draw, such an analysis would require a huge computational effort. We believe that simulating the range of varying strategies is one of the strengths of this analysis, because we were primarily interested in the comparison of different FIT screening strategies with varying numbers of samples provided, FIT cut-off levels, screening intervals and age ranges. Regardless of this, data on the probability distributions of most of the parameter values are lacking, which makes the interpretation of a probabilistic sensitivity analysis difficult and the outcome of limited added value. One of the most uncertain assumptions of the model is that all CRC arise from adenoma precursors. For FIT screening, this assumption will have limited impact because FIT has a low sensitivity for adenomas. In addition, the assumption of non-bleeding (and therefore for FIT undetectable) adenomas was evaluated in the sensitivity analysis by assuming correlation between false-negative results.
In conclusion, our analysis provides new insights for decision-makers; in a situation in which attendance to screening does not differ between strategies, intensifying screening with one-sample FIT was found to be more cost-effective than providing two FIT samples within one screening round. It is therefore recommended to increase the number of screening rounds with one-sample FIT screening, before considering increasing the number of FIT samples provided per screening round.
Footnotes

Funding This trial was funded by the Dutch Cancer Society (EMCR 20063673), the Dutch Ministry of Health, Health Care Prevention Program–Implementation (ZonMw 63300022 and ZonMw 120720011), Olympus Medical Systems Europe GmbH, Hamburg, Germany, the Jacoba Foundation and Eiken Chemical Co., Tokyo, Japan. The funding sources had no influence on study design, data collection, monitoring, analysis and interpretation of results or the decision to submit the manuscript for publication.

Competing interests None.

Provenance and peer review Not commissioned; externally peer reviewed.
This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non-commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/3.0/ and http://creativecommons.org/licenses/by-nc/3.0/legalcode