
Faecal immunochemical tests versus guaiac faecal occult blood tests: what clinicians and colorectal cancer screening programme organisers need to know
Jill Tinmouth1, Iris Lansdorp-Vogelaar2, James E Allison3

1 Department of Medicine, Sunnybrook Health Sciences Centre and University of Toronto, Toronto, Canada
2 Department of Public Health, Erasmus MC University Medical Center, Rotterdam, The Netherlands
3 Division of Gastroenterology, University of California San Francisco, San Francisco, California, USA

Correspondence to Dr Jill Tinmouth, Sunnybrook Health Sciences Centre, 2075 Bayview Ave Rm HG40, Toronto, Ontario, Canada M4N 3M5; jill.tinmouth{at}sunnybrook.ca

## Abstract

• COLORECTAL CANCER SCREENING
• CANCER PREVENTION

## Background

### CRC burden

Colorectal cancer (CRC) is an important health problem; it is the third most common cause of cancer-related death in women and fourth among men globally.1 Given its considerable impact, there is a need to develop and systematically implement strategies to reduce the risk of developing and dying from CRC. Fortunately, effective interventions to reduce these risks exist, including screening.2–9

### Principles of screening

Screening aims to detect cancer or precancer before symptoms appear, when it is more likely to be curable. Tests used for screening should be ‘simple, safe, precise and validated’.10 Safety is particularly important because screening tests, unlike diagnostic tests, are used in asymptomatic populations11 whose pretest probability of disease is low. The selected test should minimise the chances of making a well person unwell. The average individual values safety, which may inform their decision to be screened. This issue was highlighted during the recent launch of the Dutch faecal immunochemical test (FIT)-based CRC screening programme, when newspaper articles reported on the risks of colonoscopy resulting from FIT.12,13 Safety and simplicity favour faecal tests for haemoglobin over colonoscopy for CRC screening, even though colonoscopy has been promoted as the best screening test by some opinion leaders.14

Generally, screening tests stratify risk (ie, identify a subgroup with a higher probability of disease for more invasive testing), while diagnostic tests are used when disease is suspected on clinical grounds (ie, where the pretest probability has already been determined to be high and disease needs to be ruled in or out). A good diagnostic test may not necessarily be the ideal screening test.15 Clinicians, however, may be more comfortable using a diagnostic test (such as colonoscopy) for screening when they perceive that the performance of the screening test (such as guaiac-based faecal occult blood tests (gFOBT)) is poor. A FIT is superior to gFOBT in this regard; its test characteristics are comparable with those of other commonly accepted screening tests such as the cervical Pap smear and mammography.

Clinicians should be familiar with the differences between opportunistic and organised screening (table 1).16 The way that the target population is invited is a key distinguishing factor. In organised screening, invitations are systematically issued to an entire target population using a centralised registry. In opportunistic screening, the invitation is sporadic, often occurring during an encounter with a healthcare provider or at the request of the individual. Organised programmes are generally implemented geographically (ie, inviting the eligible population of a region or a country); however, some American health maintenance organisations (eg, Kaiser Permanente) have programmes that also meet the definition of organised CRC screening. CRC screening is now increasingly performed in an organised fashion in many countries in the developed world.17 In most of these organised programmes, faecal tests for haemoglobin have been selected as the primary screening test. Among the various faecal tests, recent guidance recommends FIT over gFOBT or high sensitivity gFOBT (sFOBT).18–20

Table 1

Similarities and differences between aspects of organised and opportunistic screening (reproduced with permission from Miles et al 16)

### Tests for faecal haemoglobin: how do they work?

The guaiac-based method of detecting occult blood in the faeces was the earliest approach to CRC screening.21 This approach relies on the pseudoperoxidase activity of haem, which facilitates oxidisation of guaiac when hydrogen peroxide is added. For gFOBT, ‘dry’ collection is used where two faecal samples from each of three separate stools are placed on a card (figure 1). Although gFOBT have good clinical specificity, they have low analytical and clinical sensitivity; that is, they do not detect haemoglobin concentrations below approximately 600 µg/g faeces (analytical sensitivity)22 ,23 and on one time testing, they have relatively high false negative rates for detecting CRC (clinical sensitivity). Furthermore, gFOBT are susceptible to interference from some foods and drugs,23 leading to the recommendation for diet and medication restriction during faecal sample collection. However, as clinical benefit is questionable and participation may be adversely affected, only some24 but not all25 CRC screening programmes make this recommendation. In some trials, rehydrated gFOBT were studied;26 rehydration reduces the false negative rate (improves sensitivity) while increasing the false positive rate (reduces specificity). In addition, there are sFOBT that have similar test characteristics to rehydrated gFOBT.27–29 sFOBT have an enhancer, which allows detection of lower peroxidase activity, conferring greater analytical sensitivity than traditional gFOBT.

Figure 1

Guaiac-based faecal occult blood test card: there are six windows in total; participants place two faecal samples from each of three separate stools using a wooden applicator.

FIT are immunoassays specific for human haemoglobin, forming an antibody-antigen complex with its globin moiety.25 Typically, one or two faecal samples are collected and no dietary restriction is required. A variety of methods may be used to detect the antibody-antigen complexes. Some FIT are qualitative, providing an end point that is read as positive or negative by eye if the faecal haemoglobin concentration exceeds a manufacturer-specified threshold. Generally, qualitative FIT use a lateral flow immunochromatographic approach similar to other point-of-care tests, such as at-home pregnancy tests (figure 2). Qualitative FIT use ‘dry’ or ‘wet’ (see below) collection methods. There are inherent differences among qualitative FIT, particularly as each manufacturer sets their own threshold for a positive test. In addition, the need for visual interpretation of the results (rather than a numerical result) introduces interobserver variation. Not surprisingly, qualitative FIT differ substantially in performance.30 ,31

Figure 2

Depiction of lateral flow immunochromatographic analysis used in point-of-care faecal immunochemical tests for haemoglobin (reproduced with permission from Allison et al18).

Quantitative FIT, by contrast, use immunoturbidimetric methods to measure the actual concentration of faecal haemoglobin. Quantitative FIT typically rely on ‘wet’ collection where faeces are sampled using specimen collection devices (comprising a probe or a brush and a vial that contains a buffer) that are designed for direct sampling on the analytical systems (figure 3). The globin may degrade between sample collection and analysis if stabilisers are not added to the buffer, leading to loss of signal. Automated analysis allows for high volume, standardised processing, reducing performance variation although not eliminating it completely. Other sources of performance variation within and across available quantitative FIT result from the ability of the devices to reliably collect standard quantities of faeces, the differences in the capacities of the buffers to stabilise the haemoglobin over time and temperature, the changing composition of buffers over time as well as the analytical technique used (eg, latex vs colloidal gold agglutination). Moreover, as the antibodies to haemoglobin and the epitopes they detect vary, different quantitative FIT, using the same quantitative threshold (measured in µg haemoglobin/g faeces, see below), can differ in terms of performance.32 Therefore, not all FIT are the same; in fact, the differences between some are substantial enough to lead to important differences in clinical performance.30–34

Figure 3

Faecal immunochemical test kit: probe is inserted into the stool then placed into a vial containing a haemoglobin stabilising buffer for transport to the laboratory.

### The evidence for gFOBT in preventing death from CRC

Three landmark randomised controlled trials from the USA, UK and Denmark, published in the 1990s, demonstrated that screening with gFOBT prevents death from CRC in average risk persons.3,4,7 Each of these trials demonstrated a mortality benefit, while the American trial, which, unlike the others, predominantly used rehydrated gFOBT, also demonstrated a reduction in the incidence of CRC.8 A subsequent meta-analysis, which included these trials, reported a 15% reduction in CRC-death among those randomised to screening with gFOBT compared with controls but failed to demonstrate a change in CRC incidence.35

### The evidence for FIT superiority over gFOBT for organised CRC screening

To date, there are no controlled trials that demonstrate that FIT are superior to gFOBT or to no screening in terms of reducing CRC-related mortality in average risk persons. However, a recent observational study from Italy demonstrated a reduction in CRC-related mortality in regions where screening with FIT was adopted compared with regions where screening had not yet been implemented.36

Using different laboratory approaches, FIT and gFOBT identify components of an accepted biomarker (blood), screening for which has been shown to reduce CRC-related mortality. Experts have defined criteria to validate a new CRC screening test in the absence of controlled trials;37,38 specifically, that there is convincing evidence that the new test: (1) has at least comparable performance (eg, sensitivity and specificity) in detecting CRCs and adenomas;37 (2) is equally acceptable to patients and (3) has comparable or lower complication rates and costs.38 FIT meet these criteria: compared with gFOBT, FIT have better clinical and analytical sensitivity and greater detection of CRC precursors, improve participation, and are cost-effective. As such, it is highly likely that FIT will have an important effect on CRC incidence and mortality beyond that achievable with gFOBT.

#### Participation

Participation with CRC screening is critical to its effectiveness; in fact, microsimulation studies of screening tests have shown that small gains in participation can offset large differences in efficacy.39 ,40 Improved screenee participation with FIT compared with gFOBT or sFOBT is therefore important. Five population-based randomised controlled trials in persons at average risk for CRC comparing these faecal tests (table 2) found an absolute increase in participation ranging from 5.4% to 16.2%41–45 while a sixth trial29 from Israel found greater participation with gFOBT with an absolute difference of 2.9%. A meta-analysis of the studies comparing FIT to gFOBT reported better participation with FIT (RR: 1.16, 95% CI 1.03 to 1.3).46 Although not compared head to head, participation in organised screening programmes using FIT appears better than in similar programmes using gFOBT.47–49 FIT may also reduce participation gaps in vulnerable populations.48

Table 2

Characteristics and results from selected population-based randomised controlled trials examining participation among participants receiving FIT and traditional gFOBT or high sensitivity gFOBT (sFOBT)

In the six controlled studies described above (table 2), participants were invited by mail; the test kits (FIT and gFOBT or sFOBT) were included with the invitation in three of the studies. Superior participation with FIT seemed to correlate with the smaller number of faecal samples required. Dietary/medication restrictions were imposed for those randomised to the gFOBT in four studies; none of the studies restricted diet or medications for those randomised to FIT. These differences may be responsible for greater screenee participation with FIT but do not appear to compromise its accuracy. Additionally, the collection method for FIT compared with that for gFOBT may also assist with acceptability as it reduces contact with faeces and may appear more scientific.18 ,45

#### Test characteristics

A recent meta-analysis of 19 studies reported the sensitivity and specificity of FIT for the detection of CRC.50 Studies were restricted to those that studied average-risk asymptomatic populations and used a reference standard of colonoscopy or follow-up via registry/medical records for two or more years. The overall pooled sensitivity and specificity of FIT for CRC were 79% (95% CI 69% to 86%) and 94% (95% CI 92% to 95%), with an overall accuracy of 95% (95% CI 93% to 97%) (figure 4).

Figure 4

Pooled sensitivity and specificity for faecal immunochemical tests for colorectal cancer from meta-analysis of studies of asymptomatic, average-risk persons undergoing an appropriate reference standard (reproduced with permission from Lee et al50).
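Because screening is applied to populations whose pretest probability of disease is low, the pooled sensitivity and specificity reported above translate into predictive values that depend strongly on prevalence. The Python sketch below illustrates this with Bayes' rule. The 0.5% CRC prevalence is a hypothetical figure chosen purely for illustration and does not come from the meta-analysis; the function name is likewise an assumption.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Pooled estimates quoted in the text for FIT and CRC.
FIT_SENSITIVITY = 0.79
FIT_SPECIFICITY = 0.94

# ASSUMPTION: illustrative prevalence only, not a figure from the article.
ASSUMED_PREVALENCE = 0.005

ppv, npv = predictive_values(FIT_SENSITIVITY, FIT_SPECIFICITY, ASSUMED_PREVALENCE)
print(f"PPV for CRC: {ppv:.1%}, NPV for CRC: {npv:.2%}")
```

Under this assumed prevalence, most positive results would not be cancers, which is in keeping with the role of FIT as a risk-stratifying test that selects people for colonoscopy rather than a diagnostic test.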

Cohort studies have compared the use of one time gFOBT and FIT in the same average-risk asymptomatic individuals using a standard outcome (colonoscopic evaluation and/or follow-up via medical records and telephone) (table 3). In the studies that compared traditional gFOBT to FIT,28,51–53 the absolute increase in sensitivity for CRC of FIT relative to gFOBT ranged from 31.7% to 61.5%. Specificity was maintained in three of the four studies reviewed;28,51,52 in the last study, it was notably poorer, likely because a lower positivity threshold was used.53 Generally, with FIT, a smaller number of colonoscopies would be needed to detect one CRC than with gFOBT, although this number is also sensitive to the FIT positivity threshold. In the study comparing a sFOBT to a FIT not currently on the market,28 specificity of the FIT was better than for the sFOBT, but sensitivity was worse. Overall, currently available FIT appear to perform better than gFOBT for CRC detection.

Table 3

Results from selected cohort studies comparing test characteristics of one-time FIT versus traditional gFOBT or high sensitivity gFOBT (sFOBT) for detection of CRC (as determined via full colonoscopy or follow-up via medical records or telephone) in average-risk individuals

#### Detection of advanced colorectal neoplasms

A meta-analysis of randomised controlled trials comparing gFOBT to FIT found that FIT detect more than twice as many CRCs and advanced adenomas (RR: 2.28, 95% CI 1.68 to 3.10).46 In cohort studies comparing gFOBT and FIT where all patients had colonoscopy, FIT also detected approximately twice as many CRCs and advanced adenomas as gFOBT51,52,54 and, in two of the three studies, fewer colonoscopies were required to detect one advanced lesion.51,52,54 The stage distribution of CRC is improved in those screened with FIT relative to those who are not screened at all47 and to those who are screened with gFOBT.55 FIT are better at detecting the immediate precursors to CRC (advanced adenomas), suggesting that, unlike gFOBT, they may have an impact on CRC incidence.

### The evidence comparing FIT with endoscopy

There are no published trials to date comparing FIT with endoscopy for the outcome of CRC-related mortality. There are randomised controlled trials that compare one-time FIT to endoscopy for the lesser outcomes of participation and detection of cancer or precancerous lesions. The sensitivity of a one-time FIT is called the ‘application’ sensitivity while the sensitivity of repeated testing is sometimes referred to as ‘programmatic’ sensitivity.56 In a ‘real world’ setting, FIT are implemented serially over time (ie, every 1–2 years) and as such the programmatic sensitivity is a better measure of actual performance than the one-time application sensitivity reported in trials to date. Two large, well designed randomised trials comparing serial FIT to endoscopy over time for the outcome of CRC mortality are underway and will address this limitation.57 ,58

#### FIT versus flexible sigmoidoscopy

Two randomised controlled trials have compared one-time FIT with flexible sigmoidoscopy for CRC screening in average risk populations59 ,60 for the outcomes of participation and detection of advanced colorectal neoplasms. Participation with FIT was nearly twice that of flexible sigmoidoscopy in the Dutch trial59 while it was essentially the same in the Italian trial;60 however, in the Dutch trial the FIT kit was sent with the letter of invitation while in the Italian trial, participants received a letter asking them to contact their physician to obtain a kit. As expected, flexible sigmoidoscopy detected more advanced colorectal neoplasms than FIT among those who completed the screening test. However, participation had a strong effect on detection rates. In the Dutch trial where there was better participation, 14 CRCs were detected among those invited for FIT (n=5007) and 8 among those invited for flexible sigmoidoscopy (n=5000) although more advanced adenomas were detected with flexible sigmoidoscopy (103 vs 59 with FIT).
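The effect of participation on yield can be made concrete by expressing the Dutch trial counts quoted above per 1000 people invited, rather than per person screened. The following Python sketch does only that arithmetic; the function and variable names are illustrative assumptions, and the counts are the ones given in the text.

```python
def per_1000_invited(detected: int, invited: int) -> float:
    """Lesions detected per 1000 people invited (an intention-to-screen yield)."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return 1000 * detected / invited

# Counts reported in the text for the Dutch trial.
FIT_INVITED = 5007
SIGMOIDOSCOPY_INVITED = 5000

crc_fit = per_1000_invited(14, FIT_INVITED)
crc_fs = per_1000_invited(8, SIGMOIDOSCOPY_INVITED)
adenoma_fit = per_1000_invited(59, FIT_INVITED)
adenoma_fs = per_1000_invited(103, SIGMOIDOSCOPY_INVITED)

print(f"CRC per 1000 invited: FIT {crc_fit:.1f} vs sigmoidoscopy {crc_fs:.1f}")
print(f"Advanced adenomas per 1000 invited: FIT {adenoma_fit:.1f} vs sigmoidoscopy {adenoma_fs:.1f}")
```

Dividing by the number invited rather than the number screened is a deliberate choice here: it captures the combined effect of test performance and participation, which is the quantity a programme organiser ultimately observes.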

#### FIT versus colonoscopy

One-time FIT have been compared with colonoscopy in three randomised trials performed in average risk populations.58,60,61 As with flexible sigmoidoscopy, participation with FIT was generally better than with colonoscopy (in two58,61 of the three trials), which particularly affected CRC detection rates. More advanced adenomas were detected among those invited to colonoscopy than among those invited to FIT in all three trials. However, in the Spanish study,58 there were 32 CRCs detected among those invited for FIT (n=26 703) compared with 30 among those invited for colonoscopy (n=26 599), largely because of greater participation among those randomised to FIT. Recently, a well designed equivalency trial showed that FIT implemented intensively can perform as well as colonoscopy among persons with a first degree relative with CRC for the detection of advanced colorectal neoplasms. In this trial, FIT was offered annually at a low positivity threshold.62

In organised screening programmes, the choice of screening test must be informed by its acceptability and its accuracy, as greater participation may trump even large differences in efficacy.39,40 While some have raised the issue of missed adenomas in a stool based programme, this concern is likely misplaced. A quarter of the US population have adenomas by age 50 years and up to half of individuals will develop an adenoma in their lifetime; however, most people die with and not from their adenomas.63,64 Unfortunately, the currently available data are insufficient to decide this question. What is needed are the final results of the Spanish study,58 which compares the efficacy of one-time colonoscopy with biennial FIT for the reduction of CRC mortality at 10 years, and of a similar American VA study57 due to report final results in the 2020s.

### Cost-effectiveness

Several studies have investigated the cost-effectiveness of screening with FIT compared with gFOBT. All studies found FIT to be cost-effective compared with traditional gFOBT.65 Some studies even found FIT to be cost-saving compared with no screening, because of the large savings from preventing treatment of advanced CRC.66–68 However, when sFOBT were compared with FIT, they were found to be more cost-effective than FIT, assuming significantly lower costs for sFOBT than for FIT ($4.50 compared with $22) based on Medicare reimbursement rates. At a cost of $17.25 or less per test, however, FIT become a cost-effective alternative to sFOBT (at a per-test cost of $4.50).69 In most jurisdictions outside the USA, the unit costs of gFOBT and FIT are very similar, suggesting that FIT are indeed a cost-effective alternative to either traditional gFOBT or sFOBT.70–72 It is anticipated that the costs for FIT will decrease in the USA as costs are subject to competition and better studied FIT have become available since the Medicare reimbursement decision.

Despite the attractiveness of FIT screening from a cost-effectiveness perspective, an often-cited reason not to implement FIT screening is the lack of colonoscopy capacity. At the manufacturer-recommended cut-off, FIT tend to have a higher positivity rate than gFOBT. Consequently, twice as many colonoscopies are required with FIT screening as with gFOBT.20 However, with quantitative FIT, it is possible to increase the cut-off for a positive test, thereby reducing the number of positive tests and the required colonoscopy resources.51 One cost-effectiveness analysis specifically addressed the cost-effectiveness of FIT screening in a setting of limited colonoscopy capacity. This study showed that FIT at a higher positivity threshold (because colonoscopy capacity is limited) remains more effective and cost-effective than gFOBT.73

In countries such as the USA, some experts have recommended colonoscopy over FIT and other faecal tests because of the belief that ‘CRC prevention should be the primary goal of CRC screening’.74 With faecal tests, it was felt that prevention (via polypectomy for detected adenomas) was incidental and not the primary goal of the test. However, modelling studies have shown that at equal and high adherence levels, annual FIT screening is as effective in reducing CRC incidence and mortality as 10-yearly colonoscopy screening,75 and one study even showed FIT to dominate colonoscopy from a cost-effectiveness perspective.68

### Implementing FIT in organised CRC screening programmes

Once the decision has been made to use FIT as a CRC screening tool, a variety of additional factors must be addressed for implementation. There are factors that must be considered during kit selection (eg, qualitative or quantitative kits) and others that must be considered when using the test or designing the programme (eg, selecting a positivity threshold if using a quantitative test, frequency of testing). In large-scale or organised screening programmes, pilot studies using FIT may be useful in assisting with implementation.19,20 Implementation itself should be rigorously evaluated, and findings from these studies and evaluations should be widely disseminated.

#### Qualitative versus quantitative FIT

As noted above, FIT can be divided into those that provide a binary result (qualitative) and those that provide a numeric result that is interpreted relative to a threshold defining a positive test (quantitative). There is a growing literature indicating important differences in quality and performance across qualitative kit brands.30,31,33 Despite this literature, in the USA, qualitative FIT (mostly marketed as point-of-care tests) are Food and Drug Administration approved while quantitative FIT are not. Unfortunately, in the USA, these qualitative FIT are often approved for use with minimal supporting data in average-risk populations as they are waived under the Clinical Laboratory Improvement Amendments of 1988.18 Some have suggested that qualitative kits can be used in organised screening programmes; however, there is growing consensus that high quality quantitative FIT are preferred for organised screening.18–20 The advantages of using quantitative over qualitative FIT in this context include supporting data from large studies in average-risk persons (there are no similar studies using qualitative FIT), less interobserver variability, opportunity for laboratory quality control, improved laboratory efficiency due to automation of kit processing and the ability to customise the positivity threshold.

#### Quantifying results: haemoglobin concentration and positivity cut-offs

The ability to quantify haemoglobin concentration is one of the most important distinguishing characteristics of FIT compared with gFOBT and makes it particularly attractive to organised screening programmes. Haemoglobin concentration is commonly reported in the literature as nanograms of haemoglobin per millilitre of buffer (ng haemoglobin/mL buffer). This practice has been criticised76 as the mass of faeces collected and the volume of buffer in the vial vary across FIT brands. When units are reported in ng haemoglobin/mL buffer, results may vary by kit brand even if the concentration of haemoglobin in the faeces is the same (eg, if ng haemoglobin/mL buffer is used, kit ‘A’, which collects 10 mg of faeces into 2 mL of buffer, would report a haemoglobin concentration twice that of kit ‘B’, which collects 5 mg of faeces into 2 mL of buffer). As such, reporting in ng haemoglobin/mL buffer limits comparison across kits.

Experts have recommended standardisation using units expressed as the mass of haemoglobin in micrograms per mass of faeces in grams (µg haemoglobin/g faeces).56 Concentrations can be converted from ng haemoglobin/mL buffer to µg haemoglobin/g faeces using an analytical system-specific multiplier. For example, the multiplier for one of the most commonly studied FIT, OC-Sensor, is 0.2. Using the multiplier, the manufacturer's recommended positivity threshold of 100 ng haemoglobin/mL buffer for the OC-Sensor becomes 20 µg haemoglobin/g faeces. The need for standardisation is illustrated in a recent study which used four different qualitative FIT kits, all of which defined positive as 50 ng haemoglobin/mL buffer.31 However, the actual haemoglobin concentration detected varied from 6 µg haemoglobin/g faeces to 50 µg haemoglobin/g faeces. Not surprisingly, there was considerable variation in the clinical performance across the kits.
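The unit conversion described above follows from dimensional analysis: ng/mL multiplied by the buffer volume in mL gives ng of haemoglobin, and ng per mg of faeces is numerically equal to µg per g. The multiplier is therefore the buffer volume (mL) divided by the faecal mass sampled (mg). The Python sketch below works through the hypothetical kits ‘A’ and ‘B’ from the text. The function names are assumptions, and the sampling mass and buffer volume of any real device should be taken from its manufacturer; kit ‘A’ simply happens to give the same 0.2 multiplier quoted for OC-Sensor.

```python
def multiplier(buffer_volume_ml: float, faeces_mass_mg: float) -> float:
    """Factor converting ng Hb/mL buffer into µg Hb/g faeces for one device."""
    if buffer_volume_ml <= 0 or faeces_mass_mg <= 0:
        raise ValueError("volume and mass must be positive")
    return buffer_volume_ml / faeces_mass_mg

def to_ug_per_g(ng_per_ml: float, buffer_volume_ml: float, faeces_mass_mg: float) -> float:
    """Convert a buffer concentration (ng Hb/mL) to a faecal concentration (µg Hb/g)."""
    return ng_per_ml * multiplier(buffer_volume_ml, faeces_mass_mg)

def to_ng_per_ml(ug_per_g: float, buffer_volume_ml: float, faeces_mass_mg: float) -> float:
    """Convert a faecal concentration (µg Hb/g) to a buffer concentration (ng Hb/mL)."""
    return ug_per_g / multiplier(buffer_volume_ml, faeces_mass_mg)

# Hypothetical kits from the text: (buffer volume in mL, faecal mass in mg).
KIT_A = (2.0, 10.0)
KIT_B = (2.0, 5.0)

# A threshold of 100 ng Hb/mL on kit A expressed per gram of faeces.
threshold_a = to_ug_per_g(100.0, *KIT_A)

# The same faecal concentration read out in buffer units on each kit.
reading_a = to_ng_per_ml(threshold_a, *KIT_A)
reading_b = to_ng_per_ml(threshold_a, *KIT_B)

print(f"Kit A multiplier: {multiplier(*KIT_A):.2f}, kit B multiplier: {multiplier(*KIT_B):.2f}")
print(f"{threshold_a:.0f} µg Hb/g faeces reads as {reading_a:.0f} ng/mL on kit A and {reading_b:.0f} ng/mL on kit B")
```

The same faecal haemoglobin concentration produces a buffer reading on kit ‘A’ that is twice that on kit ‘B’, which is why thresholds quoted in ng haemoglobin/mL buffer cannot be compared across devices without conversion.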

Quantification of haemoglobin concentration allows customisation of the positivity threshold. In the recently published meta-analysis on FIT test characteristics, the sensitivity of FIT improved with a corresponding decrease in specificity when the positivity threshold was lowered.50 In studies that used a cut-off of 20 µg haemoglobin/g faeces or less, pooled sensitivity was 86% while pooled specificity was 91%. The positivity threshold can be customised to local programme needs. In the Netherlands after the programme launch, evaluators detected a higher than expected positivity rate and a lower positive predictive value. After considering a number of solutions, the positivity threshold was raised as it had the greatest impact on colonoscopy capacity while minimising any losses in terms of lives saved.77 In Italy, during the hotter months of the year when there are concerns of haemoglobin degradation, it was suggested that a lower positivity threshold be used.78

While no single positivity threshold can perfectly distinguish those with more advanced lesions, it is clear that the mean haemoglobin concentration is higher in those with more advanced colorectal lesions (figure 5).79 This FIT characteristic can be leveraged in several ways, including incorporation into sophisticated risk scores used to more efficiently stratify screening populations for colonoscopy.80 Another application might be to use faecal haemoglobin concentration to determine the interval to subsequent screening in those who are FIT-positive but colonoscopy-negative. That is, those with higher concentrations could be recalled earlier for subsequent screening.

Figure 5

Histograms of haemoglobin concentration by severity of colorectal lesion (reproduced with permission from Auge et al79).

#### Number and frequency of samples

Three separate stools are sampled to complete a gFOBT while one or two faecal samples are typically collected for the FIT. Different strategies can be used to define a positive FIT when two samples are collected: (1) at least one of the two samples positive, (2) both samples positive, or (3) the mean of the two samples above a predefined positivity threshold.81–83 In a Dutch study that compared one versus two samples (table 4),83 there was no difference in participation although for a given positivity cut-off, there were differences in positivity rates, detection rates and colonoscopy utilisation (at least one of two samples positive >one sample FIT >two of two samples positive).

Table 4

Test characteristics of different FIT screening strategies (cut-off value, 50 ng Hb/mL) (adapted from van Roon et al 83)
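The three ways of defining a positive two-sample FIT described above can be written as a small decision function. This Python sketch is illustrative only: the strategy labels and function name are assumptions, the cut-off of 50 ng Hb/mL is the value quoted for table 4, and treating a result equal to the cut-off as positive is an assumption, since the text does not say whether the comparison is strict.

```python
from statistics import mean

CUTOFF_NG_PER_ML = 50.0  # cut-off value quoted for table 4

def two_sample_positive(hb1: float, hb2: float, strategy: str,
                        cutoff: float = CUTOFF_NG_PER_ML) -> bool:
    """Classify a two-sample FIT under one of three positivity strategies.

    strategy: "any"  -> at least one of the two samples positive
              "both" -> both samples positive
              "mean" -> mean of the two samples at or above the cut-off
    ASSUMPTION: a value equal to the cut-off counts as positive.
    """
    if strategy == "any":
        return hb1 >= cutoff or hb2 >= cutoff
    if strategy == "both":
        return hb1 >= cutoff and hb2 >= cutoff
    if strategy == "mean":
        return mean([hb1, hb2]) >= cutoff
    raise ValueError(f"unknown strategy: {strategy!r}")

# Hypothetical pair of results (ng Hb/mL): one below and one above the cut-off.
sample = (30.0, 80.0)
for name in ("any", "both", "mean"):
    print(name, two_sample_positive(*sample, strategy=name))
```

For a discordant pair such as this one, the "any" rule is the most permissive and the "both" rule the most restrictive, which mirrors the ordering of positivity rates and colonoscopy utilisation reported in the Dutch study.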

Based on the above findings, in deciding between one and two sample testing, cost and colonoscopy capacity must be considered. It has been proposed that a two sample FIT requiring at least one of two samples positive could be used in systems with high colonoscopy capacity while two sample testing with both samples positive could be used in resource-limited settings.83 However, the field is moving towards one sample FIT as it is more appealing in terms of cost-effectiveness and participation. In settings where colonoscopy capacity is moderate, one sample FIT testing is cheaper than the two sample strategies while maintaining equal effectiveness.83 Microsimulation modelling has shown, however, that a one sample FIT needs to be used more intensively (a low positivity threshold at a higher screening frequency (either more often or over a larger age range)) in order to be more cost-effective than two sample tests.84

#### Haemoglobin stability

The globin moiety of haemoglobin is susceptible to enzymatic degradation, either endogenous or microbial. FIT depend on a reaction between antibodies and the globin moiety; hence, stabilising buffers may be used to minimise degradation during the time from sample collection to analysis. The issue of haemoglobin stability is particularly important for organised screening programmes where FIT are processed centrally (for quality control in interpreting results, central capture of results, etc). Participants are generally asked to return their completed FIT by mail and days to weeks may pass between sample collection and processing.

Despite the use of buffers, haemoglobin degradation may be accelerated under certain conditions, leading to false-negative results. Haemoglobin concentration has been shown to decline over time;85 as a result, a shorter return time is required in many organised programmes using FIT compared with those using gFOBT. The effect of high temperatures is well described33,78,85–87 and was reported in the Italian organised screening programme in a retrospective analysis.78 In this study, the positivity rate declined in the summer, reducing the probability of detecting an advanced neoplasm by 13% in the summer compared with winter. Similar issues were reported in the Australian Bowel Cancer Screening Program.88 Freezing also has an impact on haemoglobin stability, although it appears that it is the process of freezing then thawing that leads to haemoglobin degradation rather than time in a frozen condition.33 Therefore, if there is to be a significant delay between sample collection and analysis, refrigeration is recommended.

As different manufacturers use different buffers, it is not surprising that haemoglobin stability varies across FIT brands.33 ,82 ,89 The clinical significance of these differences is unknown but in some studies,33 ,82 the differences seem larger than in others.89 The differences in these findings support the recent call for a standardised methodology to assess haemoglobin stability so as to allow for easier comparison of this property across FIT brands.76

#### Kit choice

It is clear that there is important variation in performance across FIT brands.30–33 ,82 ,89 ,90 Some differences may be attributed to the type of kit, qualitative or quantitative, but even within these types and even when controlling for the concentration of haemoglobin detected,32 clinically important variation exists. Therefore, due diligence is required when selecting a FIT for use in clinical practice or in an organised screening programme. Special care should be taken if selecting FIT for which there is little evidence on clinical performance in large samples of average risk populations. In these circumstances, rigorous evaluation of results is warranted to ensure that clinical performance and quality is comparable with other validated FIT.

In addition, local factors, including seasonal variation in temperatures, the postal system (for FIT return) and any pre-existing screening programme or strategies, must be considered as part of FIT selection. Finally, it is worth noting that even the ‘same’ FIT may change over time. For example, positivity rates in Scotland and Belgium were higher than those seen in pilot studies, despite using the ‘same’ test.77 Buffers and even latex have changed over time, which may alter test characteristics; thus it is important to continue monitoring after implementation. Findings from these evaluations should be widely disseminated, especially when they involve kits for which there are few published data.

#### FIT use outside of organised screening

In jurisdictions where organised screening is not available, FIT may also be used either through opportunistic screening or via other screening initiatives. One innovative and successful example in the USA is the FLU-FIT programme, in which FIT was offered at annual influenza vaccine clinics, resulting in increased adherence to FIT.91–94

## Conclusions

The evidence is now sufficient to support the use of FIT over gFOBT for CRC screening. Despite the lack of randomised trials, there are recent observational data demonstrating CRC-related mortality benefits, and a number of properties specific to FIT confer important advantages over gFOBT. As noted above, participation or uptake of screening is critical to its effectiveness. A clear strength of FIT over other currently available screening approaches is that, in most head-to-head comparisons (with gFOBT or endoscopy), patients seem to prefer FIT. In addition, FIT are suitable for direct mailing to the patient, which makes them amenable to large-scale distribution. Among FIT, quantitative FIT are preferred over qualitative FIT for organised CRC screening programmes.18–20 Early data from randomised controlled trials comparing FIT with colonoscopy are also compelling; it is possible that regular use of FIT over multiple years could be comparable to colonoscopy in terms of effectiveness. The results of two large randomised controlled trials are eagerly awaited and will be very informative in this regard. In addition, advances with FIT itself are anticipated, including improved stability as better buffers are developed, as well as innovative ways of using FIT in clinical practice, such as in clinical risk indices. As FIT are rolled out in large populations, it is imperative to continue to monitor performance closely in order to ensure high quality screening. In addition, during implementation of FIT in these populations, consideration should be given to head-to-head comparison of different FIT (ideally in a randomised fashion) to determine which are best, particularly where there is a paucity of evidence to support the use of a particular FIT.

## Footnotes

• Contributors JT, IL-V and JEA all contributed to the conception and design of the work. JT drafted the work while IL-V and JEA revised it critically for important intellectual content. All authors gave final approval of the version published. All authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

• Funding JT was supported by a Canadian Institutes of Health Research New Investigator Award (HSH-104705) during the period of this study. IL-V was financially supported by the National Cancer Institute at the National Institutes of Health during the period of this study (U01-CA-152959).

• Competing interests None declared.

• Provenance and peer review Commissioned; externally peer reviewed.
