Abstract
Background Screening colonoscopy (SC) outcome quality is best determined by the adenoma detection rate (ADR). The substantial variability in the ADRs between endoscopists may reflect different skills, experience and/or equipment.
Objective To analyse the potential factors that may influence ADR variance, including case volume.
Design 12 134 consecutive SCs (mean age 64.5 years, 47% men) from 21 Berlin private-practice colonoscopists were prospectively studied during 18 months. The data were analysed using a two-level mixed linear model to adequately address the characteristics of patients and colonoscopists. The ADR was regressed after considering the following factors: sex, age, bowel cleanliness, NSAID intake, annual SC case volume, lifetime experience, instrument withdrawal times, instrument generations used, and the number of annual continuing medical education (CME) meetings attended by the physician. The case volume was also retrospectively analysed from the 2007 national SC registry data (312 903 colonoscopies and 1004 colonoscopists).
Results The patient factors that correlated with the ADR were sex, age (p<0.001) and low quality of bowel preparation (p=0.005). The factors that were related to the colonoscopists were the number of CME meetings attended (p=0.012) and instrument generation (p=0.001); these factors accounted for approximately 40% of the interphysician variability. Within a narrow range (6–11 min), the withdrawal time was not correlated with the ADR. Annual screening case volume did not correlate with the ADR, and this finding was confirmed by the German registry data.
Conclusions The outcome quality of screening colonoscopies is mainly influenced by individual colonoscopist factors (ie, CME activities) and instrument quality.
Clinical trial registration number ClinicalTrials.gov: NCT00860665.
- Screening colonoscopy
- quality assurance
- quality factors
- adenoma detection rate
- barretts oesophagus
- endoscopy
- neuropeptides
- adhesion molecules
- neuroendocrine cells
- somatostatin
- neuroendocrine tumours
- colonoscopy
- colonic polyps
- endoscopic retrograde pancreatography
- endoscopic sphincterotomy
- endoscopic ultrasonography
Significance of this study
What is already known on this subject?
- Screening colonoscopy (SC) has been established as an effective modality for colorectal cancer screening, and the determination of the adenoma detection rate is the main outcome parameter.
- The adenoma rate is highly variable, but the factors for this variability are mostly unknown. Indeed, only instrument withdrawal time has previously been shown to be important.
- Other parameters, such as case volume, have not been examined. In Germany, a minimum annual number of 200 patients per physician is required for SC performance.
What are the new findings?
- The present study did not find a correlation between the adenoma detection rate and case volume.
- Individual endoscopist factors (ie, continuing medical education) and endoscope quality appear to play a greater role and could account for approximately 40% of the variability.
- The present study could not confirm the correlation between the withdrawal time and the adenoma detection rate within a narrow time range (6–11 min).
How might it impact on clinical practice in the foreseeable future?
- A cut-off value of 200 annual colonoscopies appears to be sufficient to guarantee the stability of screening colonoscopy quality independent of case volume.
- Both the continued education of the endoscopist and the instrument generation must be taken into account in efforts to improve SC quality.
Introduction
Screening colonoscopies have been shown to decrease colorectal cancer (CRC) incidence and mortality1,2 by detecting cancers at an earlier stage and by allowing the removal of adenomas as precursor lesions. These facts led to the introduction of screening colonoscopy in countries such as the USA and Germany. In Germany, reimbursement for screening colonoscopy (SC) has been linked at the national level since the end of 2002 to a general quality assurance programme that includes hygiene and documentation controls as well as minimum annual colonoscopy numbers (>200). In addition, the performance data are centrally registered. These measures have led to the exclusion of non-specialists who perform low numbers of SCs. Above the cut-off of 200 annual colonoscopies, however, there is still a wide variability of case volume between colonoscopists. Recent European guidelines that focused on the quality of CRC screening measures, including SC, set an annual minimum of 300 colonoscopies per colonoscopist.3
Several quality parameters for colonoscopy, such as caecal intubation and complication rates, have been defined and analysed in studies. With regard to outcome quality, the adenoma detection rate (ADR) is commonly considered the main quality outcome parameter of SCs.4,5 This was recently confirmed by a large follow-up study from Poland.6 Adenoma detection rates from various countries have ranged from 8% to 35% (overview in Adler et al),7 and it is not fully known whether these differences mainly reflect differences in the quality of the screening or differences in the disease prevalence. Although several factors have been analysed for their ability to affect ADRs, only colonoscopy withdrawal times have been shown to correlate with the ADR8–11; however, this finding has also been challenged.12
Several factors related to both the colonoscopist and the quality of the endoscope that is used in the screening could explain the differences in the ADRs. In addition, the case volume of the colonoscopist could also play a role. In several areas of medicine, especially surgery, case volume has been shown to correlate with outcome, with respect to both individual and institutional volume.13–19 In contrast, respective data for gastrointestinal endoscopy are limited and only available with respect to endoscopic retrograde cholangiopancreatography.20–23 Such case volume data are not available for colonoscopies, including screening colonoscopies. In general, case load could affect quality in both directions. For example, a high colonoscopy case load could be a sign of substantial experience and could lead to better outcomes (this has been shown for surgery). In contrast, high throughput could also lead to quicker examination times, less attentiveness and a lower yield for polyp detection.
Therefore, the current prospective study in Berlin analysed potential influences on the variability of the ADRs between colonoscopists, with respect to both the colonoscopist and the endoscope used. The annual case volume was also included as a potential correlating parameter. In addition, the case volume findings were corroborated by analysing data from the national SC registry for the main study year, 2007.
Patients and methods
Between October 2006 and March 2008, 21 gastroenterologists (from 18 private practices in Berlin) who were licensed to perform screening colonoscopies (called colonoscopists in the following) performed a prospective quality assessment study on various performance parameters, findings, complications and patient acceptance (by means of questionnaires). All of the individuals who were willing to undergo a screening colonoscopy were asked whether they would like to participate in this quality assurance study and were required to give their informed consent. The study was approved by the Charité Ethical Committee (EA 02/019/07). Several audit rounds at 2–6 month intervals and after the study termination were performed by a study nurse with the help of four research assistants from each of the practices.
The main outcome parameter was the ADR, which was defined as the percentage of patients with at least one adenoma. During the study, the number of carcinomas, adenomas and hyperplastic polyps per patient were recorded, and we also assessed the location, form, size and histology of the tumours.
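The ADR definition above can be illustrated with a minimal Python sketch. The function name and the input format (one adenoma count per patient) are assumptions made for illustration; they are not taken from the study's data management.

```python
from typing import Iterable


def adenoma_detection_rate(adenomas_per_patient: Iterable[int]) -> float:
    """Return the ADR in percent: the share of patients with at least one adenoma."""
    counts = list(adenomas_per_patient)
    if not counts:
        raise ValueError("at least one patient is required")
    # A patient counts once, no matter how many adenomas were found.
    patients_with_adenoma = sum(1 for n in counts if n >= 1)
    return 100.0 * patients_with_adenoma / len(counts)


# Hypothetical example (not study data): five patients, two with adenomas.
example_adr = adenoma_detection_rate([0, 2, 0, 1, 0])
```

Note that the ADR is a per-patient measure, so a patient with several adenomas contributes the same as a patient with one.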
The following parameters were analysed with respect to their potential influence on the ADR.
Patient factors
- Patient age and sex.
- Non-steroidal anti-inflammatory agent (NSAID) intake.
- Bowel cleanliness, which was scored from 1 (excellent) to 5 (insufficient) by the examiners. In detail, the following scores for colon cleanliness were agreed on among the study physicians in a meeting before the start of the study:
  - 1: Excellent (no/hardly any residual fluid, small amounts of clear fluid that can easily be cleared by suction)
  - 2: Sufficient (moderate amounts of residual fluid or liquid material that can be cleared by suction and does not require extra time >1 min)
  - 3: Moderate (residual faecal material or fluid that can only be cleared with substantial effort through rinsing and suction; requires several minutes to clean)
  - 4: Poor (residual faecal material that cannot be completely cleared, only larger elevated lesions can be excluded)
  - 5: Insufficient (cannot be cleared by means of colonoscopy; colonoscopy should be repeated).
Colonoscopist factors
- The annual case volume of screening colonoscopies during the study period.
- The lifetime experience with colonoscopy reported by the participating physicians on the basis of their office software data and prior training and examinations performed in teaching hospitals before they entered a private practice.
- The caecal intubation rate.
- The examination times (introduction and withdrawal), including polypectomy, were measured by recording the times on conventional clocks or watches in 30 s intervals. For the analysis, only cases without polyps were considered, which was similar to a previous report.8
- The number of continuing medical education (CME) meetings in the preceding 5 years before the start of the study. The number of CME points for each meeting is determined by the local chamber of physicians according to the length and the amount of practical training provided at the meeting. These points must be registered for licensing reasons at the local chamber of physicians.
Endoscope factors
The endoscopes (colonoscopes) that were used by the physicians during the study period were from Pentax (EC-3870, EC-3940, EC-3880FK, EC-3830FK2, EC-3840MK, EC-3840MK2, EC-380FKp, EC-380FK2p, EC-380LKp, EC-380MK2p) in 11 cases, Olympus (PCF-100, CF-145I, CF-Q145L, CF-Q165I, PCF-Q180Ai, CF-H180AI, CF-H180AL) in six cases, and Fujinon (EC 200 MR, EC-200WM2, EC 201 WI, EC-201WM, EC 250 WI5) in four cases. The endoscopes were divided into three generation categories according to manufacturer feedback: category I represented the latest generation colonoscopes at the time of the study (2006–2008), category II represented colonoscopes from a generation before the study, and category III represented colonoscopes from two generations before the study. For the physicians who simultaneously used different generations of instruments, a category called ‘mixed’ was introduced. There was no case-based documentation of instrument use.
Comparative case volume data of the German screening colonoscopy registry
Twenty-one colonoscopists participated in the present prospective study. To exclude any effects that may have been missed due to this limited number of colonoscopists, we retrospectively analysed the large German screening colonoscopy database from the year 2007 for comparison. Within the German screening colonoscopy quality assurance programme, documentation of relevant SC data is performed in a self-reported registry at the Central Research Institute of Ambulatory Health Care, Berlin (Zentralinstitut (ZI) der Kassenärztlichen Vereinigung (KV)). The documentation includes information about the completeness of the colonoscopy (caecal or ileal reach) and the histology of the polyps and cancers. From this registry, data collected in 2007 for 1004 individual colonoscopists performing 312 903 SCs were taken for comparison to analyse the correlation between annual case volume and the ADR.
Statistical analysis
Descriptive analysis of the full dataset consisted of absolute and relative frequencies in categorical variables and the means±SDs for the continuous variables. The frequencies of missing values are given in square brackets.
The adenoma detection rate is affected by the characteristics of the individual patients and by the skills and practices of the colonoscopists. To study both aspects simultaneously, we applied a two-level mixed model for a binary outcome (patients nested within colonoscopists), with adenoma detection as the outcome and the determined patient and colonoscopist characteristics as the regressors. The model allows associations to be estimated and tested by calculating adjusted ORs with 95% CIs and corresponding p values. We used a forest plot to present the results (figure 1). A characteristic with a p value below 0.05 was judged to be a significant independent predictor.
Figure 1 A forest plot showing the results of fitting a two-level binary mixed linear model to the data of the complete cases (92%). The plot shows ORs with 95% CIs on a log-odds scale.
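As an illustration of how an OR and its 95% CI relate to a coefficient on the log-odds scale, the following Python sketch converts a logistic-regression coefficient and its standard error into an OR with a Wald-type interval. The function name and the example values are hypothetical, and the assumption that the reported CIs are Wald intervals is ours, not stated in the text.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(log_odds: float, std_error: float,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient into (OR, lower limit, upper limit).

    Assumes a Wald-type interval: exp(coefficient +/- z * standard error).
    """
    odds_ratio = math.exp(log_odds)
    lower = math.exp(log_odds - z * std_error)
    upper = math.exp(log_odds + z * std_error)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, not taken from the study.
or_value, ci_lower, ci_upper = odds_ratio_with_ci(log_odds=0.5, std_error=0.1)
```

Because the interval is symmetric on the log-odds scale, it appears symmetric in a forest plot drawn on that scale, even though it is asymmetric around the OR itself.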
The mixed model also allows for an assessment of the heterogeneity of the detection rates. For this purpose, we also fitted a two-level model without regressors to calculate the intraclass correlation coefficient, which demonstrates the percentages by which adenoma detection depends on patient and colonoscopist individuality. We further fitted a model with patient characteristics as the only regressors. A comparison of the cluster variances of the three models provided us with estimates about the extent to which the heterogeneity between the colonoscopists can be explained by the factors in the regression model. Statistical model building was performed in the complete case population using procedure ‘xtlogit’ of Stata V.12.0.
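The two quantities described above can be sketched in Python under stated assumptions. The sketch uses the common latent-variable definition of the intraclass correlation for a random-intercept logistic model, in which the patient-level residual variance is fixed at π²/3; whether the authors used exactly this definition, and which model served as the reference for the explained heterogeneity, are assumptions. All function names and example variances are hypothetical.

```python
import math

# Patient-level (level 1) residual variance on the latent logistic scale.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def intraclass_correlation(cluster_variance: float) -> float:
    """ICC: share of total latent variance lying between colonoscopists."""
    return cluster_variance / (cluster_variance + LOGISTIC_RESIDUAL_VARIANCE)


def proportion_explained(reference_variance: float,
                         adjusted_variance: float) -> float:
    """Share of between-colonoscopist variance removed by adding regressors."""
    if reference_variance <= 0:
        raise ValueError("reference variance must be positive")
    return (reference_variance - adjusted_variance) / reference_variance


# Hypothetical cluster variances (not study estimates): a model without
# regressors and a model with colonoscopist and instrument characteristics.
icc_example = intraclass_correlation(0.14)
explained_example = proportion_explained(0.14, 0.08)
```

A small ICC does not imply small differences in ADR between colonoscopists. It only states that most of the variation in whether an adenoma is found lies between patients rather than between examiners.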
Results
A total of 12 856 screening colonoscopies were registered by the participating physicians during the study period in the ZI registry, and 12 134 cases were included in the present prospective study. The remaining individuals did not provide consent for the study for a variety of reasons. Of the 12 134 cases, complete data for all of the parameters that were analysed were available in 11 166 cases (92%). Descriptive statistics concerning the patient, the examination and the examiner data are shown in table 1. The ADRs ranged from 7.5% to 33.3% (the mean was 21.7%). Interestingly, the withdrawal times in the cases without polyps ranged from 6 to 11 min, with only one extremely careful colonoscopist investing 17 min in each case without polyps. The caecal intubation rate was 98% (range 93–99%), and the mean instrument introduction time was 8.8±6.55 min. Complications were encountered in 0.46% of the cases.
Table 1 Baseline demographic and clinical characteristics of the patients
Figure 1 shows the ORs that were associated with the variables or categories tested. Three groups of variables were discriminated:
- The upper four variables, which correspond to patient characteristics, show that age and sex significantly correlated with the ADR (ie, the ADR was higher in men and increased with increasing age). Poor bowel preparation (assessed by the colonoscopists as scores 4 and 5) was also significantly correlated with the ADR. The correlation of the ADR with NSAID intake did not quite reach statistical significance.
- The next variable was related to the endoscope technology (the generation of the instrument that was used). There was no difference between the latest generation and the generation before the study (I vs II), but there was a significant difference between the latest generation of instruments and the oldest instruments in use (I vs III).
- The last four variables relate to the colonoscopist. The case volume (both the annual SC volume and the annual total colonoscopy volume), the withdrawal time (excluding polypectomy cases) and lifetime experience were not correlated with the ADR. The number of CME credit points, however, was correlated with the ADR. We also analysed the caecal intubation rate and determined that it was correlated with the ADR (data not shown).
In the model without regressors, the intraclass correlation coefficient was 4.1% (p<0.001), which indicated that the ADRs were significantly different between the colonoscopists. In addition, the colonoscopists' skills accounted for 4% of the total variability in adenoma detection. Differences in the patient characteristics could not explain the substantial differences between the colonoscopists; however, 41.4% of the ADR heterogeneity between the colonoscopists could be explained by the determined colonoscopist and the instrument characteristics shown in figure 1. Interestingly, the cause for a substantial percentage of the heterogeneity between the colonoscopists remains unknown (p<0.001).
The absence of a correlation between the ADR and the annual case volume was also confirmed by the 2007 analysis of the data within the German SC registry (figure 2).
Figure 2 The central German colonoscopy colorectal screening registry data from 2007, including 312 903 screening colonoscopies performed by 1004 colonoscopists. The figure shows the correlation between the rates of patients with at least one adenoma (ADR) and the annual case volumes.
Discussion
The present large prospective study focused on the factors that are responsible for highly variable adenoma detection rates. For the first time, we used advanced statistical modelling that takes the cluster structure of the data into account to simultaneously examine the factors that are related to patients and the factors that are related to colonoscopists and their instruments; thus, we avoided the potential biases due to masking effects or piggyback effects that are likely to arise when factors are tested without adjustment.
We found a substantial heterogeneity in the ADRs between colonoscopists that could partially be explained by covariates. We showed that there was no correlation between case volume and colonoscopy outcome quality as measured by the ADR, which was corroborated with an even larger database from the German national registry. Although case volume plays a role in the quality of surgical procedures,13–19 case volume did not play a major role in determining SC quality in our study. The present study was the first to examine the influence of case volume on the effectiveness of diagnostic endoscopy. In gastrointestinal endoscopy, the influence of case volume has so far only been analysed and shown for endoscopic retrograde cholangiopancreatography (with respect to complications).20–23 For diagnostic tests, outcome quality parameters, such as the detection of findings (the ADR in the present study), are probably more important parameters than complications, especially because complications are not that common in SCs. The present results have to be interpreted within the German screening programme, which sets a cut-off of 200 colonoscopies as the minimal annual number for accreditation. It is possible that annual numbers below this cut-off would show some correlation with the ADR. However, with regard to recent European guidelines that set a higher minimum level of required annual colonoscopies (ie, 300),3 we can conclude that 200 annual colonoscopies are sufficient to guarantee a quality that is independent of case volume.
The present study showed that certain colonoscopist- and endoscope-related factors contributed to the variation in the adenoma detection rates. Although these factors explained a substantial proportion (approximately 40%) of the variation, they did not explain all of the differences between the colonoscopists. The remaining heterogeneity is likely due to differences in the colonoscopists' skill levels that could not be directly measured in the present study.
Only a small number of factors that may influence the ADR have been analysed and found to be relevant, which may be due to the limited case numbers in previous studies. Interestingly, instrument withdrawal time is probably the only factor shown in some studies to correlate with the ADR.8–10 In the present observational study, we did not subtract biopsy or polypectomy time from the overall examination times, which was similar to previous studies. Thus, we only used withdrawal times from the cases without polyps for our correlation, which may be a methodological limitation. In contrast to previous studies, we could not find any correlation between the withdrawal time and the ADR. This may be due to the self-selection of the physicians who participated in the study, as none of them had withdrawal times below 6 min or detected fewer than 0.6 adenomas per patient. Furthermore, the effect of increasing withdrawal time on the ADR appears to be controversial. The US group that showed the most impressive correlation between withdrawal time and the ADR8 also studied the influence of a quality assurance programme. After implementing their protocol of careful inspection during a minimum of 8 min for withdrawal, they observed significantly greater rates of overall and advanced neoplasia detection during screening colonoscopies,11 but this might have also been due to greater colonoscopist attentiveness. Interestingly, a much larger study that systematically implemented a 7 min withdrawal protocol with an increase in adherence from 65% to 100%12 could not show any increase in the ADR. Therefore, if the true effect of withdrawal time is analysed further, a different methodology (eg, measuring the times of all colonoscopies with a stopwatch that is halted during biopsy or polypectomy) and large patient numbers should be prerequisites.
To the best of our knowledge, the correlation between physicians' CME activities and their outcome quality (the ADR for colonoscopy) has not yet been examined for colonoscopies or any other endoscopic procedures. A recent health technology review that analysed 136 articles and nine systematic reviews concluded that despite the limited overall quality of the literature, CME was effective, at least to some degree, in achieving and maintaining the objectives that were studied, including knowledge (22 of 28 studies), attitudes (22 of 26 studies), skills (12 of 15 studies), practice behaviour (61 of 105 studies) and clinical practice outcomes (14 of 33 studies).24 Only one paper dealing with endoscopy was mentioned; this paper focused on the training that is needed for novices to perform endoscopic procedures.25 The correlation of CME activities with ADR confirms previous evidence reported in a review that showed that almost two-thirds of the interventions in medical education led to an improvement in at least one major outcome measure in the various areas analysed.26 The present study was the first to show that interventions in medical education also appear to lead to improvements in screening colonoscopies. We can only speculate which part of the CME activities may have been responsible because the CME account did not specify which types of meetings were attended. However, larger scientific meetings and meetings with hands-on activities earn more CME points in the German system. Overall, a better understanding of polyp morphology and examination techniques, both of which are taught during a variety of meetings, may be correlated with better colonoscopy performance.
We believe that the correlation between the generation of the instrument and the ADR deserves some comment. Most studies that have analysed new imaging features, such as narrow band imaging, only compared the new feature within the same generation instrument using the same high quality scopes; these studies did not show any differences in adenoma detection rate.7,27–29 A recent meta-analysis that compared high-definition instruments with standard instruments from one generation earlier only found marginal differences in the adenoma detection rates.30 Among the five studies that were included in the meta-analysis, only two were randomised, and they did not show differences between the two subsequent generations of instruments.31,32 A small randomised study that compared two subsequent generations of instruments, however, found a large increase in the adenoma detection rate with newer scopes.33 In addition, a large comparative study that observed the same examiners using different generation scopes over two different time periods found a 15–20% increase in the adenoma detection rate by uniform use of the newest scopes as compared to older equipment from different generations.33 Thus, changes in several colonoscope features (for example, those found when instruments from the latest generation are compared with those from two generations before) appear necessary before an effect on the adenoma detection rate can be observed in a larger study such as ours.
The present study had several limitations, such as the moderate number of participating colonoscopists (n=21) and their self-selection (ie, only colonoscopists with substantial experience agreed to participate in the study). In addition, we were not able to determine the generation of the instrument that was used on each patient; thus, we could not analyse the instrument effect in the practices that used instruments from multiple generations (that is, mixed generation instruments (n=8/21)). Nevertheless, we believe that the large number of examinations in the present study may counterbalance these limitations.
In conclusion, case volume cannot be taken as a quality parameter per se for screening colonoscopies above an annual colonoscopy number of 200. In countries with a quality programme that defines the minimal case load, case volume does not appear to be a parameter that patients have to worry about with respect to SC quality, and case volume is not suitable for benchmark comparisons. Both individual colonoscopist factors and instrument quality play a greater role than case volume in SC quality. Further studies are needed to show which measures are helpful in improving the quality of screening colonoscopies.
Acknowledgments
We are very grateful to Prof. K Selbmann, Tübingen, for his invaluable advice in preparing the paper.
References
Footnotes
* RD Deceased October 2011.
- Funding The Berlin Colonoscopy Screening Study was supported by a grant from Deutsche Krebshilfe e.V. (no. 108166) and by unrestricted grants from Olympus, Pentax and Falk Companies. There was no influence on study design, performance, data analysis or the writing of the manuscript.
- Competing interests None.
- Ethics approval Charité Ethical Committee (nr. EA 02/019/07).
- Provenance and peer review Not commissioned; externally peer reviewed.