Background As screening colonoscopy becomes more widespread, the costs for histopathological assessment of resected polyps are rising correspondingly. Reference centres have published highly accurate results for endoscopic polyp classification. Therefore, it has been proposed that, for smaller polyps, the differential diagnosis that guides follow-up recommendations could be based on endoscopy alone.
Objective The aim was to prospectively assess whether the high accuracy for endoscopic polyp diagnosis as reported by reference centres can be reproduced in routine screening colonoscopy.
Design Ten experienced private practice endoscopists had initial training in pit patterns. Then they assessed all polyps detected during 1069 screening colonoscopies. Patients (46% men; mean age 63 years) were randomly assigned to colonoscopy with conventional or latest generation HDTV instruments. The main outcome measure was diagnostic accuracy of in vivo polyp assessment (adenomatous vs hyperplastic). Secondary outcome measures were differences between endoscopes and reliability of image-based follow-up recommendations; a blinded post hoc analysis of polyp photographs was also performed.
Results 675 polyps were assessed (461 adenomatous, 214 hyperplastic). Accuracy, sensitivity and specificity of in vivo diagnoses were 76.6%, 78.1% and 73.4%; size of adenomas and endoscope withdrawal time significantly influenced accuracy. Image-based recommendations for post-polypectomy surveillance were correct in only 69.5% of cases. Post hoc analysis of polyp photographs did not improve accuracy.
Conclusions In everyday practice, endoscopic classification of polyp type is not accurate enough to abandon histopathological assessment and use of latest generation colonoscopes does not improve this. Image-based surveillance recommendations after polypectomy would consequently not meet guideline requirements.
- Colorectal Adenomas
- Colonic Polyps
Significance of this study
What is already known about this subject?
Histopathological examination of polyps removed during screening colonoscopy substantially adds to costs.
Reference centres have reported excellent results for endoscopic classification of polyps, especially with newer generation endoscopes.
It has been proposed that, for smaller polyps, follow-up recommendations could be based on endoscopic differential diagnosis only, with no histopathological evaluation.
What are the new findings?
Endoscopic in vivo assessment of colon polyps by experienced private practice gastroenterologists was not sufficiently accurate to replace histopathological evaluation; this was not improved with the use of more advanced instruments.
Recommendations based on endoscopic imaging would have been incorrect in a third of cases.
How might it impact on clinical practice in the foreseeable future?
Histological examination is still necessary to guide follow-up after polypectomy.
Development of better techniques for endoscopic analysis of polyp images is needed.
Colonoscopy is regarded as one of the most effective methods for colorectal cancer prevention because it also allows for removal of colonic adenomas as precursor lesions.1,2 However, histological analysis of small and possibly less relevant lesions significantly increases workload and costs. Therefore, it has been suggested that endoscopic imaging technology might be used for differential diagnosis of these smaller polyps and histology only used for larger lesions.3 As a consequence, follow-up recommendations after polypectomy would be based on imaging-based polyp differentiation. In a recent trial, the DISCARD (Detect Inspect Characterise Resect and Discard) study, the prediction of polyp histology by means of endoscopic pattern analysis was found to have an accuracy of more than 90%.4 Numerous other prospective and retrospective studies have also reported high sensitivity and specificity rates,5–10 including highly accurate image-based follow-up recommendations.11,12
Since all these data stem from reference centres with a specific scientific interest in endoscopic imaging, it is not known whether these results can be reproduced in the daily routine of screening colonoscopy where it would be most relevant. A recent in vivo analysis of polyp images indicated that this may not be the case.13 Therefore we performed a large prospective study in a private practice screening setting with polyp differential diagnosis as the main outcome. When analysing patient, physician and endoscope factors influencing accuracy, the latter was tested by randomising patients to a standard and a latest-technology instrument group.
Patients and methods
Patients and study performance
During a 14-month period, consecutive asymptomatic persons undergoing screening colonoscopy were considered for inclusion in this prospective trial. The study was performed in eight private gastroenterology practices with a total of 10 examiners, each with a lifetime experience of at least 10 000 colonoscopies and substantial expertise in study performance.14–17 Ethical approval was given by the Ethical Committee of the Hamburg Chamber of Physicians (PV 3272). All authors had access to the study data and reviewed and approved the final manuscript.
Patients were prescribed polyethylene glycol lavage bowel preparation at all the centres. Examiners cleaned the colon during instrument insertion and withdrawal as much as possible and antispasmodics (butylscopolamine 10–20 mg intravenously) were administered only rarely if required. The examination technique included inspection mainly during withdrawal. Pentax colonoscopes (Pentax Inc, Hamburg, Germany) were used. To analyse the influence of newer instrument technology on accuracy, patients were randomly allocated to the use of either a conventional endoscope (Pentax Classic Line, EC-380FK, FKp and FK2p) or a latest generation endoscope (Pentax HI Line, EC-3890Fi2 using an I-Scan setting, surface-enhancement mode +4/511) with EPK-i processors and HDTV monitors. I-Scan technology for colonoscopy is described elsewhere.10 Randomisation lists were used for allocation to an instrument group at each individual centre.
Parameters and outcomes
The following parameters were recorded:
age and sex of the patient;
type and dosage of sedation;
examination time, for instrument introduction and withdrawal, including biopsy and polypectomy;
colonoscopy completion rate of endoscopists;
polyp characteristics: size (measured by open forceps or snare); shape (pedunculated, sessile or flat18); and location (left side up to splenic flexure, right side);
histological findings after polyp removal, using snare polypectomy or forceps removal (for polyps <3 mm), or biopsy if there were contraindications; histological analysis was done by several specialised gastrointestinal histopathologists as in previous studies of the group.14–17
The main outcome parameter was the accuracy of in vivo differential diagnosis (sensitivity, specificity, predictive values) between adenomatous (neoplastic) and non-adenomatous (hyperplastic) polyp histology every time a polyp was detected; gold standard was histopathological analysis of polyps removed or biopsied. Physicians were encouraged to make a decision in every case as required in clinical practice.
Secondary outcome measures included the following:
Differences between endoscopes (Classic Line vs Hi Line) in the differential diagnosis of polyps. In the new endoscope (Hi Line) group, iScan imaging had to be used for differential diagnosis in all cases.
Accuracy of follow-up recommendations (% correct recommendations/all cases) based on endoscopic image differential diagnosis in all cases and in polyps 1–5 and 6–10 mm.
Analysis of other factors with possible influence on accuracy, related to patient and examiner characteristics.
Differential diagnosis criteria were agreed on before the start of the study and were based on pit pattern analysis.19 Example images were provided from the basic paper19 and by one of the authors (RK) using the same Pentax equipment as in this study.10 Examiners were given a glossary with definitions, schematic drawings and example images for different pit patterns that they could use during study examinations taken with different endoscopes, including the iScan technology used in this study. This glossary could be used throughout the study and also for the post hoc analysis (see below). Each participating examiner had experience in using HDTV/I-Scan technology in about 20–25 examinations before the study commenced.
Furthermore, a blinded post hoc analysis was performed on polyp images to assess accuracy (sensitivity, specificity, predictive values) in analogy to the methodology used in many previous studies.12 Because such an analysis was considered primarily and was part of the protocol, physicians were instructed to take at least one photograph per polyp in a certain manner (about 1 cm distance from polyp, recognisable polyp pattern), described in detail in the glossary mentioned above.
Study procedure and definitions
In vivo differential diagnosis during colonoscopy
The pit pattern classification19 was used as the basis for the differential diagnosis; examiners were instructed to differentiate between hyperplastic polyps (pit patterns I, II) and adenomatous polyps with various grades of neoplasia (pit patterns III–V). No further detailed pit pattern classification was required.
Follow-up recommendations were based on image results and taken from current guidelines:20 10-year follow-up after normal colonoscopy findings without adenomas, that is, no or only hyperplastic polyps; 5–10 years for one or two small (<1 cm) adenomas without villous components; and 3 years for three or more adenomas or at least one advanced adenoma (>1 cm and/or villous components and/or high-grade intraepithelial neoplasia).
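The guideline rules above can be written as a small decision function. This is a minimal sketch rather than the study's own software: the `Adenoma` class and the function names are hypothetical, and because the text leaves an adenoma of exactly 1 cm unclassified (small is <1 cm, advanced is >1 cm), the sketch assumes that 10 mm and above counts as advanced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Adenoma:
    """Hypothetical record for one adenoma (not a structure from the study)."""
    size_mm: float
    villous: bool = False
    high_grade: bool = False  # high-grade intraepithelial neoplasia


def is_advanced(adenoma: Adenoma) -> bool:
    # Assumption: exactly 10 mm is treated as advanced; the text only
    # defines "<1 cm" (small) and ">1 cm" (advanced).
    return adenoma.size_mm >= 10 or adenoma.villous or adenoma.high_grade


def surveillance_interval(adenomas: List[Adenoma]) -> str:
    """Return the follow-up interval for a patient's list of adenomas.

    Hyperplastic polyps are not passed in: a patient with no polyps or
    only hyperplastic polyps has an empty list.
    """
    if not adenomas:
        return "10 years"
    if len(adenomas) >= 3 or any(is_advanced(a) for a in adenomas):
        return "3 years"
    return "5-10 years"  # one or two small adenomas without villous components


print(surveillance_interval([]))                          # no adenomas
print(surveillance_interval([Adenoma(4), Adenoma(6)]))    # two small adenomas
print(surveillance_interval([Adenoma(12)]))               # one advanced adenoma
```

Under these rules, misclassifying a single small adenoma as hyperplastic moves a patient from the 5–10 year group to the 10 year group, which is how an imaging error becomes an incorrect surveillance recommendation.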
Post hoc analysis: blinded image assessment of polyps
A subgroup including 198 polyps was selected from all patients who had only one polyp and for which histological data were available. This was done to reliably exclude mistakes in image allocation in patients with multiple polyps. Of these 198 single-polyp cases a total of 989 images were anonymised, arranged in random order and reviewed by five examiners, namely, three study participants involved in colonoscopies (JA, AA, MM) and two university clinicians (GS, RK) not involved in colonoscopies but specialised in advanced imaging. During post hoc assessment, examiners were again allowed to use the glossary for direct comparison. Only cases with images judged to be of sufficient quality were then further analysed for differential diagnosis, which was based on the same pit pattern algorithm used for the in vivo assessment (see above). If applicable, examiners were also allowed to make a differential diagnosis on other parameters in case of failed pit pattern recognition (size, colour, shape) and this was recorded.
To calculate the required case number, it was assumed that a polyp rate of 0.55 per screening case (all polyps/all patients) could be found, based on previous studies with the same group.17–20 About two-thirds of these were assumed to be adenomas and one-third hyperplastic polyps. The case number calculation for detecting a difference in accuracy of 75% (conventional endoscope) vs 85% (new generation scope) in the differential diagnosis between adenomas and hyperplastic polyps showed a required number of 260 polyps per group with a power of 80% at a significance level of 0.05. With this total number of polyps and a polyp rate of 55%, a case number of 1050 (allowing 5% dropout) was required.
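The order of magnitude of this calculation can be checked with the standard normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the text does not state which formula or correction was used, so the unpooled two-sided formula here is an assumption, and the function name `polyps_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def polyps_per_group(p1: float, p2: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per group for a two-sided comparison of two proportions.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p1 - p2)^2,
    without continuity correction (an assumption, see lead-in).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


n = polyps_per_group(0.75, 0.85)
print(n)  # 248
```

This simple formula gives about 248 polyps per group, in the same range as the 260 reported; variants with pooled variance or a continuity correction give somewhat larger numbers.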
For two-sample comparisons, t tests were used for metric data and χ2 tests for nominal data (table 1). Rater agreement was determined using Cohen's κ. To examine how accuracy or sensitivity and specificity might depend on size, location, scope, age and gender, a generalised logistic mixed model was applied, with the outcome ‘accurate diagnosis’, the listed covariates as fixed effects, and ‘patient’ as random effect to account for repeated polyps in the same patient. In a second step, to further differentiate whether changes in accuracy are due to changes in sensitivity and specificity, we introduced the factor ‘adenoma’ and its interactions with the other covariates in the model. Only significant interactions were kept in the model. Similar models were applied to analyse the subgroup data, this time with ‘polyp’ as a random effect and ‘size’, ‘shape’, ‘number of images’, ‘use of pit pattern’ and ‘physician’ as fixed effects. The results are presented as forest plots, representing regression coefficients and their 95% confidence limits. All statistical analyses were carried out using SPSS V.19.0 or STATA V.12.0.
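For readers unfamiliar with Cohen's κ, the following minimal sketch computes it for two raters from first principles. The ratings are invented purely for illustration (1 = adenoma, 0 = hyperplastic) and are not study data; the text does not say how the two-rater statistic was extended to groups of three or five examiners, so that step is not shown.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which both raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight polyps by two examiners
examiner_1 = [1, 1, 0, 0, 1, 0, 1, 1]
examiner_2 = [1, 0, 0, 0, 1, 1, 1, 1]
kappa = cohens_kappa(examiner_1, examiner_2)
print(round(kappa, 2))  # 0.47
```

In this toy example the raters agree on 6 of 8 polyps (75%), but more than half of that agreement is expected by chance, so κ is only about 0.47.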
Patients and polyps
Table 1 shows details of patients, colonoscopy procedures and polyps detected. In the following, the results for both groups are mostly presented combined.
The caecum was reached in almost all cases (residual faeces or technical problems prevented full inspection of the caecum in three cases). No complications were encountered. A total of 11 carcinomas were found, but these were not counted in the calculation of adenoma rate.
In total, 724 out of 729 polyps detected were available for analysis; in the remaining five cases, no histological data were obtained or documented. A total of 681 of these polyps were either adenomas or hyperplastic polyps, but in six cases, in vivo assessment by examiners was missing. Thus, 675 polyps were left for final data analysis, of which 461 were adenomas (including six sessile serrated adenomas) and 214 hyperplastic polyps. Polyp size was <1 cm for 86.9% of adenomas and 99.5% of hyperplastic polyps. Larger-size polyps (>1 cm) were all adenomas except for one. Polyp location was right or left sided for 38.1% and 61.9% of adenomas and 28.5% and 71.5% of hyperplastic polyps, respectively.
In vivo polyp differential diagnosis
The accuracy results with respect to polyp characteristics are shown in table 2. Overall accuracy was 76.6%; sensitivity (78.1%) and specificity (73.4%) were also only moderate. Intraclass correlation was 0.30 (p<0.001), indicating that the diagnostic quality was more similar in polyps from the same patient than in polyps from different patients. In other words, the diagnostic quality was determined in 30% of cases by the individual patient and in 70% of cases by the individual polyp. The corresponding log-linear model (figure 1) analysing different factors with regards to patients, polyps, examiner and instrument characteristics demonstrates that accuracy did not depend on age, gender, polyp location or examiners’ adenoma detection rate. However, longer withdrawal times had a significant influence on accuracy (p=0.02). In polyps 6–10 mm in size, accuracy was significantly higher than for smaller size polyps (1–5 mm) for adenomatous polyps, not for hyperplastic polyps; the same was true for flat adenomas (not hyperplastic polyps). Accuracy over the study period did not differ significantly between the first half and the second half of cases included by all participating physicians (77.9% vs 74.9%).
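The overall figures can be related to one another through a 2×2 table. In the sketch below, the cell counts are not taken from table 2: they are reconstructed by applying the reported sensitivity and specificity to the 461 adenomas and 214 hyperplastic polyps and rounding to whole polyps, so they should be read as an assumption. "Positive" means an endoscopic diagnosis of adenoma.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # adenomas correctly called adenoma
        "specificity": tn / (tn + fp),   # hyperplastic polyps correctly called
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Reconstructed counts (assumption): 78.1% of 461 adenomas and
# 73.4% of 214 hyperplastic polyps, rounded to whole polyps.
m = diagnostic_metrics(tp=360, fn=101, tn=157, fp=57)
for name, value in m.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts, the reported accuracy of 76.6% is reproduced, and the negative predictive value comes out at about 61%: of the polyps called hyperplastic on endoscopy, roughly four in ten were adenomas on histology.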
Differences between endoscopes
Of the included patients, 530 were randomly assigned to the conventional (Classic Line) group and 539 to the new technology (Hi Line with I-Scan, called I-Scan) group. Adenoma detection rate (rate of patients with at least one adenoma) was not significantly different between these two groups; in addition, the differential diagnostic ability was not different. In detail, sensitivity was higher for Hi Line and specificity higher for Classic Line. Detailed results are shown in table 3. Also, the log-linear model (see figure 1) showed a significant superiority of the newer type instruments (iScan) in the correct diagnosis of adenomas, but not of hyperplastic polyps. Since there was no difference between the groups, the results are mostly presented in combination.
Accuracy of follow-up recommendations
The results for appropriate follow-up recommendations based on polyp in vivo assessment in patients with polyps up to 10 mm in size are shown in table 4. Of 409 patients with colon polyps in the study, 347 were selected as suitable for this analysis; exclusion of the remaining 62 was due to larger polyp size (n=44), missing histological data or missing in vivo assessment (n=18). Incorrect follow-up allocation was found in 30.5% of all patients; no significant differences were found between patient groups with polyps 1–5 mm or 6–10 mm.
Post hoc polyp image analysis
Table 5 shows the results of the five examiners for sensitivity and specificity on the basis of images selected as suitable by each of them, the rate of which was highly variable. The results are again shown for both types of instruments in combination. Since accuracy values for each examiner were only calculated on the basis of the set of polyps they had individually selected as suitable, a direct comparison of the observed values would be substantially biased as they related to individually selected images. We therefore used a statistical model that adjusted for covariates and polyp/image selection. In a multivariate analysis (figure 2), overall accuracy increased with polyp size (p<0.001) and the number of images available (p=0.004). In general, agreement between physicians was low: κ values were 0.45 for all five examiners, 0.55 for the three endoscopists in private practice and 0.53 for the two hospital endoscopists. However, university endoscopists were significantly more accurate in correctly diagnosing adenomas, but significantly inferior in correctly diagnosing hyperplastic polyps (figure 2). Examples of endoscopic polyp images are shown in figure 3.
Polypectomy significantly contributes to the preventive effect of screening colonoscopy by removal of adenomas as precancerous lesions. Between 15% and 60% of screened people harbour such adenomas,17,21 mostly smaller, and their histological analysis after removal significantly adds to the costs of colorectal cancer screening.3 Thus, discarding histological analysis for smaller polyps and replacing histology by endoscopic imaging to arrive at guideline-based follow-up recommendations has been suggested as cost effective.4 A large number of studies, retrospective and prospective, using live assessment and post hoc image analysis of colon polyps have been published by reference centres with mostly excellent results.12 This led to a position paper by the American Society for Gastrointestinal Endoscopy (ASGE), which set two preconditions for such an approach,12 namely, an accuracy of more than 90% in the prediction of post-polypectomy surveillance intervals and a negative predictive value for adenomatous histology of 90% or more. The recently introduced topic of serrated adenomas, whose differentiation from hyperplastic polyps still shows substantial inter-observer variability on histopathological analysis, complicates this issue further and does not support a significant role for endoscopic imaging at present.22 A limitation of our study may therefore be that no uniform histopathology was available.
The results of our study showed that neither requirement set out by the ASGE was fulfilled in an office-based setting. Inadequate allocation to surveillance intervals would have been advocated in almost a third of patients. The overall negative predictive value for intra-procedural endoscopic evaluation of polyps was only 61%. We think our study results are relevant because, in everyday clinical practice outside of reference centres with a specific interest in imaging, colonoscopists will have to bear most of the burden of endoscopic differential diagnosis within a busy schedule of screening colonoscopies. Thus, accuracy values in this setting have the biggest impact on decision making. Our study was the largest dealing with this topic12 and was performed in a uniform setting, only including office-based screening colonoscopies. In addition, we used both methodologies applied in previous studies, namely, live assessment and later (blinded) image analysis, without appreciable differences in accuracy. Finally, in the post hoc analysis we also reported further methodological details of assessment, such as case selection, percentage of images assessed and time for assessment, which were generally not previously reported.
In our study we based the assessment of polyps on the pit pattern classification which has been used in the vast majority of previous publications, mostly with excellent results,12 including a recent paper using the same instruments as in our study.10 More recently, a new classification, called NICE classification (NBI International Colorectal Endoscopic classification), was developed which is based on an imaging technology (narrow band imaging) developed by one company. This classification has also been shown to produce excellent results in polyp differential diagnosis.23–25 However, these results could not be reproduced in a community setting.13 Whether dye staining would have helped cannot be concluded from our study since it was not used, but previous results suggested that, at least in the iScan group, the image processing function10—similar to studies using narrow band imaging23–25—would have compensated for the absence of dye staining.
The issue of confidence levels in the differential diagnosis has been mentioned and discussed in the above-mentioned ASGE position paper12 and was dealt with in our study in two different ways. During the in vivo examinations, examiners were obliged to make a decision. This differs from previous studies, in which some uncertainty (in contrast to the requirements for a histopathological diagnosis of polyps) is allowed and is usually found in around 20% of cases. It could be debated whether, when imaging is used instead of histology, a definitive diagnosis should be made in every case. Although this issue could be regarded as a limitation of our study, we preferred to use an intention-to-diagnose analysis. However, for post hoc image assessment, only images allowing a confident differential diagnosis were to be selected for analysis, thus allowing some form of confidence level in the post hoc analysis. Although the later assessments had somewhat better results, they were highly variable and still did not reach the accuracy levels required.
Experience and dedication of the examiners may play a crucial role in image-based polyp differential diagnosis. The gastroenterologists participating in our study were a selection of colleagues in private practice with extensive experience in colonoscopy and in clinical research in this area.14–17 A recent in vivo analysis of polyp images concluded that community practice endoscopists fared less well in polyp differential diagnosis than academic endoscopists.13 However, in our study university gastroenterologists did not achieve better results than their private practice colleagues in the post hoc analysis. In detail, they had significantly better sensitivity and significantly worse specificity in the diagnosis of adenomas versus hyperplastic polyps. The dedication of individual endoscopists played a role, as indirectly shown by significantly better results with longer withdrawal times in our study, as is known for adenoma detection rate.26 Withdrawal times were previously demonstrated by our group not to correlate with adenoma detection,17 since they were in a rather narrow range; however, withdrawal time had an influence on differential diagnosis in this study. Therefore, it could have been that with longer times spent for withdrawal (and probably also for assessment of polyps), the results may have been improved.
Learning curves for differential diagnosis of polyps that includes pit pattern have been described in recent papers,27–30 with mostly optimistic results regarding the ease of learning: one paper even stated that 20 min of teaching would be enough.28 Again, this could not be confirmed by our data. In our trial, endoscopists could even use a glossary with sample images when doing their assessments. In addition, in contrast to an earlier study by our group on narrow band imaging for adenoma detection,31 no learning effect during the study could be found, since accuracy did not change. In the post hoc image analysis part of our study, the results did not improve. However, the images taken were found to be insufficient for differential diagnosis to a variable extent.
The effect of image technology on differential diagnosis of colon polyps has been broadly discussed and it has been postulated that newer generation high-definition endoscopes with image-processing functions fare better than conventional scopes.10,23–25 For this reason, we randomly allocated patients to colonoscopies with either the latest technology or the previous endoscope generation. In contrast to the previous studies mentioned above, we found only minor overall differences in accuracy, with a significant superiority only for a correct adenoma diagnosis. However, we doubt that this superiority in sensitivity is large enough to be clinically relevant.
As a consequence of the limited accuracy in differential diagnosis in our study, recommended follow-up intervals were inadequate in a third of cases when based on imaging as opposed to histopathological analysis. Current recommendations include a 10-year interval following a negative colonoscopy,20 that is, for those with no polyps or only hyperplastic polyps, and three- to five-year intervals in the case of adenomas. The cost-effectiveness implications of incorrect follow-up recommendations after colonoscopies without neoplasia would be even greater if the once-in-a-lifetime colonoscopy concept should prevail.32
Our results do not fully exclude future implementation of a ‘DISCARD’ strategy for (small) colonic polyps. Ways to improve differential diagnosis by imaging may include even more intensive training and familiarisation with the image characteristics of polyps. Even more promising, and probably a more realistic alternative, the subjective element of image assessment could be overcome by automatic or computerised assessment of polyps,33 as is common in CT colonography.34 Time spent and software costs then have to be weighed against the costs of the classic approach of histopathology processing and analysis. Further studies with new technology must take account of all these factors before we can justify changing our routine practice. Presently, histology still appears necessary to decide on further management of patients after colonoscopic polypectomy, even in the case of small polyps.
We thank Ulrich Gauger for initial statistical advice.
Dr Drossel died in October 2011.
Contributors Study planning, organisation, data assessment and paper writing: TR. Data assessment, post hoc image analysis, paper writing: GS. Data analysis: AT, KB and KW. Data collection and monitoring: GS and CB. Performance of study colonoscopies, documentation of results and post hoc image analysis: MM, JA and AA. Performance of study colonoscopies and documentation of results: RD, AS, MS, C-HB, J-PB and WB. Post hoc image analysis: RK and GS. GS and MM contributed equally.
Funding The study was supported by Pentax Co, Hamburg, Germany, who provided the colonoscopic equipment to the participating centres. No further financial support was given.
Competing interests None.
Ethics approval Hamburg Chamber of Physicians (PV 3272).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement All authors had access to the study data. The sponsor had no influence on data management and analysis.