Background Biopsies are obtained to confirm intestinal metaplasia and rule out prevalent dysplasia and cancer when Barrett’s oesophagus (BE) is detected at index upper endoscopy (oesophagogastroduodenoscopy [EGD]).
Aim The purpose of this systematic review was to obtain summary estimates of the prevalence of high-grade dysplasia (HGD) and oesophageal adenocarcinoma (EAC) associated with BE during index EGD for chronic GERD symptoms, defined as neoplasia detection rate (NDR) which could be used as a quality measure.
Methods An extensive search was performed within PUBMED, EMBASE and the Cochrane Library databases to identify studies in which patients underwent index endoscopy for the evaluation of the presence of BE. Two reviewers independently evaluated study eligibility and methodological quality and performed data extraction. A random-effects model (REM) based on the binomial distribution was used to calculate the pooled prevalence of BE-associated dysplasia and EAC.
Results For the calculation of dysplasia and EAC prevalence rates, a total of 11 studies with 10 632 patients met the inclusion criteria; 80.4% of patients were men, the mean age was 58.7 years and the average BE length was 3.5 cm. The pooled prevalence of EAC, HGD and LGD was 3% (95% CI 2 to 5, 9 studies: 396/10 539 patients), 3% (95% CI 2 to 5 [REM], 9 studies: 388/10 539 patients) and 10% (95% CI 7 to 15 [REM], 10 studies: 907/8945 patients), respectively. The NDR, that is, the pooled prevalence of HGD/EAC, was 7% (95% CI 4 to 10 [REM], 10 studies: 795/10 632 patients).
Conclusion The NDR at index endoscopy is approximately 7% (95% CI 4% to 10%); the lower limit of 4% is a conservative estimate that could be used as a quality measure.
- Barrett’s oesophagus
- oesophageal cancer
- Barrett’s metaplasia
- gastroesophageal reflux disease
Significance of this study
What is already known about this subject?
Multiple studies have shown that the rate of prevalent neoplasia associated with Barrett’s oesophagus (BE) is much higher than the incidence of neoplasia among those undergoing surveillance.
Index endoscopy might be the most important endoscopy.
Recent studies have reported the rates of missed lesions/neoplasia after index endoscopy to be up to 20%.
What are the new findings?
The rate of high-grade dysplasia and oesophageal adenocarcinoma (EAC) defined as neoplasia detection rate (NDR) in patients undergoing index endoscopy for screening for BE is about 7%.
This is the first meta-analysis to estimate the NDR at time of index endoscopy for chronic gastro-oesophageal reflux disease symptoms.
How might it impact on clinical practice in the foreseeable future?
NDR could serve as a quality indicator, similar to adenoma detection rate, if it is validated against the outcome of a reduction in the incidence of EAC.
Barrett’s oesophagus (BE) is detected in approximately 10%–15% of patients with gastro-oesophageal reflux disease (GERD) and 1%–2% of the general population.1–4 BE is a precursor to oesophageal adenocarcinoma (EAC), a morbid disease with a 5-year survival rate of 15%–20%.5 BE is thought to progress sequentially through grades of dysplasia, from non-dysplastic BE (NDBE) to low-grade dysplasia (LGD), high-grade dysplasia (HGD) and finally EAC. Prior studies have noted that most patients with EAC are diagnosed at the initial endoscopy; in other words, the prevalence of EAC is far greater than the incidence.6 7 The rate of prevalent EAC, that is, EAC detected at the index endoscopy in patients with BE, is approximately 5%, whereas the risk of progression to cancer (ie, incident dysplasia/EAC) is approximately 0.1%–0.3% per year in patients with non-dysplastic BE.8 9 Dysplasia detected in patients with BE is a risk factor for progression to EAC and is the basis for guiding management and endoscopic surveillance. Detection of HGD typically warrants therapeutic intervention, and LGD entails surveillance every year or endoscopic treatment with ablation of the BE segment, whereas NDBE entails surveillance every 3–5 years. Therefore, an accurate assessment for the presence of EAC or dysplasia at the time of index diagnosis is crucial.
There are limited data on the prevalence of HGD/EAC among patients undergoing index endoscopy to screen for BE in the absence of alarm symptoms, that is, a true screening population. Furthermore, although prior studies have evaluated the prevalence rates of dysplasia and EAC (NDR), several of them included patients diagnosed within 1–2 years of index endoscopy, that is, a combination of true prevalent and incident cases. Information on NDR will be particularly important in the current era, in which healthcare consumers, administrators, payers and policy-makers are paying increasing attention to healthcare costs and quality (eg, adenoma detection rate [ADR] for colonoscopy). Similarly, NDR might have important clinical implications in terms of quality care in BE and can potentially serve as a quality measure. The importance of NDR was highlighted by a recent meta-analysis by Visrodia et al, wherein it was determined that about 23.9% of EAC and 19% of HGD/EAC ever diagnosed were detected within 1 year of the initial BE diagnosis and hence were likely missed at the time of index endoscopy.10 Another meta-analysis by Menon and Trudgill concluded that about 11.3% of upper gastrointestinal cancers are missed at endoscopy up to 3 years prior to diagnosis. They also concluded that there was no significant difference between the missed rates of oesophageal and gastric cancer.11
The aim of our study was, therefore, to conduct a systematic review and meta-analysis to determine the BE-related neoplasia detection rate (NDR) in patients undergoing their index endoscopy.
We searched PUBMED, Embase, the Cochrane Central Register of Controlled Trials, abstracts from Digestive Disease Week, American College of Gastroenterology Annual Meeting abstracts and the reference lists of retrieved reports from 1998 through 2018 for studies of prevalence rate of LGD, HGD and EAC in patients undergoing index endoscopy for screening of BE using the following search terms: ‘Barrett or Barrett’s’, ‘esophagus or oesophagus’, ‘dysplasia’ and ‘esophageal adenocarcinoma’. The complete search strategy is provided in figure 1. Methods of analysis and inclusion criteria were based on PRISMA recommendations.12
Inclusion and exclusion criteria
Studies fulfilling the following criteria were included: (i) studies performed in human subjects; (ii) patients aged ≥18 years; (iii) prospective and retrospective studies with a sample size of ≥10 patients; (iv) patients undergoing index endoscopy for symptoms of chronic GERD and no alarm symptoms; (v) studies that documented the prevalence rates of LGD, HGD or EAC. Studies published in languages other than English were excluded. Because we calculated pooled estimates of prevalence rates of LGD/HGD/EAC, studies that did not report data on prevalence rates were not included. Studies that included patients having surveillance endoscopy for a known diagnosis of BE were excluded. Case reports, case series, review articles, letters to editors and conference abstracts with limited information were excluded. When two studies overlapped in the same population, the most comprehensive and most recently published study was considered. Conflicts in study selection were resolved by consensus or by contacting the corresponding author (PS).
Data extraction and definitions
Two investigators (SP and MD) independently screened all titles and abstracts to identify studies that met the inclusion criteria and extracted relevant data using a standardised form. Discrepancies between the two investigators were resolved by discussion and re-examination of the corresponding studies with a senior investigator (PS). All variables of interest were collected on a standardised form, which included the first author of the study, year of publication, type of the study, total sample size, study setting, mean age of the patients, percentage of males, mean length of the segment of BE, number of patients with NDBE, LGD, HGD and EAC. Studies with multiple sequential reports of increasing follow-up durations were considered as one study, with the most recent results utilised for the analysis.
Outcomes and quality assessment
The primary outcome of the pooled analysis was NDR on index endoscopy for BE. NDR was defined as the rate of detection of HGD and EAC in patients having index endoscopy for chronic GERD symptoms to evaluate for the presence of BE. LGD was not included in the definition of NDR for several reasons: (1) the significant interobserver variation and the wide variation in the accuracy of diagnosing LGD; (2) lack of confirmation of LGD by expert pathologists in some of the reported studies; (3) inconsistency in the reporting of LGD with or without indefinite dysplasia. However, the prevalence rates of LGD, HGD and EAC were also recorded separately. Prevalence rates with CIs and P values were included. For studies that did not provide P values, authors were contacted to find out whether these data were calculated but not reported.
The quality of each study was assessed by the authors independently using the Newcastle-Ottawa scale.13 Quality was assessed based on the selection of the study group and assessment of outcome.
Data analysis and statistical methods
The primary objective of this systematic review was to assess rates of HGD or EAC detected at the time of index endoscopy, whereas secondary objectives included calculating the pooled estimate rates for LGD, HGD and EAC separately at the time of index endoscopy. Statistical analysis was performed with R V.3.2.2. For all analyses, we used the meta package.4 Data were pooled using a random-effects model (REM) based on the binomial distribution. Heterogeneity between the studies was analysed using the inconsistency index (I2) statistic.14 For the I2 statistic, heterogeneity was defined as low (25%–50%), moderate (50%–75%) or high (>75%).15 Publication bias was assessed using funnel plots, and asymmetry of the funnel was evaluated with the Egger regression test.16 Forest plots of all outcomes were created using the REM, as shown in figures 2–6. A p value of <0.05 was considered statistically significant. NDR was also reported separately for US and non-US studies.
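As an illustration of the pooling and heterogeneity steps described above, the following is a minimal sketch in Python rather than the R meta package used in this review. It assumes a DerSimonian-Laird random-effects estimator on logit-transformed proportions, which is a common approach but not necessarily the exact binomial model fitted here. The function name and the study counts are hypothetical and serve only as an example.

```python
import math

def pool_proportions_dl(events, totals):
    """Pool proportions with a DerSimonian-Laird random-effects model (logit scale).

    Assumes every study has 0 < events < total (no continuity correction).
    """
    # Logit-transformed proportions and their approximate within-study variances.
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) weights, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # I^2: percentage of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled logit and a 95% Wald interval.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x):
        # Back-transform from the logit scale to a proportion.
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": expit(y_re),
        "ci_low": expit(y_re - 1.96 * se),
        "ci_high": expit(y_re + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical per-study counts (cases, sample size); NOT data from the included studies.
events = [12, 30, 8, 45, 20]
totals = [200, 350, 150, 400, 500]
result = pool_proportions_dl(events, totals)
print(result)
```

The returned I2 value can then be read against the thresholds given above (low, moderate or high heterogeneity).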
We identified a total of 5715 studies, of which 764 studies were included for abstract review. After applying our inclusion and exclusion criteria, the search was narrowed to 68 studies, which were reviewed in full detail. The flowchart of the article search and selection process is shown in figure 1 and online supplementary table 1. After a careful review of these 68 studies, 11 studies were included in the final analysis. We were unable to identify any randomised controlled trials that met the inclusion criteria for the systematic review. All studies were published between 1998 and 2018. Characteristics of the selected studies are detailed in table 1 and online supplementary table 2. Overall, the 11 studies included 10 632 patients, of whom 80.4% were men, with a mean age of 58.7 years. The average BE length was reported in 4 of the 11 studies and was calculated as 3.5 cm.
Analysis of primary outcomes
The primary outcome, NDR, was calculated as the pooled prevalence rate of HGD/EAC. The pooled prevalence of HGD/EAC was 7% (95% CI 4% to 10%, p<0.001, I2=96% and Egger’s test for bias, p=0.193). This estimate was calculated from 10 studies, in which 795 of 10 632 patients had HGD/EAC (figure 2). For the secondary outcomes, pooled prevalence rates of LGD, HGD and EAC were calculated separately. The pooled prevalence of LGD was 10% (95% CI 7% to 15%, p<0.0001, I2=96% and Egger’s test for bias, p=0.663), calculated from 10 studies in which 907 of 8945 patients had LGD (figure 3). The pooled prevalence of HGD was 3% (95% CI 2% to 5%, I2=94% and Egger’s test for bias, p=0.086), calculated from 10 studies in which 388 of 10 539 patients had HGD (figure 4). Finally, the pooled prevalence of EAC was 3% (95% CI 2% to 5%, p<0.0001, I2=92% and Egger’s test for bias, p=0.139), calculated from 10 studies in which 396 of 10 539 patients had EAC (figure 5).
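For orientation, the crude (unweighted) proportions can be computed directly from the event counts and denominators reported above. The short Python sketch below does only this arithmetic; the variable and function names are illustrative. The crude values are close to, but not identical with, the pooled estimates, because a random-effects model weights each study rather than pooling all patients into a single sample.

```python
# Event counts and denominators as reported in the text above.
reported = {
    "HGD/EAC (NDR)": (795, 10632),
    "LGD": (907, 8945),
    "HGD": (388, 10539),
    "EAC": (396, 10539),
}

def crude_percent(events, total):
    """Unweighted proportion of patients with the finding, in per cent."""
    return 100 * events / total

crude = {name: crude_percent(e, n) for name, (e, n) in reported.items()}
for name, pct in crude.items():
    print(f"{name}: {pct:.1f}%")
```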
The pooled estimate of any dysplasia or cancer (LGD, HGD and EAC combined) was 20% (95% CI 15% to 27%, p<0.0001, I2=97% and Egger’s test for bias, p=0.83), calculated from 1637 of the 8642 patients in studies that reported LGD/HGD/EAC (figure 6).
To explore the causes of heterogeneity between studies, separate analyses were performed in which each study was removed in turn from the final pooled analysis and heterogeneity was re-evaluated. No significant difference in heterogeneity was noted even after excluding potential ‘outliers’ (see online supplementary table 3). A sensitivity analysis comparing studies with a Newcastle-Ottawa score >5 and ≤5 did not show a significant change in heterogeneity based on study quality (online supplementary table 4).
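The leave-one-out approach described above can be sketched as follows. This is an illustrative Python example, not the analysis code used in this review: it assumes I2 is computed from Cochran’s Q on logit-transformed proportions, and the function names and study counts are hypothetical.

```python
import math

def i_squared(events, totals):
    """Return I^2 (%) from Cochran's Q on logit-transformed proportions."""
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    # Inverse-variance weights; the logit variance is 1/e + 1/(n - e).
    w = [1 / (1 / e + 1 / (n - e)) for e, n in zip(events, totals)]
    mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

def leave_one_out(events, totals):
    """Recompute I^2 with each study omitted in turn."""
    results = []
    for i in range(len(events)):
        ev = events[:i] + events[i + 1:]
        tot = totals[:i] + totals[i + 1:]
        results.append((i, i_squared(ev, tot)))
    return results

# Hypothetical counts: three similar studies plus one with a much higher rate.
events = [10, 20, 30, 90]
totals = [100, 200, 300, 300]
for omitted, i2 in leave_one_out(events, totals):
    print(f"omitting study {omitted}: I2 = {i2:.1f}%")
```

If heterogeneity falls sharply only when one particular study is omitted, that study is a candidate outlier; in the present analysis, no single exclusion changed heterogeneity materially.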
NDR in US and non-US countries
Pooled estimates of NDR were calculated separately for US and non-US studies, and NDR was higher in US studies than in non-US studies. In US studies, the pooled estimates were NDR 11% (7%–16%), LGD 16% (10%–25%), HGD 5% (3%–8%) and EAC 6% (4%–10%), compared with NDR 5% (3%–9%), LGD 7% (4%–11%), HGD 2% (1%–5%) and EAC 2% (1%–4%) in non-US studies. These data are presented in online supplementary table 5.
Publication bias was assessed using funnel plots and asymmetry of the funnel plot was evaluated with the Egger regression test. Funnel plot and Egger test suggested no evidence of publication bias.
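The Egger regression test regresses each study’s standardised effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a minimal illustration in plain Python under that standard formulation. The function name and input values are hypothetical, and the p value (from a t distribution with n−2 degrees of freedom) is deliberately not computed so that the example needs no external libraries.

```python
import math

def egger_intercept(effects, std_errors):
    """Egger regression: standardised effect regressed on precision.

    Returns the intercept, its standard error and the t statistic.
    """
    z = [y / se for y, se in zip(effects, std_errors)]  # standardised effects
    x = [1 / se for se in std_errors]                   # precisions
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n

    # Ordinary least squares for z = intercept + slope * x.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and the standard error of the intercept.
    resid = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + x_bar ** 2 / sxx))

    # Guard against a perfect fit, where the t statistic is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat

# Hypothetical logit prevalences whose size grows with the standard error,
# i.e. a deliberately asymmetric example; NOT data from the included studies.
std_errors = [0.10, 0.20, 0.30, 0.40, 0.50]
effects = [-1.84, -1.72, -1.535, -1.41, -1.245]
intercept, se_intercept, t_stat = egger_intercept(effects, std_errors)
print(intercept, se_intercept, t_stat)
```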
A sensitivity analysis based on the time of publication of the study, classified as before or after 2011, was performed because the AGA guidelines17 on screening for BE were published in 2011. No significant difference based on the year of publication was noted in NDR; the only difference noted was for LGD (online supplementary table 6).
This is the first attempt at measuring NDRs in a systematic fashion, among patients undergoing average risk index screening endoscopy for the presence/absence of BE. In this study, we defined NDR both in the US and non-US population as the rate of detection of HGD/EAC in patients having their screening endoscopy to evaluate for BE. No previous systematic review or meta-analysis has examined the pooled prevalence estimates of NDR. Our analysis included a few population studies and the total sample size was over 10 000 patients. Patients included in our meta-analysis had characteristics that are representative of the BE population.
The results of our study show that in patients with BE at index endoscopy, the NDR was 7%. Therefore, prevalent cases of HGD/EAC can be expected in approximately 1 in 10–15 oesophagogastroduodenoscopies (EGDs) when BE is diagnosed at screening. Given the relatively low rates of progression of NDBE to advanced neoplasia (approximately 0.1%–0.3% per year), this high rate of prevalent cases means that the index examination is the most important examination the patient will have for preventing death from neoplasia, with a yield far higher than that of subsequent surveillance examinations.
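The ‘1 in N’ figure follows directly from the reciprocal of the detection rate. The short Python sketch below shows the arithmetic for the pooled NDR and for the limits of its 95% CI as reported in this review; the function name is illustrative.

```python
def one_in_n(rate_percent):
    """Number of index endoscopies per expected HGD/EAC case at a given rate."""
    return 100 / rate_percent

# Pooled NDR and 95% CI limits reported in this review (per cent).
ndr, ci_low, ci_high = 7.0, 4.0, 10.0

print(f"Point estimate: 1 in {one_in_n(ndr):.1f}")
print(f"At the upper CI limit: 1 in {one_in_n(ci_high):.1f}")
print(f"At the lower CI limit: 1 in {one_in_n(ci_low):.1f}")
```

At the point estimate this corresponds to roughly 1 in 14 examinations; across the 95% CI, the figure ranges from 1 in 10 to 1 in 25.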
A consensus estimate of the prevalence of advanced neoplasia at the index BE examination (such as NDR) might also have utility as a quality indicator. The Centers for Medicare and Medicaid Services and private payers will likely continue to demand documentation of high-value care in their transition away from the existing fee-for-service payment model towards a value-based payment system. In order to document high-value care, quality measures will need to be developed and threshold performance values calculated. As it stands today, there are no quality measures for BE that have been adopted by payers for use in value-based payment models despite the growing need. NDR could serve as an objective, outcome-based, easily measurable metric of whether the endoscopist is performing a high-quality EGD. NDR is an outcome measure that might be associated with process measures such as examining the oesophagus adequately, obtaining an adequate number of biopsies and other elements of the screening examination. Presumably, endoscopists performing a high-quality examination in patients with BE would be expected to find a higher proportion of patients with advanced neoplasia, similar to the concept of ADR during colonoscopy. Also, similar to ADR’s association with death from colorectal cancer, further research is needed to demonstrate that a higher NDR on the index examination is associated with reduced mortality from oesophageal cancer. Further prospective studies are needed to confirm the current findings on NDR. Moreover, before NDR can be proposed as a quality measure, future studies should examine the association of NDR with interval/missed HGD/EAC. If NDR is validated, this metric can be applied to screening endoscopies, but not to a surveillance population.
Comparison of the pooled NDR between US and non-US studies showed considerable variation. This could be due to various reasons, including possibly different characteristics of the patients being screened for BE. It may also reflect differences in screening programme organisation, that is, factors related to the healthcare delivery system, such as the extent to which screening is provided within the context of an organised programme or practice environment, as observed in other cancer screening/detection rates. Further studies are needed to understand these variations in NDR.
There are several limitations to our systematic review. First, there was significant heterogeneity for some of the major results. Although we used a random-effects model, this heterogeneity may still have influenced the pooled estimates. Unfortunately, given the paucity of data on patient characteristics in most of the studies, causes of heterogeneity based on patient characteristics, including demographics, length of Barrett’s segment, proton pump inhibitor use, smoking and non-steroidal anti-inflammatory drug use, could not be tested. However, we tested for heterogeneity based on the quality of the studies. All of the studies were either from tertiary centres or multicentre studies.
Second, the quality assessment showed that not all included studies were of high quality, which might lead to some bias in the final statistical results. To address this issue, we performed a sensitivity analysis based on the quality of the studies included in this meta-analysis, which did not show a difference in the results (online supplementary table 4). Third, the number of biopsies performed during index endoscopies and the type of endoscopes used were not reported in all studies; low adherence to biopsy protocols, if present, could lead to missed dysplasia and ultimately to a decreased NDR. As with most procedures, these results are open to influence by interobserver variability and sampling error. It is quite feasible that a study might report a low NDR primarily due to a systematic under-appreciation of neoplasia by the pathologist, despite a high-quality examination by the endoscopist. Lastly, over the extended period between the earliest and most recent studies included in our systematic review, from 1998 to 2018, practice guidelines for screening have changed considerably. Moreover, the type and image resolution of endoscopes have evolved considerably, which could potentially affect the calculation of NDR. We performed a sensitivity analysis to evaluate for changes in the practice of screening for BE based on the AGA guidelines,17 which were published in 2011. However, no significant difference was noted.
In conclusion, our systematic review suggests that NDR (HGD/cancer) in patients undergoing average risk initial screening upper endoscopy for BE is approximately 7%, with a 95% CI of 4%–10%. Given that the development of dysplasia/cancer in patients with BE (ie, incident cases) undergoing surveillance is low, the index endoscopy might be the most important endoscopy for patients with BE. While we believe that NDR is a viable potential quality metric, we currently lack data to know exactly where the bar should be set; hence, we recommend the lower limit of our study findings, 4%, as a conservative estimate for NDR until further studies are done to validate NDR as a quality metric. Further research is needed on the performance of this metric in community (vs tertiary) settings, as well as on the association of this metric with outcome measures, such as whether a high NDR leads to a reduction in the incidence of cancer and cancer-related death.
Contributors SP: concept and design, interpretation of results and drafting the manuscript. MD: data collection and critical revision of the manuscript for intellectual content. AV: critical revision of the manuscript for intellectual content. AP: data collection and critical revision of the manuscript for intellectual content. VTC: data collection. KFK: statistical analysis. NG: critical revision of the manuscript for intellectual content. NJS: critical revision of the manuscript for intellectual content. PS: concept and design, interpretation of the results and drafting manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Patient consent for publication Not required.