We read with interest the article by van der Sommen et al,1 and would like to raise some important and relevant points regarding the adoption of artificial intelligence (AI)-assisted endoscopy in clinical practice. Computer-aided diagnosis systems have been successfully applied to all segments of the GI tract, even to the diagnosis of dysplasia in Barrett’s oesophagus, which is the bane of expert endoscopists.2 Recently, colonoscopy with real-time computer-aided detection (CADe) systems achieved higher polyp detection rates than expert endoscopists.3 However, in adopting CADe for conventional oesophagogastroduodenoscopy (OGD), discussion of how to improve the detection of hard-to-find gastric cancers (GCs) is inevitable.
The stomach has a wider, bent lumen than other parts of the GI tract, such as the oesophagus and colon, making observation of the entire stomach without blind spots more laborious. In routine OGD, endoscopists have to distinguish gastric neoplasms from the surrounding gastritic mucosa at a more distant view, as opposed to the detection of colorectal neoplasms and Barrett-related dysplasia in near-view images. In addition, early GCs usually show only subtle elevation or depression, and their irregular appearance is easily hidden in the coarse background gastritis caused by Helicobacter pylori infection. Therefore, it is sometimes difficult even for experts to discover early GCs, particularly those of smaller size. Such difficulties in early GC detection might lead to missed GCs in surveillance OGD, resulting in variations in GC detection rates among endoscopists.4 5 Indeed, our data from 17 156 screening OGDs performed by 11 endoscopists from 2017 to 2018 indicate that the average detection rates for total early GCs and minute GCs (ie, below 5 mm in major diameter) were 0.73% (range 0.17–2.63) and 0.19% (range 0.00–0.89), respectively. These data illustrate the current variation in GC detection among endoscopists, especially for the detection of minute GCs, even though the average GC detection rate at our facility was as high as the national standard of Japan. This interobserver variability may reflect the fact that screening OGD is performed by a wide range of clinicians, from inexperienced trainees to skilful experts who can frequently detect such minute GCs. To address such interobserver variation, Hirasawa et al reported a machine learning system for upper endoscopy focused on GC detection.
Their CADe system using deep learning achieved an excellent overall sensitivity for GC detection of 92.2%, but only 16.7% for minute GCs.6 However, that study noted a limitation in its image inclusion criteria: the training and validation data sets of the AI algorithm included images obtained with image-enhanced endoscopy (IEE), such as narrow band imaging (NBI) and the indigo carmine dye contrast method, whereas the test data sets comprised only high-quality white light images (WLI). As current AI approaches seem unable to detect minute GCs reliably, we selected five cases of minute GCs that had been laboriously found by the endoscopist with the highest detection rate in our facility, and validated the associated images from screening OGDs combined with IEE using our AI system. Interestingly, the AI failed to detect any of these minute GCs in WLIs (figure 1A). Nonetheless, three of the five lesions were detectable in at least one non-magnified IEE view (figure 1B,C), despite multiple GC-focused images being used. In real OGD procedures, even experts frequently fail to notice hard-to-find lesions at the first white light screening, but eventually encounter such lesions by IEE using NBI (figure 1B),7 or with the contrast method of indigo carmine spraying (figure 1C).8
Hence, we propose that effective new AI methods should identify hard-to-find GCs, such as minute GCs, using IEE in addition to WLI, in accordance with the expert clinical strategy for GC detection. Feasible AI applications for IEE,9 10 together with collaboration with skilful endoscopists, will be key factors in revolutionising the entry and widespread clinical acceptance of routine AI-assisted OGD.
The authors thank Yusuke Kato and other engineers at AI Medical Service (Tokyo, Japan) for their cooperation in developing the convolutional neural networks.
Contributors Study conception and design: DM and MY. Acquisition of data: DM. Analysis and interpretation of data: DM, YA and TT. Drafting of the article: DM and MY. Critical revision of the article for important intellectual content: MY, YA and TT. Final approval of the article: TT.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests TT is a shareholder of AI Medical Service, and YA has received lecture fees from Takeda Pharmaceutical.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Patient consent for publication Not required.
Ethics approval We certify that the procedures were performed in accordance with the ethical standards of the responsible committee on human experimentation (Institutional Review Board) and the modified Declaration of Helsinki in 2000, as well as national law. Informed consent or substitute for it was obtained from all patients for being included in the study.
Provenance and peer review Not commissioned; internally peer reviewed.