
Automated sizing of colorectal polyps using computer vision
  1. Mohamed Abdelrahim1,
  2. Hiroyasu Saiga2,
  3. Naoto Maeda2,
  4. Ejaz Hossain1,
  5. Hitoshi Ikeda2,
  6. Pradeep Bhandari1
  1. Gastroenterology, Queen Alexandra Hospital, Portsmouth, UK
  2. Medical AI Research Department, NEC Corporation, Tokyo, Japan
  Correspondence to Pradeep Bhandari, Gastroenterology, Queen Alexandra Hospital, Portsmouth, UK; pradeep.bhandari@porthosp.nhs.uk


Message

Colorectal polyp size is an important biomarker that influences management decisions, but the subjective methods currently used are flawed. We explored two computer vision (CV) techniques for binary classification of polyp size as either ≤5 mm or >5 mm. First, we used premeasured phantom polyps (videos of 22 such polyps) fixed on a pig colon model to explore the concept of automated sizing using a structure from motion (SfM) approach and compared it with sizing by 10 independent endoscopists: overall, the average diagnostic accuracy of the SfM system (85.2%) was superior to the endoscopists' judgement (59.5%). Second, we developed a deep learning model based on convolutional neural networks (CNN) and found 80% accuracy on 10 videos of human polyps. Real-time automated polyp sizing, when combined with artificial intelligence (AI)-assisted polyp characterisation, could improve polyp management strategies.

In more detail

CV techniques

CV can be defined as the ability of machines to process and understand visual data, automating tasks that would normally require the human eye. To perform automated polyp size classification, we employed two CV techniques: SfM and deep learning (DL).

SfM is a photogrammetric imaging technique that algorithmically recovers the three-dimensional (3D) structure of an object from multiple two-dimensional (2D) images and is commonly used in topographic studies. SfM finds matching points in the input images and recovers the 3D structure by solving the epipolar constraint equation derived from these matching points, as briefly illustrated in figure 1. The algorithm calculates the camera's pose as a rotation matrix and a translation vector from the matching points. Finally, we apply mathematical formulas to compute the distance between the polyp and the endoscope, and that distance is used to compute polyp size in real time. Compared with DL, the SfM technique uses less data, making it relatively easier and quicker to translate into a clinical device.

Figure 1

This figure illustrates the concept of structure from motion. The relative camera movement can be obtained using the epipolar constraint equation.
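
To make this pipeline concrete, the following is a minimal sketch, not the authors' implementation, of how relative camera motion and 3D structure can be recovered from two endoscope frames with OpenCV: matching points are found, the essential matrix is estimated from the epipolar constraint, and the camera rotation and translation are recovered before triangulating the matched points. The frame file names and the camera intrinsic matrix are assumptions, and the reconstruction is defined only up to scale until a calibrated reference is supplied.

```python
# Minimal SfM sketch (assumptions, not the authors' code): recover relative
# camera motion between two endoscope frames and triangulate matched points.
import cv2
import numpy as np

# Assumed camera intrinsics (focal lengths and principal point) and frame files.
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])
img1 = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# 1. Find matching points between the two views (ORB features, brute-force matching).
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2. Solve the epipolar constraint for the essential matrix E, then decompose it
#    into the relative rotation R and translation t of the camera.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# 3. Triangulate the matched points to obtain 3D structure (up to an unknown scale).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
pts3d = (pts4d[:3] / pts4d[3]).T

# A known reference (e.g. calibrated scope motion) is still needed to convert
# these relative distances into a polyp diameter in millimetres.
```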

DL, based on neural networks, classifies the input images after learning from a large amount of training data. The quality and accuracy of the training data set are therefore very important in this technique. The biggest challenge here is obtaining accurately sized polyp images and videos in the absence of a validated sizing system. We explored both SfM and DL in this study and developed two separate models, designed to categorise polyps as either category A (≤5 mm) or category B (>5 mm).

Experiments and results

Evaluation of SfM technique

The phantom polyps were made from silicone sealant and precisely cut into different sizes ranging from 1 mm to 10 mm. These artificial polyps were designed to mimic real human polyps in shape and colour and included both sessile and flat morphologies. The polyps were accurately measured, calibrated and validated by two researchers independently. The polyps were then fixed on a pig colon model without altering their shape or size. We used a Fujifilm colonoscope to examine the pig colon, and the examination was video recorded. The recorded videos were reviewed and annotated to label the different polyps and their predetermined sizes, and we used these videos to evaluate the SfM-based sizing system. Figure 2 illustrates the experimental setting and environment, and online supplemental video 1 shows an example of the video recordings used for development and testing of the system.

Figure 2

Images (A) and (B) are examples of the phantom polyps as viewed by the endoscope in the pig colon model. Image (C) shows the pig colon model being scoped. Image (D) shows the real-time endoscopy view during the experiment.

The SfM model was tested on 22 videos of phantom polyps, equally divided between the two size categories (≤5 mm and >5 mm). We also asked 10 endoscopists with varying degrees of colonoscopy experience to watch the same 22 videos and categorise the polyps as either ≤5 mm or >5 mm. Mean diagnostic accuracy was calculated and compared between the two groups using a t-test.
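
For illustration only, the sketch below shows how mean diagnostic accuracy can be computed for each group and compared with an independent-samples t-test using SciPy; the accuracy values are placeholders, not the study data.

```python
# Illustrative only: comparing mean diagnostic accuracy between the automated
# system and the endoscopist group with an independent-samples t-test.
# The per-reader accuracy values below are placeholders, not the study data.
import numpy as np
from scipy import stats

sfm_accuracy = np.array([0.86, 0.85, 0.84, 0.86])            # hypothetical system runs
endoscopist_accuracy = np.array([0.55, 0.64, 0.59, 0.62, 0.57,
                                 0.60, 0.58, 0.61, 0.56, 0.63])  # hypothetical readers

# Welch's variant of the t-test (does not assume equal group variances).
t_stat, p_value = stats.ttest_ind(sfm_accuracy, endoscopist_accuracy, equal_var=False)

print(f"mean automated accuracy:   {sfm_accuracy.mean():.1%}")
print(f"mean endoscopist accuracy: {endoscopist_accuracy.mean():.1%}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```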

Overall, average diagnostic accuracy of the automated sizing system in the animal model was 85.2%, compared with 59.5% in the endoscopist group (p<0.0001). In category A (≤5 mm), the automated sizing system and endoscopists showed diagnostic accuracy of 81.2% and 66%, respectively. In category B (>5 mm), the automated sizing system accuracy was 87.5%, whereas the endoscopists significantly underestimated the polyp sizes in this category and achieved an accuracy of 42.3%. Table 1 summarises the results of this experiment.

Table 1

Accuracy of automated polyp sizing SfM model and endoscopists in binary classification of colorectal polyps based on their size in an experiment setting (n=22)

Evaluation of CNN DL technique

Here, we used a DL model based on the VGG-16 architecture for binary polyp size classification. We used 219 colonoscopy videos containing 301 polyps for training and validation: 80% of the polyp sequences were used for training and the remaining 20% for validation. These polyps were reviewed and sized by three experts, and the mean of the experts' size estimates was used to train the AI model. We employed general data augmentation techniques, such as flipping, random cropping and colour conversion. Testing was performed on a completely separate data set containing 10 real-time colonoscopy video recordings. These videos were all recorded with forceps-assisted sizing and were reviewed and sized by three experts, with the mean of the experts' size estimates used as the ground truth. The CNN model achieved an accuracy of 80% in classifying polyps as ≤5 mm or >5 mm. Table 2 summarises the diagnostic accuracy of our system on human polyp video recordings, and online supplemental video 2 shows how our sizing system works on real human polyps.
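
As an indication of what such a set-up can look like, the sketch below, an assumption rather than the authors' code, adapts a torchvision VGG-16 to the two size categories and applies the augmentations mentioned above (flip, random cropping and colour conversion); the directory layout, pretrained weights and hyperparameters are placeholders.

```python
# Minimal sketch (assumptions, not the authors' code): VGG-16-based binary
# classifier for polyp size (<=5 mm vs >5 mm) with simple data augmentation.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random cropping
    transforms.RandomHorizontalFlip(),                      # flip
    transforms.ColorJitter(brightness=0.2, contrast=0.2,    # colour conversion
                           saturation=0.2, hue=0.05),
    transforms.ToTensor(),
])

# Assumed directory layout: polyp_frames/train/{le5mm,gt5mm}/*.png
train_set = datasets.ImageFolder("polyp_frames/train", transform=train_transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Start from an ImageNet-pretrained VGG-16 and replace the final layer with
# two outputs: category A (<=5 mm) and category B (>5 mm).
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

model.train()
for images, labels in train_loader:        # one pass over the training split
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```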

Table 2

Accuracy of an automated polyp sizing CNN model in binary classification of colorectal polyps based on their size (n=10)

Comments

Polyp size is related to the risk of cancer and influences surveillance intervals and therapeutic approaches.1 The 5 mm threshold is particularly important because it influences the implementation of optical diagnosis-based strategies, including resect and discard/diagnose and leave approaches, and also allows endoscopists to make therapeutic choices between cold and hot polypectomy.2 However, visual size estimation by endoscopists has significant interobserver variability and error rates,3 and other methods have shown variable and sometimes contradictory results.4 AI is rapidly becoming part of our endoscopy reality, and data on AI-assisted polyp characterisation look very promising, but without accurate sizing its impact on our practice will be limited. To date, however, there is a lack of data on AI-assisted sizing of colorectal polyps, and this report provides one of the very first experiences in this area.

We decided to use artificially created polyps for the early conceptualisation and development of automated polyp sizing systems because human estimation of size, like other current methods, is imperfect. This also applies to measuring the size of polyps after removal, especially for small polyps. AI models are only as accurate as the quality and accuracy of their input data, which is why we adopted this approach to develop a robust AI model.

We developed the SfM-based and CNN-based approaches separately. We found that the SfM-based approach is an appropriate way to algorithmically compute the size of our artificially created polyps in the pig experiment, but the technique proved very challenging when applied to a real-world situation (data not shared here). The CNN-based approach worked well for the classification of polyp sizes in real time, but it requires a large amount of high-quality and accurately sized data. We have therefore been studying both approaches in parallel and have shared the best results in the tables. Overall, SfM worked well for the pig experiment and CNN for the human colon experiment.

Development of this concept is fraught with challenges. We have highlighted above the reasons for not using real human polyps in the early developmental phase, namely the lack of criteria/measures for accurate sizing. Moreover, using the SfM technique on artificial polyps in a pig colon makes it challenging for the system to find accurate matching points because of the smooth glistening surface and uniform texture of the pig colon. This is less of a problem when working on real human polyp videos, given the pit and vascular pattern of the human colon, but other factors, such as light reflections, create their own challenges. Technical solutions, for example image preprocessing, could mitigate these issues, as sketched below. The DL approach, on the other hand, needs a large amount of data to train the network, and more stringent measures are needed to ensure the accuracy of polyp sizing and the quality of the training data sets (ground truth).
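
As one example of such preprocessing, the sketch below, an assumption rather than the authors' pipeline, masks specular highlights and equalises local contrast before feature detection, which can make SfM matching more robust on smooth, glistening mucosa.

```python
# Possible preprocessing step (an assumption, not the authors' pipeline):
# mask specular highlights and boost local contrast before feature detection.
import cv2
import numpy as np

def preprocess_for_matching(bgr_frame: np.ndarray):
    """Return a contrast-enhanced grayscale frame and a mask excluding highlights."""
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)

    # Mask out saturated, glistening regions (specular reflections).
    _, highlight = cv2.threshold(gray, 230, 255, cv2.THRESH_BINARY)
    highlight = cv2.dilate(highlight, np.ones((5, 5), np.uint8))
    valid_mask = cv2.bitwise_not(highlight)

    # Equalise local contrast (CLAHE) so weak mucosal texture yields more matches.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)
    return enhanced, valid_mask

# The mask can then be passed to a feature detector, for example:
# kp, des = cv2.ORB_create().detectAndCompute(enhanced, valid_mask)
```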

We have proven the feasibility of developing and applying two different CV techniques for automated polyp sizing. In the process, we have identified various challenges and strengths of each of these techniques and this will allow us to develop a final product, which can be used for real-time automated sizing of polyps during colonoscopy.

Ethics statements

References

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors MA: study concept and design, drafting of the manuscript, analysis and interpretation of data and critical revision of the manuscript. HS: study concept and design, software implementation, analysis and interpretation of data, and revision of the manuscript. NM: study concept and design, software implementation, analysis and interpretation of data, and revision of the manuscript. EH: data acquisition and revision of the manuscript. HI: study concept and design, software implementation, analysis and interpretation of data, and revision of the manuscript. PB: study concept and design, drafting of the manuscript, analysis and interpretation of data, and critical revision of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests MA and EH have no competing interests. PB received research grants from NEC Japan, Fujifilm, Olympus, Pentax, Boston Scientific, Medtronic and 3-D Matrix. HS, NM and HI are part of the engineering team employed by NEC Japan, which collaborated with the remaining authors.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.
