Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges
Jha, Debesh; Sharma, Vanshali; Banik, Debapriya; Bhattacharya, Debayan; Roy, Kaushiki; Hicks, Steven; Tomar, Nikhil Kumar; Thambawita, Vajira L B; Krenzer, Adrian; Ji, Ge-Peng; Poudel, Sahadev; Batchkala, George; Alam, Saruar; Ahmed, Awadelrahman M.A.; Trinh, Quoc-Huy; Khan, Zeshan; Nguyen, Tien-Phat; Shrestha, Shruti; Nathan, Sabari; Gwak, Jeonghwan Gwak; Jha, Ritika Kumari; Zhang, Zheyuan; Schlaefer, Alexander; Bhattacharjee, Debotosh; Bhuyan, M.K.; Das, Pradip K.; Fan, Deng-Ping; Parasa, Sravanthi; Ali, Sharib; Riegler, Michael Alexander; Halvorsen, Pål; de Lange, Thomas; Bagci, Ulas
Peer reviewed, Journal article
Published version
Date
2024
Original version
10.1016/j.media.2024.103307
Abstract
Automatic analysis of colonoscopy images has been an active field of research, motivated by the importance of early detection of precancerous polyps. However, detecting polyps during a live examination can be challenging due to factors such as variation in skill and experience among endoscopists, lack of attentiveness, and fatigue, all of which contribute to a high polyp miss rate. There is therefore a need for an automated system that can flag missed polyps during the examination and improve patient care. Deep learning has emerged as a promising solution to this challenge, as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time, improving the accuracy of diagnosis and enhancing treatment. In addition to an algorithm's accuracy, transparency and interpretability are crucial for explaining the whys and hows of its predictions. Further, conclusions based on incorrect decisions may be fatal, especially in medicine. Despite these pitfalls, most algorithms are developed on private data, as closed-source or proprietary software, and the methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. The Medico 2020 challenge received submissions from 17 teams, while the MedAI 2021 challenge gathered submissions from another 17 distinct teams the following year. We present a comprehensive summary, analyze each contribution, highlight the strengths of the best-performing methods, and discuss the potential for clinical translation of such methods.
Our analysis revealed that participants improved the Dice coefficient from 0.8607 in 2020 to 0.8993 in 2021, despite the addition of more diverse and challenging frames (containing irregular, smaller, sessile, or flat polyps), which are frequently missed during routine clinical examination. For the instrument segmentation task, the best team obtained a mean Intersection over Union (mIoU) of 0.9364. For the transparency task, a multi-disciplinary team, including expert gastroenterologists, assessed each submission and evaluated the teams on open-source practices, failure case analysis, ablation studies, and the usability and understandability of their evaluations, to gain a deeper understanding of the models' credibility for clinical deployment. The best team obtained a final transparency score of 21 out of 25. Through this comprehensive analysis of the challenges, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage subjective evaluation for building more transparent and understandable AI-based colonoscopy systems. Moreover, we discuss the need for multi-center and out-of-distribution testing to address the current limitations of the methods, reduce the cancer burden, and improve patient care.
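For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how the Dice coefficient and Intersection over Union are typically computed for binary segmentation masks. The function names and the flat 0/1 pixel-list representation are illustrative assumptions, not taken from the challenge's evaluation code.

```python
# Illustrative sketch (not the challenge's official evaluation code):
# Dice and IoU for binary segmentation masks, represented here as flat
# lists of 0/1 pixel labels for simplicity.

def dice_coefficient(pred, target):
    """Dice = 2*|A∩B| / (|A| + |B|); defined as 1.0 when both masks are empty."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total

def iou(pred, target):
    """IoU = |A∩B| / |A∪B|; defined as 1.0 when both masks are empty."""
    intersection = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    return 1.0 if union == 0 else intersection / union

# Example: prediction covers 3 of the 4 target pixels plus 1 false positive.
pred   = [1, 1, 1, 1, 0, 0]
target = [0, 1, 1, 1, 1, 0]
print(dice_coefficient(pred, target))  # 2*3 / (4+4) = 0.75
print(iou(pred, target))               # 3 / 5 = 0.6
```

Note that Dice always weighs the intersection more heavily than IoU does, so for the same prediction the Dice score is greater than or equal to the IoU, which is why the two tasks above report numerically different ranges for comparable segmentation quality.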