Generative Properties of Image-to-Image Translation Methods
Abstract
Generative Adversarial Networks (GANs) have emerged as a powerful tool for image-to-image translation: they learn mappings between two image domains without the need for paired data while preserving the essential characteristics of the input. This thesis investigates the generative properties of GAN-based models, specifically CycleGAN, focusing on their application to gastrointestinal endoscopy images from the HyperKvasir dataset. The primary objective is to evaluate how well CycleGAN generates translations with diverse and realistic properties, such as varying polyp positions, sizes, and shapes. The thesis begins by outlining the significance of image augmentation in enhancing AI models while minimizing data collection costs.
Image-to-image translation, a class of computer vision tasks, involves learning a mapping from images in one domain to images in another. Traditional methods rely on paired data, which is often difficult to obtain; this limitation motivates the exploration of CycleGAN, which is trained on unpaired data. Through analysis of generative properties, including the distribution of polyp locations and sizes in generated images, CycleGAN is shown to produce accurate translations between healthy images and cancerous images with polyps.
This research generated synthetic image data that introduces variations on the original dataset, thereby enhancing the diversity of training data and improving model performance. Qualitative visual evaluation, together with quantitative metrics such as the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Fréchet Inception Distance (FID), provided comprehensive insight into model performance. The distribution of polyp locations and sizes within generated images was analyzed in detail; this analysis offers insight into the accuracy of the translations and confirms the model’s ability to generate varied and precise outputs similar to real medical images.
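As an illustration of one of the quantitative metrics named above, PSNR can be computed from the mean squared error between a reference image and a generated image as PSNR = 10 · log10(MAX² / MSE). The following is a minimal sketch only, not the evaluation code used in this thesis: the function name `psnr`, the nested-list grayscale image representation, and the default peak value of 255 (8-bit images) are assumptions made for the example.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as nested lists of pixel intensities (an assumption
    made for this sketch); max_value is the largest possible intensity.
    """
    same_rows = len(reference) == len(generated)
    if not same_rows or any(len(r) != len(g) for r, g in zip(reference, generated)):
        raise ValueError("images must have the same shape")

    n_pixels = sum(len(row) for row in reference)
    if n_pixels == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    squared_error = sum(
        (a - b) ** 2
        for ref_row, gen_row in zip(reference, generated)
        for a, b in zip(ref_row, gen_row)
    )
    mse = squared_error / n_pixels

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the generated image is closer, pixel by pixel, to the reference. Because PSNR compares images pixel-wise, it is usually read alongside SSIM and FID rather than on its own.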