SinGAN-Seg: Synthetic training data generation for medical image segmentation

Thambawita, Vajira L B; Salehi, Pegah; Sheshkal, Sajad Amouei; Hicks, Steven; Hammer, Hugo Lewi; Parasa, Sravanthi; de Lange, Thomas; Halvorsen, Pål; Riegler, Michael

dc.contributor.author	Thambawita, Vajira L B
dc.contributor.author	Salehi, Pegah
dc.contributor.author	Sheshkal, Sajad Amouei
dc.contributor.author	Hicks, Steven
dc.contributor.author	Hammer, Hugo Lewi
dc.contributor.author	Parasa, Sravanthi
dc.contributor.author	de Lange, Thomas
dc.contributor.author	Halvorsen, Pål
dc.contributor.author	Riegler, Michael
dc.date.accessioned	2022-12-22T14:36:56Z
dc.date.available	2022-12-22T14:36:56Z
dc.date.created	2022-10-10T13:02:31Z
dc.date.issued	2022-05-02
dc.identifier.citation	PLOS ONE. 2022, 17 (5), 1-24.	en_US
dc.identifier.issn	1932-6203
dc.identifier.uri	https://hdl.handle.net/11250/3039279
dc.description.abstract	Analyzing medical data to find abnormalities is a time-consuming and costly task, particularly for rare abnormalities, requiring tremendous effort from medical experts. Therefore, artificial intelligence has become a popular tool for the automatic processing of medical data, acting as a supportive tool for doctors. However, the machine learning models used to build these tools are highly dependent on the data used to train them. Large amounts of data can be difficult to obtain in medicine because of privacy constraints, expensive and time-consuming annotation, and a general lack of data samples for infrequent lesions. In this study, we present a novel synthetic data generation pipeline, called SinGAN-Seg, to produce synthetic medical images with corresponding masks using a single training image. Our method differs from traditional generative adversarial networks (GANs) because our model needs only a single image and the corresponding ground truth to train. We also show that the synthetic data generation pipeline can be used to produce alternative artificial segmentation datasets with corresponding ground truth masks when real datasets cannot be shared. The pipeline is evaluated using qualitative and quantitative comparisons between real data and synthetic data to show that the style transfer technique used in our pipeline significantly improves the quality of the generated data and that our method is better than other state-of-the-art GANs at preparing synthetic images when training datasets are small. By training UNet++ on both real data and the synthetic data generated by the SinGAN-Seg pipeline, we show that models trained on synthetic data perform very close to those trained on real data when both datasets contain a considerable amount of training data.
In contrast, we show that synthetic data generated by the SinGAN-Seg pipeline improves the performance of segmentation models when the training datasets do not contain a considerable amount of data. All experiments were performed using an open dataset, and the code is publicly available on GitHub.	en_US
dc.language.iso	eng	en_US
dc.publisher	Public Library of Science	en_US
dc.relation.ispartofseries	PLOS ONE;17(5): e0267976
dc.rights	Attribution 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/deed.no	*
dc.subject	Imaging techniques	en_US
dc.title	SinGAN-Seg: Synthetic training data generation for medical image segmentation	en_US
dc.type	Peer reviewed	en_US
dc.type	Journal article	en_US
dc.description.version	publishedVersion	en_US
dc.rights.holder	© 2022 Thambawita et al.	en_US
dc.source.articlenumber	e0267976	en_US
cristin.ispublished	true
cristin.fulltext	original
cristin.qualitycode	1
dc.identifier.doi	https://doi.org/10.1371/journal.pone.0267976
dc.identifier.cristin	2060056
dc.source.journal	PLOS ONE	en_US
dc.source.volume	17	en_US
dc.source.issue	5	en_US
dc.source.pagenumber	1-24	en_US

Associated file(s)

Filename: journal.pone.0267976.pdf
Size: 2.655Mb
Format: PDF

This item appears in the following collection(s)

Publications from Cristin [3269]
TKD - Institutt for informasjonsteknologi [945]
TKD - Department of Computer Science

Except where otherwise noted, this item is licensed under Attribution 4.0 International