DeepFake electrocardiograms using generative adversarial networks are the beginning of the end for privacy issues in medicine
Thambawita, Vajira; Isaksen, Jonas L.; Hicks, Steven A.; Ghouse, Jonas; Ahlberg, Gustav; Linneberg, Allan; Grarup, Niels; Ellervik, Christina; Olesen, Morten Salling; Hansen, Torben; Graff, Claus; Holstein-Rathlou, Niels-Henrik; Strümke, Inga; Hammer, Hugo L.; Maleckar, Mary M.; Halvorsen, Pål; Riegler, Michael A.; Kanters, Jørgen K.
Peer reviewed, Journal article
Original version: Scientific Reports. 2021, 11 (1). https://doi.org/10.1038/s41598-021-01295-2
Recent global developments underscore the prominent role of big data in modern medical science, but privacy issues remain a prevalent obstacle to collecting and sharing data between researchers. Synthetic data that carry similar information and follow a similar distribution to the real data may alleviate the privacy issue. In this study, we present generative adversarial networks (GANs) capable of generating realistic synthetic DeepFake 10-s 12-lead electrocardiograms (ECGs). We developed and compared two methods, named WaveGAN* and Pulse2Pulse. We trained the GANs with 7,233 real normal ECGs to produce 121,977 DeepFake normal ECGs. By verifying the ECGs with a commercial ECG interpretation program (MUSE 12SL, GE Healthcare), we demonstrate that the Pulse2Pulse GAN was superior to WaveGAN* at producing realistic ECGs. ECG intervals and amplitudes were similar between the DeepFake and real ECGs. Although these synthetic ECGs mimic the dataset used for their creation, they are not linked to any individuals and may thus be used freely. The synthetic dataset will be available as open access for researchers at OSF.io, and the DeepFake generator will be available at the Python Package Index (PyPI) for generating synthetic ECGs. In conclusion, we were able to generate realistic synthetic ECGs using generative adversarial neural networks trained on normal ECGs from two population studies, thereby addressing the relevant privacy issues in medical datasets.