Building Artificial Intelligence (AI) algorithms that perform adequately and can facilitate progress in all fields of clinical research requires large, high-quality datasets. In this context, one of the biggest challenges for researchers is gaining access to electrocardiographic (ECG) signals, especially in the case of rare diseases.
This is the case for the ongoing European project Cardio-MIPA Inherited Arrhythmogenic Diseases Monitoring, Identification, Prediction and Alert (CMIPA), which involves the collaboration of the Biomedical Signal Processing (BSP) research group, led by Francesca Faraci, of the Institute of Digital Technologies for Personalized Healthcare (MeDiTech), with the , the , the Dalle Molle Institute for Artificial Intelligence (IDSIA USI-SUPSI), and the companies WellD (CH) and L.I.F.E. (IT).
In this project, the need to focus on rare heart diseases and the resulting lack of data availability gave Giuliana Monachino and Beatrice Zanchi, Ph.D. students at MeDiTech, the opportunity to explore the topic of synthetic data generation with AI techniques.
The work identifies and examines the causes of this lack of data, which stem from the challenges faced when creating and sharing datasets (e.g., limited access to the patient population, strict rules for data sharing, and the presence of identifying features in the ECG that make it similar to a fingerprint).
Following this, the main characteristics of deep generative models (DGMs) are analyzed, and the potential and limitations of their application in this field are investigated. These algorithms have been shown to be able not only to generate large amounts of ECG signals but also to facilitate data anonymization, simplifying data sharing while respecting patients' privacy. The application of such algorithms could foster research progress and cooperation in the name of open science.