DataTools4Heart: A European Health Data Toolbox for Enhancing Cardiology Data Interoperability, Reusability and Privacy

Cardiovascular disease (CVD) remains the leading cause of mortality worldwide, accounting for about a third of annual deaths. Re-use of both structured and unstructured health data could bring major benefits to people living with CVD.

Healthcare data re-use in Europe faces privacy and fragmentation issues, a high diversity of data formats and languages, and a lack of technical and clinical interoperability. DataTools4Heart (DT4H) will tackle these challenges by developing a comprehensive, federated, privacy-preserving cardiology data toolbox. In a single integrated platform, the toolbox will include standardised data ingestion and harmonisation tools built on a common data model, multilingual natural language processing, federated machine learning, differentially private synthetic data generation, and 7 language models adapted to the cardiology domain. DT4H virtual assistants will help scientists and clinicians navigate large-scale, multi-source cardiology data.

These tools will be:

  • implemented ensuring privacy-by-design and thorough compliance with European regulations and data standards;
  • optimised based on multi-stakeholder, user-centred requirements; and
  • validated in 7 clinical sites across Europe.

DT4H will unlock health data that is currently inaccessible because it is held in unstructured form, and will allow multi-site federated data use. Together with its toolbox, DT4H will leave a legacy of a federated learning platform with an embedded metadata catalogue and AI virtual assistants, as well as the CardioSynth open database of synthetic data, which will remain available for further research and AI experimentation. Effective use of the federated learning platform will enable improved AI diagnostic and treatment tools. Deployment of regulated solutions will extend existing healthcare management paradigms to reduce disease burden. Finally, DT4H tools, systems and methodology are designed to be general, so they should translate well to other clinical and research areas in medicine.