CAMDA 2025 - ELSA Health Privacy Challenge

This repository is a "starter package" for the Health Privacy Challenge, which runs within the CAMDA 2025 conference. The Health Privacy Challenge is organized in the context of the European Lighthouse on Safe and Secure AI (ELSA, https://elsa-ai.eu).

The Health Privacy Challenge consists of two tracks:

Track I: Featuring Bulk RNA-seq

Track I runs in a “Blue Team (🫐) vs Red Team (🍅)” scheme.

The blue teams develop novel privacy-preserving generative methods that mitigate privacy risks while preserving biological insights for bulk gene expression datasets.
The red teams launch trustworthy and realistic membership inference attacks (MIA) against the blue teams’ solutions to assess whether these generative methods can withstand privacy attacks.
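
To make the red-team idea concrete, here is a minimal sketch of one simple kind of membership inference attack: scoring each candidate record by its distance to the closest synthetic sample. This is not the baseline attack shipped in this repository. The toy data, the deliberately leaky "generator", and all function names are assumptions made for illustration only.

```python
import math
import random


def min_distance(record, synthetic):
    """Euclidean distance from `record` to its nearest synthetic sample."""
    return min(math.dist(record, s) for s in synthetic)


def membership_scores(candidates, synthetic):
    """Higher score = closer to the synthetic data = more likely a member."""
    return [-min_distance(c, synthetic) for c in candidates]


def auc(scores, labels):
    """Probability that a random member outscores a random non-member."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy stand-in for expression profiles (hypothetical, 5 "genes" per sample).
rng = random.Random(0)
members = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(20)]
non_members = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(20)]

# A deliberately leaky "generator": synthetic samples are noisy copies of
# the training members, so the attack should succeed easily.
synthetic = [[x + rng.gauss(0, 0.05) for x in m] for m in members]

candidates = members + non_members
labels = [1] * len(members) + [0] * len(non_members)
scores = membership_scores(candidates, synthetic)
print(f"attack AUC: {auc(scores, labels):.2f}")
```

An AUC near 0.5 would mean the attack cannot tell members from non-members; a value near 1.0 indicates that the synthetic data leaks membership.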

Track II: Featuring Single-cell RNA-seq

Track II invites participants to explore the privacy and utility of synthetic single-cell gene expression (scRNA-seq) data. Participants are encouraged to:

investigate and reveal potential privacy risks linked to generating synthetic scRNA-seq datasets.
develop privacy-preserving generative methods that balance data privacy and utility.
propose novel evaluation metrics and strategies to assess both utility and privacy preservation in a multi-sample donor setting.

We are looking forward to engaging with you and working together to deepen our understanding of privacy in healthcare. 🤗

Introduction

This repository contains:

👩‍💻 Baseline code for generative methods (Blue Teams) and membership inference attack algorithms (Red Teams).
📝 Documentation that details setup and submission instructions for the competition.
📎 Submission templates to base your submissions on.

Other resources:

💬 CAMDA Health Privacy Challenge Google Groups: Join us for questions, discussions and further announcements.
🌐 CAMDA Challenge website: Follow CAMDA 2025 for conference announcements.
🌐 ELSA Benchmark method submission platform: The platform to register, to download datasets, and to submit your benchmark methods.
📚 Relevant papers: https://arxiv.org/abs/2402.04912

🎢 Get started!

Both teams, please check out Getting Started to set up and use the starter package!

Datasets

Datasets are available for download on the ELSA Benchmarks Competition platform after registering and signing the data download agreement.

Track I: Featuring bulk RNA-seq

We re-distribute pre-processed versions of two open-access TCGA RNA-seq datasets, available through the GDC portal:

TCGA-BRCA RNASeq

Dimensions: <1089 x 978>
Details: Suitable for cancer subtype prediction (5 subtypes)

TCGA COMBINED RNASeq (with 10 different cancer tissues)

Dimensions: <4323 x 978>
Details: Suitable for cancer tissue of origin prediction (10 tissues)

Navigate here for details about the pre-processing steps.
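
After downloading, it can be useful to confirm that a file matches the dimensions listed above. The following is a minimal sketch, assuming the data is stored as a CSV with one header row, one sample per row, and a single leading identifier column. The actual file format, file names, and any extra label columns may differ, so treat the layout and the dictionary keys here as hypothetical.

```python
import csv
import io

# Dimensions as listed in this README, read as (samples, genes).
EXPECTED_SHAPES = {
    "TCGA-BRCA": (1089, 978),
    "TCGA-COMBINED": (4323, 978),
}


def table_shape(handle, index_cols=1):
    """Return (n_rows, n_feature_columns) for a CSV with a header row.

    `index_cols` is the assumed number of leading non-feature columns.
    """
    reader = csv.reader(handle)
    header = next(reader)
    n_rows = sum(1 for _ in reader)
    return n_rows, len(header) - index_cols


def check_shape(name, shape):
    """Raise ValueError if `shape` differs from the documented dimensions."""
    expected = EXPECTED_SHAPES[name]
    if shape != expected:
        raise ValueError(f"{name}: expected {expected}, got {shape}")


# Demonstration on a tiny in-memory table (hypothetical column names).
toy = io.StringIO("sample_id,gene_a,gene_b\ns1,0.1,0.2\ns2,0.3,0.4\n")
print(table_shape(toy))  # (2, 2)
```

For a real file, pass an open file handle to `table_shape` and the result to `check_shape`, adjusting `index_cols` to the actual layout.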

Track II: Featuring single-cell RNA-seq

We re-distribute raw counts of the OneK1K single-cell RNA-seq dataset (https://onek1k.org/), a cohort containing 1.26 million peripheral blood mononuclear cells (PBMCs) from 981 donors, generously provided by Joseph Powell and the authors (Yazar et al., 2022) at the Garvan Institute of Medical Research.

Train dataset: <633711 cells from 490 donors x 25834 genes>
Test dataset: <634022 cells from 491 donors x 25834 genes>

Navigate to the Track II homepage for details about the pre-processing steps.
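
The donor counts of the two sets add up to the full cohort (490 + 491 = 981), which is consistent with a split made at the donor level, so that no donor contributes cells to both sets. The sketch below shows how such a split can be made. It is not the organizers' actual procedure, and the donor identifiers, seed, and function name are hypothetical.

```python
import random


def split_by_donor(cell_donors, train_fraction=0.5, seed=0):
    """Split cell indices so that each donor lands in exactly one set.

    `cell_donors` holds one donor identifier per cell.
    """
    donors = sorted(set(cell_donors))
    rng = random.Random(seed)
    rng.shuffle(donors)
    n_train = int(len(donors) * train_fraction)
    train_donors = set(donors[:n_train])
    train_idx = [i for i, d in enumerate(cell_donors) if d in train_donors]
    test_idx = [i for i, d in enumerate(cell_donors) if d not in train_donors]
    return train_idx, test_idx


# Toy cohort: 981 hypothetical donors with 3 cells each.
cell_donors = [f"donor_{d}" for d in range(981) for _ in range(3)]
train_idx, test_idx = split_by_donor(cell_donors)

train_donors = {cell_donors[i] for i in train_idx}
test_donors = {cell_donors[i] for i in test_idx}
print(len(train_donors), len(test_donors))  # 490 491
```

Splitting by donor rather than by cell matters for privacy evaluation in a multi-sample donor setting: if cells from one donor appeared in both sets, membership would no longer be well defined at the donor level.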

📅 Schedule

👥 Organization Team

This competition is designed as a collaborative effort between the European Molecular Biology Laboratory (EMBL), the CISPA Helmholtz Center for Information Security, and the University of Helsinki, with the support of the Barcelona Computer Vision Center (CVC), within the context of the ELSA Project.

EMBL: Hakime Öztürk, Julio Saez-Rodriguez and Oliver Stegle
CISPA: Tejumade Afonja, Ruta Binkyte and Mario Fritz
University of Helsinki: Joonas Jälkö and Antti Honkela

and in collaboration with the Saez-Rodriguez group in Track II and the review process:

University of Heidelberg: Sebastian Lobentanzer, Pablo R. Mier, Attila Gabor.

We also thank Katharina Mikulik (DKFZ), Kevin Domanegg (DKFZ), and Danai Vaigaki (EMBL) for helpful feedback.
