Health data is one of the NHS’s most powerful assets. Hidden in millions of patient records are insights that could help spot diseases earlier, design better treatments, and make healthcare more personal. But today, the journey from data to discovery is often slow. Researchers wait months for approvals, systems are tightly locked down, and few people see how their information makes a difference.
It’s time to change that.
A Smarter Way to Share: Progressive Data Layers
We’re building a new approach that combines innovation with trust: a model that lets the NHS share useful, realistic data without putting anyone’s privacy at risk.
What Is Publicly-Sourced Synthetic Data?
It’s not real patient data. Our synthetic data is completely made up and never linked to real people. The Data Matryoshka project is a DARE UK-funded initiative to create synthetic data from electronic health data held by UCLH through a safe, trustworthy and reproducible pipeline.
How does it work?
We start with the structure of a real dataset
Then we fill it using publicly available or approved population-level health stats
The result: realistic-looking, computer-generated data that contains no real patient records
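For technically minded readers, the steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the Data Matryoshka pipeline itself: the column names, the statistics and the function generate_synthetic_records are all hypothetical, and the percentages are made-up placeholders rather than real population figures. The sketch also samples each column independently, which is a simplification.

```python
import random

# Step 1: the structure of a dataset (column names and types only, no real values).
# These columns are hypothetical examples.
SCHEMA = {"age_band": "category", "sex": "category", "has_diabetes": "boolean"}

# Step 2: population-level statistics used to fill the structure.
# Illustrative placeholder proportions, NOT real published figures.
POPULATION_STATS = {
    "age_band": {"0-17": 0.21, "18-64": 0.60, "65+": 0.19},
    "sex": {"female": 0.51, "male": 0.49},
    "has_diabetes": {True: 0.07, False: 0.93},
}


def generate_synthetic_records(schema, stats, n, seed=0):
    """Draw n made-up records, sampling each column from its population proportions."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    records = []
    for _ in range(n):
        record = {}
        for column in schema:
            values = list(stats[column].keys())
            weights = list(stats[column].values())
            # Weighted random choice: no real patient record is ever read.
            record[column] = rng.choices(values, weights=weights, k=1)[0]
        records.append(record)
    return records


# Step 3: the result is a table of computer-generated rows.
records = generate_synthetic_records(SCHEMA, POPULATION_STATS, n=1000)
```

Because every value is drawn from aggregate proportions rather than copied from a patient record, no row corresponds to a real person. A production pipeline would also need to handle relationships between columns, which this sketch ignores.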
Want to learn more?
An Overview of Synthetic Data at UCLH
HDR UK: Intro to Synthetic Data
Reimagining How We Use NHS Data - Safely, Responsibly, and at Speed
We are passionate about sharing our policies and best practices with other institutions. Please get in touch at uclh.safehr@nhs.net for more information.
Interested in data?
Find out how to request data for research here.

