Health Informatics Collaborative (HIC) for Hearing Health

Creating a health informatics data resource for hearing health research

The aim of the Hearing theme is to make better use of patient health data captured in NHS records by collecting data in a standardised and integrated manner across UK sites and setting up a structure for sharing and analysing these data anonymously. These new assemblies of data support the development of more effective management strategies for individuals with hearing loss. The Hearing theme has developed the infrastructure and governance for managing and integrating these hearing health data. This allows for:
- Exploration of the effects of known and novel risk factors such as disease clustering for hearing loss.
- Identification of the genetic causes of hearing loss.
- Definition of hearing loss sub-types.
- Identification of candidates who would benefit from upcoming clinical trials.
- Optimisation of patient benefit from individualised treatment strategies.
Please contact any of the following people for more information:
- Mr Nish Mehta MBBS, PhD, FRCS (ORL-HNS)
  nishchaymehta@nhs.net
- Baptiste Briot-Ribeyre
  b.briotribeyre@nhs.net

Number of free-text documents extracted:

Clinical letters: 374,501
Clinical notes: 1,613,050
Imaging reports: 204,412
Histopathology reports: 811

Number of structured data items extracted:

Patients:
- Total (Auditbase patients): 100,775
- Identified in Epic: 61,237
Audiometric tests: 346,872
Hearing devices: 92,662
Drugs extracted: 2,031,834

Number of images extracted:

~10,000 images extracted

View Our Synthetic Data:

The information you see here is synthetic data. It’s not real data, does not contain real patient details, and cannot be traced back to any real patients. It is only designed to look like real health records.

We created these data to show the kind of information used in a research project called NIHR Health Informatics Collaborative - Hearing Health Database. This national initiative aims to transform UK hearing research and hearing care using routinely collected clinical data. The project is led by Nishchay Mehta, a Consultant ENT Surgeon, Otologist (super-specialised ear surgeon), and Auditory Implant surgeon (surgically implanting technological solutions when hearing aids are no longer sufficient). He specialises in the medical and surgical management of children’s and adults’ ENT problems.

Because this data is randomly generated using a tool called datafaker, some parts may not make sense. For example, a birth date might appear after a death date. This happens because the columns are generated separately and don’t always link together in a realistic way.
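As a rough illustration of this kind of inconsistency, the sketch below flags rows where the birth date falls after the death date. The column names (`patient_id`, `date_of_birth`, `date_of_death`) and the sample rows are hypothetical and made up for this example; the actual synthetic files may use different names and date formats.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows; column names and values are assumptions,
# not taken from the published synthetic files.
SAMPLE = """patient_id,date_of_birth,date_of_death
1,1950-03-14,2020-07-01
2,1988-11-30,1975-02-09
3,1972-06-21,
"""


def find_inconsistent_rows(csv_text):
    """Return the patient_id of each row whose birth date is after its death date."""
    bad = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["date_of_death"]:
            continue  # no death date recorded, nothing to compare
        born = date.fromisoformat(row["date_of_birth"])
        died = date.fromisoformat(row["date_of_death"])
        if born > died:
            bad.append(row["patient_id"])
    return bad


inconsistent = find_inconsistent_rows(SAMPLE)
print(inconsistent)
```

A check like this is only useful for spotting rows to ignore when exploring the demonstration files; it says nothing about the real data.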

Please note that this data is part of our preliminary version of synthetic datasets. We’re actively improving our process so that over time, more datasets will be available, and the data will look more and more like real-world data, without ever containing any real patient details.

This dataset is only for demonstration and learning purposes. Any similarity to real people is purely coincidental.

How to browse our synthetic data:

1) In the embedded table above, click the ‘view’ button next to the file you’d like to look at.

2) A new window will open on Figshare, where the file is stored. You will see a collection of tiles containing the file folder on the top half of the page, and a project description on the bottom half of the page.

3) To view the data in your web browser, click the ‘eye’ icon on your desired file tile.

4) The tabular data will display in your browser. You can expand the screen as needed using the double-headed arrow ‘full screen’ icon in the bottom right corner of the table.

5) To download the data, click the ‘download file’ icon on your desired file tile.

6) The files are in CSV format, which is like a simple version of an Excel spreadsheet.

Tip: Each row in the file is a ‘record’ (like a line in a spreadsheet), and each column is a type of information (like date, condition, or measurement).
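For readers who prefer to open a downloaded file in code rather than a spreadsheet, the row and column structure described above can be sketched in Python. This is a minimal example using only the standard library; the column names (`record_id`, `test_date`, `measurement`) and values are hypothetical stand-ins, and in practice you would open the downloaded file instead of the inline sample.

```python
import csv
import io

# Hypothetical stand-in for a downloaded CSV file; the column names
# and values are assumptions made for illustration only.
SAMPLE_CSV = """record_id,test_date,measurement
1,2021-01-05,35
2,2021-02-10,40
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
columns = reader.fieldnames  # one entry per column (type of information)
records = list(reader)       # one dictionary per row (record)

print("Columns:", columns)
for record in records:
    print(record)
```

To read a real file, replace `io.StringIO(SAMPLE_CSV)` with `open("your_file.csv", newline="")`, where `your_file.csv` is whatever name the downloaded file has.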
