A UK-wide public dialogue exploring what the public perceive as ‘public good’ use of data for research and statistics

Published:
4 October 2022
Last updated:
2 June 2023

Annex A: Glossary of terms

Accredited researcher: Someone who has been trained to carry out quantitative research and whose credentials have been approved by an independent body, the UK Statistics Authority. They are allowed to securely access de-identified unpublished data for a specific research purpose under the Digital Economy Act 2017 and the Statistics and Registration Service Act 2007.

Administrative data: Information created when people use public services, such as schools, hospitals, the courts, or the benefits system.

Data Access Committee: Data Access Committees evaluate applications from trained and accredited researchers for the use of de-identified data for research.

De-identified data: Data from which personal information, such as names and addresses, has been removed before it is shared with accredited researchers, so that the data does not directly identify individuals and is not reasonably likely to lead to an individual’s identity being ascertained (whether on its own or taken together with other information).

Statistics: Producing information from data. For example, the ages of everyone in a room are data, but using those ages to calculate an average age produces a statistic.
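As a rough illustration of the age example, the short Python sketch below turns a list of ages (data) into an average age (a statistic). The ages are made up for illustration and do not come from the dialogue.

```python
from statistics import mean

# Data: the ages of everyone in the room (made-up values for illustration).
ages = [23, 35, 41, 29, 52]

# Statistic: a single summary figure calculated from that data.
average_age = mean(ages)

print(f"Average age: {average_age}")
```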

Synthetic data: Synthetic data is a version of a dataset that uses made-up data rather than actual data, with fidelity to the original ranging from very low to very high. The made-up data is generated at random but is made to follow some of the patterns of the original dataset. Like any data, synthetic data can only be accessed with permission and with the right kinds of safeguards around it. If synthetic data is shared with researchers, it is on the understanding that it is not real data and is shared only to raise awareness of how the real data is structured, in support of training and engagement. It is never ‘passed off’ as real data.
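One simple way to picture low-fidelity synthetic data is sketched below in Python. It is an illustrative assumption, not the method used by any data owner: the function name `make_synthetic`, the column names and the example records are all hypothetical. Each value in a synthetic row is drawn at random from the values seen in the same column of the original, so the overall mix of values in each column is roughly preserved while the links between columns, and to any real person, are not.

```python
import random

def make_synthetic(records, n, seed=0):
    """Build n made-up rows from a list of dict records (hypothetical helper).

    Each column is sampled independently from the values observed in that
    column, so a synthetic row may combine values that never appeared
    together in the original data.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    columns = list(records[0].keys())
    return [
        {col: rng.choice(records)[col] for col in columns}
        for _ in range(n)
    ]

# Made-up "original" records, for illustration only.
original = [
    {"age_band": "18-29", "region": "Wales"},
    {"age_band": "30-44", "region": "Scotland"},
    {"age_band": "45-64", "region": "England"},
    {"age_band": "65+", "region": "Northern Ireland"},
]

synthetic = make_synthetic(original, n=5, seed=1)
for row in synthetic:
    print(row)
```

Higher-fidelity approaches would also try to preserve relationships between columns, which this sketch deliberately does not do.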
