This publication was updated in April 2022 to migrate the contents to HTML and improve accessibility.
“Data is more useful when more people can access and use it. It is most useful when it can be joined together. Data that is inaccessible – or where access takes so long it is rendered irrelevant – is of limited utility.” – Jeni Tennison, CEO of the Open Data Institute
The central purpose for all official statistics producers is serving the public good through the provision of data and statistics. This obligation is reflected in the principles of the Code of Practice for Statistics, which requires statistics producers to commit to, and to promote, safe onward access to the data used as the basis for producing official statistics. These data may include, for example, data from the census, population and business surveys, as well as administrative records.
This guidance is a companion to our guidance on data governance: building confidence in the handling and use of data, which supports data sharing for the public good. It is aimed at Heads of Profession for Statistics and analysts working in producer bodies with an interest in data linkage and sharing.
About this guide
We have written this guidance to increase awareness among statistics producers and users that the principles of the Code of Practice extend beyond statistics production to data sharing and access. We outline practices and processes that uphold these principles. Specific guidance about how to meet these expectations is signposted where available.
We have expectations of producers in three distinct areas:
- Data standards, quality and curation
- Data provision
- Developments to official statistics
Data standards, quality and curation
Documented, consistent, linkable, timely, curated, trackable, reproducible, explorable
Data provision
Communication and engagement, user-informed processes, transparent processes, appeal and redress, professional development
Developments to official statistics
Improving measures and methods, new insights, improving data quality
This guidance applies to:
- providing freely available data as well as more-controlled data (including in safe settings)
- official statistics producers who provide their data via other organisations and those operating their own data access services
- official statistics producers who are also data processors accredited under the Digital Economy Act 2017 (DEA)
Data can be supplied in different formats, with restrictions set according to the level of risk of reidentifying data subjects. Safeguarding measures are added accordingly, with the steps required increasing in line with the reidentification risk. Completely anonymous open data with no restrictions on users or uses sit at one end of this spectrum. At the other end are potentially identifiable data that can be accessed only via a terminal on an isolated network in a secure setting, for specific purposes, by approved researchers, with checks made of any analytic outputs. Between these two extremes there are still restrictions on uses, users and access settings, but these are less extensive. Data owners are responsible for deciding where on this spectrum their data sit, based on the legislation governing access to the data and the risk environment in which they operate. Sound data governance is underpinned by transparency about such decisions.
Responsibility for data governance and data supply can be split between organisations. For example, the Department for Education in England makes decisions about who can use the data it holds, while access is provided via the ONS Secure Research Service, and direct supply of data to users is permissible only in limited circumstances. The Welsh Government uses the Secure Anonymised Information Linkage (SAIL) Databank to manage access to much of its data. Access to many government surveys is managed by the UK Data Service on departments’ behalf.
The DEA created a legal gateway for sharing de-identified data from public bodies with researchers so that they can carry out analysis in the public interest, but it specifically does not include data from health and social care services. Before data can be shared for research purposes, it must be processed by an accredited processor so that the data is ‘de-identified’. Once the data has been de-identified, it can be made available to an accredited researcher in a secure environment, and the processor will ensure that any data (or any analysis based on the data) that is retained by the researcher or published is ‘disclosure controlled’ to minimise the risk of data subjects being re-identified or of other misuse of the data. The accreditation process for researchers, projects and data processors is overseen by the UK Statistics Authority, with decisions taken by an independent Research Accreditation Panel. Access is then provided by one of the accredited processors, whose locations span the UK (eight had been accredited as of July 2020). The DEA’s Research Code of Practice and Accreditation Criteria sets the standards for this framework.
This regulatory guidance complements the DEA Research Code of Practice. We set out our wider expectations of official statistics producers as providers of open or safeguarded data, some of whom will also be accredited processors under the DEA, but most will not.
The following sections outline our expectations of statistics producers. The Annex contains the specific sections of the Code of Practice for Statistics that producers should pay attention to when reviewing their approaches to data provision and access. We recognise that some of these expectations will be challenging to meet, especially those relating to new areas still undergoing development (for example, reproducibility in safe settings and synthetic data). We have included these more-challenging expectations to highlight areas where statistics producers can work to innovate and improve data provision in the long-term.