Processes: Update on recommendations

There has been mixed progress on our recommendations related to improving processes for applying for access to government data.

We have seen positive work in some areas. For example, UKSA and ADR UK have both published information on the Digital Economy Act (DEA, 2017) to help researchers wishing to access government data understand the relevant legislation.

Through discussions with stakeholders for this report, we have been made aware of additional process challenges. These include delays to output approvals by trusted research environments (TREs); barriers created by the requirement to publish all analyses enabled by the DEA (2017) Research power; and delays resulting from legal review of complex and non-standard data sharing agreements.

Academic and external organisation stakeholders have told us of frustrations with data owners and trusted research environments requiring review of outputs before publication, and of the impact this has on their ability to produce timely analysis. Some have found that output checking is taking longer than previously and that they are being asked to provide full reports where previously they were only required to share relevant sections.

The requirement to publish findings from analyses using government data has also been highlighted as a potential barrier to data sharing. Departments may be nervous not only about the reputational risk of negative findings but also that erroneous findings will be published from inappropriate analyses by analysts who lack a good understanding of the ways in which the data can and cannot be used. Government departments sharing data should provide detailed metadata and guidance on safe and unsafe uses of the data, and should support researchers throughout projects that use their data.

The Pan-UK Data Governance Steering Group was established by the UK Health Data Research Alliance, convened by HDR UK, with the aim of simplifying and streamlining the governance processes surrounding data access. One priority area it has identified is the delay caused by legal review of data access agreements, owing to the complexity of the contracts used across departments and the variation between them. To tackle this barrier, the Steering Group produced a template data access agreement (DAA). The template is intended for use where data are accessed in a TRE or one of the NHS’s Secure Data Environments (SDEs) for the purposes of research and development for the public good. The principles underpinning the DAA were developed through extensive consultation, and it is optimised for data science use across the UK. The aim of the DAA is to provide a familiar structure and terminology that builds trust among data owners, researchers and the public. HDR UK is coordinating this work and driving adoption among the network of those hosting TREs and SDEs in the UK. The creation of this resource is positive, and we encourage departments to engage with HDR UK and review the template to explore whether they could adopt it. More broadly, initiatives such as the DARE UK-supported TRE Community indicate increasing collaboration among TREs, which will aid the development of best practice and support projects that span multiple TREs.

To reflect the mixed progress made, stakeholder feedback and reflection from OSR on the suitability of our previous recommendations, several of the recommendations on processes have been updated.

Recommendation 9: Overview of Legislation

Original recommendation:

To help researchers understand the legislation relevant to data sharing and linkage and when it is appropriate to use each one, a single organisation in each nation should produce an overview of legislation that relates to data sharing, access and linkage, which explains when different pieces of legislation are relevant and where to find more information. This organisation does not need to be expert in all legislation but should be able to point people to those who are. The Office for Statistics Regulation (OSR) will help convene those in this space to understand more about who might be best placed to take this on.

Key findings

  • UKSA and ADR UK have published resources on relevant legislation for researchers wishing to access data under the Digital Economy Act (2017) and the Statistics and Registration Service Act (2007).
  • Gaps remain for researchers wishing to understand legislation relevant to health data access.

Summary of findings

Since our last report, several resources have been published that aim to improve researcher understanding of legislation relevant to data sharing and linkage. We have been told about the following resources:

  • The UK Statistics Authority (UKSA) has published an online resource providing answers to frequently asked questions about the Digital Economy Act (DEA, 2017) Research power and the Statistics and Registration Service Act (SRSA, 2007). This resource briefly covers what data can be accessed via the Research power and SRSA, information about accredited processing environments, and who can access data and for what purposes.
  • ADR UK has launched an online Learning Hub that includes information on the DEA (2017) for researchers wishing to access administrative data under this legislation. It includes a slide deck produced by the UKSA that explains what the DEA 2017 Research power allows for and contains a visual map of the data access journey.

While these resources are positive examples of progress, gaps remain in the information available, for instance for researchers wishing to access health data. Although the DEA (2017) provides a legal basis for health data access in certain circumstances, access to health data is often governed by alternative legal routes.

We continue to hear contrasting views on whether legislation itself is a barrier to data sharing, or whether the barrier is created by misinterpretation of the legislation, the way it has been operationalised in policies and procedures, and general nervousness around data sharing. The independent review of the UKSA, published in March 2024, found that while “often cited as an excuse for not sharing, the legislative framework is in fact enabling.” Nonetheless, the conflation of government and academic research requirements in primary legislation continues to be cited by some in government as limiting interdepartmental sharing and access conditions, through the risk of reidentifying bodies corporate. A comprehensive overview of legislation as described in our original recommendation is unlikely to address this conflict.

To reflect the resources that are now available, and following the recognition that an overview of legislation as described in our original recommendation may not meet researcher needs, we have closed this recommendation. Instead, a key priority is ensuring communication and clarity around the implementation of legislation and what it means for those wanting to access data. This is discussed and reflected in our Revised Recommendation 11: Clarity and Communication.



Recommendation 10: Broader use cases for data

Original recommendation:

To support re-use of data where appropriate, those creating data sharing agreements should consider whether restricting data access to a specific use case is essential or whether researchers could be allowed to explore other beneficial use cases, aiming to broaden the use case where possible.

Key findings

  • Requiring narrow use cases for data continues to be a barrier to effective data sharing, and there is some support for allowing broader use cases when appropriate.
  • Views continue to differ on what is permissible under current legislation, and clarity is needed about when broader use cases would be justified. We have heard examples of data access being approved for broader research themes during the COVID-19 pandemic.
  • Decisions about allowing broader use cases for data should consider ethical as well as legislative and practical issues.
  • Updating the Five Safes Framework to change ‘safe projects’ to ‘safe programmes’ may facilitate and encourage a broadening of use cases for data by those creating data sharing agreements. The Research Accreditation Panel at UKSA committed to considering this proposal at its June 2024 strategic workshop.

Summary of findings

In our engagement for this follow-up report, we heard general support from several stakeholders for broadening use cases for data. At the same time, requiring highly specific use cases continues to be a barrier to data access, research and the use of data in policymaking. Concerns that data use cases are too tightly defined to enable the use of data in policy development are particularly relevant to the success of the Integrated Data Service (IDS).

In February 2024, our Director General for Regulation, Ed Humpherson, published a blog post on The success and potential evolution of the 5 Safes model of data access, discussing the Five Safes Framework of data sharing and setting out the case for a shift of focus from ‘safe projects’ to ‘safe programmes’. This proposal aims to enable those overseeing accreditation and data access to approve access for broad areas of research rather than a tightly defined research question. The blog post recognised that benefits could include a more efficient and flexible system, in which researchers would not be required to define in detail the specific variables and analysis plans they intend to use. It would also remove the need for lengthy reapplications to answer related research questions.

Conversely, there are specific concerns about the recommendation to broaden use cases for data and the proposal to move to safe programmes, including from some stakeholders who thought approving broader use would not align with current legislation. There are concerns that safe programmes would not be in the spirit of the security focus of the Five Safes Framework, and one stakeholder raised ethical concerns about sharing data for broad justifications. Others, however, assess that legislation does allow for broader use cases, and that it is instead the interpretation of legislative constraints that has led to a narrowing of use cases. Different organisations use the terms ‘project’ and ‘programme’ differently, and in some cases, such as during the COVID-19 pandemic, projects with broader research questions have already been approved. These varied views demonstrate a clear and continuing lack of agreement across stakeholders about what is currently allowable. Clarity is needed on whether broader use cases can be considered ‘safe’ and when they would be justified.

The Research Accreditation Panel at UKSA held a strategic workshop in which it considered the framework for accreditation under the DEA (2017) and discussed the proposal to change to safe programmes. As part of this work, we encourage UKSA to clarify what is legal and practical under the current legislative framework.

Adopting this recommendation has the potential to make the process of accessing data more efficient and less burdensome for researchers and data owners, and to better enable the use of government data in research and policymaking. A change to safe programmes may be a route to enabling broader use cases, highlighting the relevance of a review of the Five Safes (Recommendation 3). There are outstanding questions around the legal and ethical implications of such a change, and assurance is needed around when a broader use case is acceptable. An important consideration will be the requirement for public engagement on any broader use cases for data. We have amended this recommendation to suggest the UKSA take a lead on this in line with their work on the Five Safes Framework.

Revised Recommendation 10: Broader use cases for data

To support re-use of data where appropriate, data owners, those overseeing accreditation and access to data held in secure environments, and those creating data sharing agreements should consider whether restricting data access to a specific use case is essential or whether researchers could be allowed to explore other beneficial use cases, aiming to broaden the use case wherever possible. Given the UK Statistics Authority’s commitment to consider such a change and the overlap with Recommendation 3 (The Five Safes Framework), we think it is well placed to take a lead on this proposal.



Recommendation 11: Communication

Original recommendation:

To ensure data application processes are fit for purpose and well understood, those overseeing accreditation and access to data held in secure environments should prioritise ongoing communication with users, data owners and the public to explain and refine the information required. Wherever possible, they should offer face-to-face or virtual discussions with those applying to access data early in the process, to ensure clarity around both the data required and the process to access it.

Key findings

  • A lack of clarity continues around the legal basis for information required from those applying for data access, and researchers have not received sufficient assurance.
  • Concerns remain that data owners and those responsible for accrediting decisions are creating unnecessary barriers to data access by requiring information beyond the legislative need.
  • Data owners and those responsible for accrediting decisions should assure themselves that their policies and procedures align with legislative requirements, making changes where possible to reduce the burden on applicants.

Summary of findings

For the previous report, OSR was told that communication between those overseeing accreditation and access to data and researchers is often challenging. Researchers experience long delays in receiving feedback on applications and report a lack of clarity around why some questions are asked, such as those requiring specific details of planned statistical methodology. Communication should be prioritised to refine and explain the information required.

We continue to hear about similar difficulties with communication. Researchers using one secure research environment have reported significant increases in wait times for data access and output approval without any clear explanation for the delays. We consider that TREs can do more to communicate with researchers about wait times for data access and output approval. TREs and accreditors should consider collecting and publishing metrics on the time taken for each stage of the accreditation, output checking and data access process. This would not only aid transparency, but also help researchers to better gauge and plan timelines for their projects.

Concerns also continue to be raised about the details researchers are asked to provide when applying for research project accreditation from UKSA. We have again been told that requiring researchers to provide such detailed information on analysis plans creates a barrier, since such questions can be hard to answer before researchers have explored and developed an understanding of the data.

UKSA told us that the information requested is determined by what legislation requires, ensuring projects can be assessed against the principles and conditions in the statutory Research Code of Practice and Accreditation Criteria. Research project application guidance from UKSA advises that methodological information is required as the suitability of analysis plans is relevant to determining whether a project has potential to serve the public interest.

Data owners and those responsible for accrediting decisions should ensure they are not creating unnecessary barriers to data access. They should be transparent about the legal basis for the information they require from those applying for data access. They should assure themselves that their policies and procedures align with legislative requirements, making changes where possible to reduce the burden of requiring specific information on applicants. We hope that the Research Accreditation Panel’s strategic workshop is the start of open discussion on this topic. Those processing data requests should provide greater clarity to researchers on why certain details are requested, with reference to relevant parts of the legislation and the Research Code of Practice and Accreditation Criteria.

UKSA is working with the Integrated Data Service (IDS) to provide pop-up information to researchers completing data access applications. This is a promising step forward, and we encourage UKSA and the IDS to ensure sufficient detail is provided, with links to additional resources should a researcher wish to know more. UKSA could also update its research project application guidance to provide more clarity on why it requires information on specific elements of methodology.

We continue to see an ongoing need for improved communication, with no significant progress having been made against the recommendation. Recognising the continued lack of explanation given to researchers around why certain details are required in project applications, we have amended the recommendation to take into account a need for reflection and review by those overseeing accreditation and access to data.

Revised Recommendation 11: Clarity and Communication

To ensure data application processes are fit for purpose and well understood, those overseeing accreditation and access to data held in secure environments should prioritise ongoing communication with users, data owners and the public to explain and refine the information required. This communication should include transparency as to what the data will be used for. Those overseeing accreditation and access to data, including the UKSA, should aim to reduce the administrative burden on applicants as much as possible, assuring themselves that their policies and procedures align with legislative requirements. Wherever possible, they should offer face-to-face or virtual discussions with those applying to access data early in the process to ensure clarity around both the data required and the process to access it.



Recommendation 12: Checklists

Original recommendation:

To ensure all necessary teams are involved at the outset of a data sharing and linking project, organisations should consider the use of a checklist for those initiating data sharing. The checklist should contain all contacts and teams within their organisation who need to be consulted to avoid last minute delays.

Key findings

  • Government departments need clarity across their organisation about which teams handle data access requests.
  • Departments should consider having a single team responsible for requesting data from other departments and for coordinating requests made to their own department.

Summary of findings

Bringing together all the relevant people and teams within an organisation to facilitate the sharing of data is challenging. In addition, researchers can struggle to find the right person to speak to about a dataset or about access processes. Often, there is a lack of clarity within government departments as to who the relevant person or team is. As such, we previously recommended the use of checklists by organisations to ensure all relevant parties are involved and informed at the outset of a data sharing and linkage project.

Through engagement for this follow-up report, we found that a related challenge for some government departments is that they sometimes receive multiple requests for similar data from different teams within a single requesting department. This demonstrates a lack of communication and coordination around data needs within organisations. Departments should consider having a mechanism for coordinating data requests across all business areas, to improve efficiency and reduce demands on the departments to which requests are made.

To better understand these challenges and how they can be tackled, OSR needs to engage more directly with data owners and researchers in the future. On reflection, our specific focus on checklists may not be useful; instead, departments should take relevant action, which may include the use of a checklist, to ensure that the right people are involved and identified from the start of a data sharing and linkage project. To reflect this recognition, we have closed this recommendation.



Recommendation 13: Transparency

Original recommendation:

Every organisation within government should be transparent about how the data they hold can be accessed and the process to follow. This guidance should be presented clearly and be available in the public domain with a support inbox or service for questions relating to the process.

Key findings

  • There are many examples of public bodies publishing information about what data they hold, how data can be accessed and the process to follow.
  • The Central Digital and Data Office (CDDO) is developing a data marketplace that should drastically improve the discoverability of data held across government, but we would encourage the CDDO to make this resource publicly available to support external researchers.

Summary of findings

Our previous report recommended that government organisations need to be transparent about how the data they hold can be accessed and the process to follow. We have seen progress against this recommendation, with several positive examples of work in this area:

  • The CDDO is developing a data marketplace to improve the discoverability of data within government. The marketplace provides a central facility allowing those within government to find out what data are held and how data can be accessed. The data marketplace supports data discovery through the adoption of consistent metadata standards. It will provide a managed catalogue of sharable resources to support departments in being able to promote data that can be shared. It will also standardise the process by which data sharing can be agreed. This resource could provide significant benefits. The CDDO should prioritise making this resource publicly available to improve transparency and ensure external researchers are also able to make use of data held by government. Furthermore, we support a long-term approach to this project in which resources are allocated to enable sustained activity.
  • ADR UK has created a searchable public metadata catalogue that contains a large amount of information about the datasets held across the ADR UK partnership. The catalogue includes a webpage for each dataset with contact details and links to information on how to access the data, as well as a description of the dataset, information on coverage and, for some datasets, downloadable metadata.
  • HDR UK has created a searchable public metadata catalogue containing information on over 850 different health-related datasets across the UK. As with the ADR UK catalogue, the Health Data Research Gateway includes a webpage for each dataset with contact details and links to information on how to access the data, as well as a description of the dataset, information on coverage and, for some datasets, downloadable metadata documentation. The Gateway integrates with the Researcher Registry, which serves as a data use register (implementing the Pan-UK Data Governance Steering Group transparency standard discussed below), and also implements the Data Access Request Form Standard (for data custodians who have adopted the standard).
  • The Pan-UK Data Governance Steering Group was established by the UK Health Data Research Alliance, convened by HDR UK, with the aim of simplifying and streamlining data access governance processes. One of its priority areas is improving the transparency of processes for accessing health and health-relevant data for research. Transparent and clear information about the safe and secure access to and use of health data enables researchers to navigate data access processes and helps build and maintain public trust. The Steering Group co-developed and published Transparency Standards with HDR UK’s Public Advisory Board (PAB) to guide good practice. These standards highlight how the principles of transparency can be met by publishing open access data use registers. With support from the Medical Research Council, in 2023, 19 organisations were awarded funding to adopt the Transparency Standards. The outputs from these awards are published in Vol 9: Conference proceedings for the UK Health Data Research Alliance Transparency Showcase.
  • The MoJ Data First Programme is an ambitious project with the aim of unlocking the insight stored within administrative datasets across the justice system. The MoJ has published clear information on what data are available, explained how researchers can apply for access and provided contact information so queries can be directed towards relevant teams.
  • Research Data Scotland runs a Researcher Access Service for those wishing to access public data in Scotland. It publishes extensive information to support this service, including a data access overview describing the stages of applying for data access – from discovering what data are available to receiving access – with links to relevant resources.

These initiatives are reassuring, but transparency around data access remains variable across government departments. Stakeholders told us that transparency has recently become a lower priority for some departments, following a push to publish data catalogues as open data several years ago.

ONS’s Secure Research Service (SRS) metadata catalogue is publicly available, so both the public and researchers can see what data are being made available for research use. However, ONS plans to decommission the SRS once the IDS takes over. At present, researchers must be accredited to access the IDS metadata catalogue. If this is still the case when the IDS takes over, it could result in a reduction in public transparency about how data are being used.

Stakeholders informed us that some departments are hesitant to make information public about what data they hold, as this may have implications for resources and reputational risk. For instance, departments may be concerned that transparency will lead to increased requests for data, or for data removal. Conversely, many departments are happy for information on their data to be made public. Departments that have had positive experiences of being transparent about the data they hold should share their learning across government.

We continue to encourage all departments to be open about what data they hold and how they can be accessed. Departments should explicitly state on their statistical webpages what data they hold and what process should be followed to gain access to them.

Following feedback, we recognise that some government organisations may themselves have a poor understanding of the data they hold and have updated the recommendation to reflect this.

Revised Recommendation 13: Transparency

Every organisation within government should be transparent about what data it holds, the potential uses of those data, how the data can be accessed and the process to follow. This guidance should be presented clearly and be available in the public domain with a support inbox or service for questions relating to the process. Given its work developing a data marketplace, we consider the CDDO to be well placed to take a lead on encouraging and supporting organisations to implement these recommendations.



Recommendation 14: Funding Structure

Original recommendation:

To allow every organisation a consistent funding stream for their projects, a centralised government funding structure for data collaboration projects across government, such as the Shared Outcomes Fund, should be maintained and expanded.

Key findings

  • Sufficient resourcing remains a barrier to efficient, timely and cost-effective data sharing by government departments.
  • A centralised government funding structure for data collaboration projects across government would benefit system-wide approaches to sharing.
  • Investment focussed on access-based developments, as well as specific sharing initiatives, will aid a sustainable approach to collaboration.

Summary of findings

Our last report recommended a centralised government funding structure to enable greater opportunities for data collaboration projects across government. While specific initiatives are improving the culture around data sharing, successful funding remains highly dependent on the priorities and vision within each department, and resourcing remains problematic. Expensive IT systems, which often need to be developed to enable specific linkage projects, can be a barrier to cost-effective collaboration. Preparing datasets for sharing can be costly, time-consuming and resource-intensive. As part of the ongoing progress of data sharing and linkage in government, we expect departments holding key data assets to be sufficiently funded to provide these services without burdening those who require access to data. We are therefore updating our recommendation to include a call for additional resources in more areas related to data sharing and linkage. We also consider that the requirement for an enhanced centralised funding structure, likely coordinated through the Department for Science, Innovation and Technology (DSIT), should remain. Finally, we see an opportunity for the IDS, as a cross-government service, to help overcome the costly technological constraints on linkage experienced by individual departments.

Our previous report recommended maintaining the Shared Outcomes Fund, which had funded programmes such as the Ministry of Justice’s Better Outcomes through Linked Data (BOLD) initiative. We were pleased to see that funding has been allocated to a cross-government linkage project in Round 3, as announced in November 2023. The Refugee Integration Outcomes initiative will create an anonymised dataset of refugee integration outcomes based on linking Home Office refugee data to census and cross-government administrative data.

However, funding individual projects in isolation could deepen fragmentation in the system. As such, in addition to investing in collaborative projects, senior leaders across government should look for opportunities to fund strategic system-level development in their areas of responsibility. By prioritising an access-based approach that focuses investment in technical development (including IT systems), data cataloguing, upskilling and public engagement, departments would be better placed to then engage in specific data sharing initiatives in the future in a more sustainable manner. Political leadership grounded in a comprehensive understanding of the benefits and requirements of sharing and linking data is crucial to achieving a consistent and sustained funding stream.

BOLD, which is led by the Ministry of Justice (MoJ), is a data-linking programme that aims to improve the connectedness of government data in England and Wales. The programme was created to demonstrate how people with complex needs can be better supported by linking and improving the government data held on them in a safe and secure way. A cross-government, system-level development, BOLD uses pseudonymised data from the MoJ, the Department of Health and Social Care, the Department for Levelling Up, Housing and Communities, Public Health Wales and the Welsh Government. Pseudonymisation is a technique that replaces or removes information in a dataset that identifies an individual. Privacy, legal requirements and robust ethical standards are at the heart of BOLD’s design and ethos. BOLD consists of four data and analysis pilot projects: reducing homelessness, supporting victims of crime, reducing substance misuse and reducing reoffending. Funding for the initiative was provided by the Shared Outcomes Fund.
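To illustrate the general technique in simple terms, the sketch below shows one common approach to pseudonymisation: replacing a direct identifier with a keyed hash, so that records about the same person can still be linked across datasets without the shared data revealing who that person is. This is a minimal illustrative example only; the field names and key are hypothetical and do not represent BOLD’s actual implementation.

```python
import hashlib
import hmac

# Hypothetical secret key, held only by the party performing linkage.
SECRET_KEY = b"example-key-not-for-production"

def pseudonymise(record, identifier_fields):
    """Replace direct identifiers with keyed hashes. Records sharing the
    same identifier produce the same pseudonym, so they remain linkable,
    but the identifier cannot be recovered without the key."""
    out = dict(record)
    for field in identifier_fields:
        value = str(out.pop(field)).strip().lower()
        # HMAC-SHA256 yields a stable pseudonym; without the key it cannot
        # be reversed or recomputed from a guessed identifier.
        out[field + "_pseudonym"] = hmac.new(
            SECRET_KEY, value.encode(), hashlib.sha256
        ).hexdigest()
    return out

record = {"nino": "QQ123456C", "offence_date": "2021-04-01"}
print(pseudonymise(record, ["nino"]))
```

In practice, schemes of this kind add further safeguards (for example, separating the linkage key from analysts and removing indirect identifiers), but the core idea of a consistent, non-reversible substitute for the identifier is as above.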

Revised Recommendation 14: Funding Structure and Resourcing

To allow every organisation a consistent and sustainable funding stream for their projects, a centralised government funding structure for data collaboration projects across government should be established. This structure should prioritise a system-level, access-based approach to investment, as well as continue and expand initiatives such as the Shared Outcomes Fund. Senior leaders should ensure there are sufficient resources allocated to developing data sharing and linkage capabilities in their own departments.

