Introduction
Data sharing and linkage for the public good
Every day, government organisations generate data that have the potential to serve the public good. These data can hold the key to understanding and answering society’s most pressing questions. Within government, data can inform the delivery of vital public services, policy development and evaluation, and can help answer valuable research questions. Beyond government, data can be a powerful tool that enables organisations and individuals to hold the government to account and to make their own decisions.
When data are shared and linked across government, this potential is magnified, enriching insights into society, stimulating innovation and ultimately enabling data, and government, to better serve the public good. Opening up access to data beyond government can significantly increase the analytical capacity to use them for the public good, whether by feeding evidence back into government or by allowing organisations to make their own decisions.
There are powerful examples that illustrate the value of sharing and linking data across multiple sectors. For example, the Office for National Statistics (ONS) recently published statistics on sociodemographic inequalities in suicides. ONS linked demographic and socioeconomic data about individuals from the 2011 Census with death registration data and, for the first time, was able to show estimates for rates of suicide across a wide range of different demographic groups. ONS believes this analysis will support the development of more effective suicide prevention strategies. Further examples come from Data First, an ambitious data-linking programme led by the Ministry of Justice (MoJ) and funded by Administrative Data Research UK (ADR UK) (see Box 1). Data First aims to unlock the potential of MoJ data by linking administrative datasets from across the justice system and enabling accredited researchers, from within government and academia, to access the data. Data First is also enhancing the linking of justice data with data from other government departments, such as the Department for Education, where linking data has unlocked a wealth of information for researchers about young people who interact with the criminal justice system. Also in the education space, ADR Northern Ireland has recently launched the Education Outcomes Linkage (EOL) 2018/19, a longitudinal database comprising post-primary schools’ data in Northern Ireland, delivered in partnership with the Department of Education and the Department for the Economy in Northern Ireland. A key feature of this project has been prioritising stakeholder and researcher engagement from design to completion, helping to ensure that EOL can achieve its goal of driving policy-focused research. Finally, the pandemic also provided examples of data sharing and linkage that improved public understanding of the differential impacts of COVID-19 on various population groups.
All these initiatives demonstrate how data sharing and linkage can deliver insights that enable the design of policies that better serve vulnerable groups in society. Viewed from a different angle, by considering what would have been lost had they not been possible, they also illustrate the enormous cost of missed opportunities when data are not shared or linked, especially when preparing to respond to a national crisis.
What do we mean by data sharing and linkage?
The concept of data sharing is relatively straightforward: data normally created in one business area or organisation move to another. Within this report, we also talk about data access, which reflects the fact that a lot of data sharing across government is now achieved via organisations contributing data to IT platforms, such as databases or modern cloud repositories. These platforms can then enable access for multiple other organisations, removing the need for the source organisation to share the data repeatedly itself. Data sharing and data access often rely on organisations having a common purpose and arrangements, such as an agreement to share data.
Data linkage is the process of joining datasets together. Data that are shared between organisations are often shared with the intention of linking them to further datasets to enhance or improve the data. Data sharing and data linkage are often considered together in this report but, where distinctions exist, these will be made clear. Both data sharing and data linkage come with challenges, which this report will explore.
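As a purely illustrative sketch of what joining datasets can mean in practice, the short Python example below links two small sets of records on a shared identifier. The records are invented, and the field names (`person_id`, `age_group`, `year`) and the function `link_records` are assumptions made for this illustration; they do not describe any dataset or method used by the organisations mentioned in this report. The example shows only exact matching on a common key. Real linkage projects typically work with de-identified data and may need approximate matching where no shared identifier exists.

```python
# Illustrative only: invented records and hypothetical field names.
census = [
    {"person_id": "A1", "age_group": "25-34"},
    {"person_id": "B2", "age_group": "35-44"},
    {"person_id": "C3", "age_group": "45-54"},
]
registrations = [
    {"person_id": "B2", "year": 2015},
    {"person_id": "D4", "year": 2016},
]


def link_records(left, right, key):
    """Join two lists of records on an exact match of `key`.

    Returns (linked, unlinked): merged records for every match, and
    the records from `left` that found no partner in `right`.
    """
    # Index the right-hand dataset by the linkage key for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    linked, unlinked = [], []
    for row in left:
        matches = index.get(row[key], [])
        if matches:
            for match in matches:
                # Combine the fields from both sources into one record.
                linked.append({**row, **match})
        else:
            unlinked.append(row)
    return linked, unlinked


linked, unlinked = link_records(census, registrations, "person_id")
print(linked)
print(len(unlinked), "census records had no match")
```

In this sketch only one record appears in both sources, so the linked output holds a single enriched record, while the unmatched records are kept separately so that the coverage of the linkage can be assessed.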
Why is OSR reporting on data sharing and linkage now?
At the Office for Statistics Regulation (OSR) we see the immense value of data sharing and linkage for decision makers and the wider public. As the independent regulator of the UK’s statistical system, OSR is an advocate and a champion for data sharing and linkage, when this is done in a secure way that maintains public trust. It is our ambition that sharing and linking datasets, and using them for research and evaluation, will become the norm across the UK statistical system.
OSR has a unique perspective on sharing and linking data: our vision that statistics should serve the public good means that we focus, in a way that others may not, on the availability of data, and of analysis that draws on linked data, for individuals and organisations working outside government. We see a role for OSR as a champion of wider value, ensuring opportunities and benefits from linked data are realised by groups beyond government and the public sector, such as academic researchers, so that they can better serve the public good.
In 2018 we published our report Joining Up Data. We identified six key outcomes necessary to achieve a safe and effective data linkage system, underpinned by Trustworthiness, Quality and Value, the three pillars of our Code of Practice for Statistics. In 2019, we published an update report in which we were able to identify progress towards achieving those six key outcomes in several areas.
Since then, there have been several notable changes within the data sharing landscape that have helped to accelerate developments within the statistical system. For example, the powers given to the statistical system via the Digital Economy Act 2017 are now more embedded and have helped to unlock and facilitate access to data for sharing and linkage. In addition, the Integrated Data Service (IDS) (see Box 2) is being developed as a cross-government and researcher service, allowing coordinated and secure access to data for the public good. Meanwhile, strong collaboration between the UK statistical system and ADR UK has supported linkage and sharing of administrative datasets within and across organisations in all four UK nations and is helping to make them available to accredited researchers within and beyond government in a safe and secure way.
The coronavirus (COVID-19) pandemic has also had a huge impact on the data sharing and linkage landscape. In response to the desire to answer societal questions concerning COVID-19, the statistical system showed an agile, collaborative and willing approach to sharing and linking data on many topics, including aspects of health, crime, income and housing, spreading across both the private and public sectors. During this unprecedented period, the willingness to share data was driven by a common purpose – the desire to help vulnerable people and ultimately to save lives.
Finally, there is growing evidence that people in the UK want, and expect, data to be used, provided this is done securely and transparently. Some members of the public expect that their data are already being shared and linked within the public sector for the public good. This position is explored by Data and Analytics Research Environments UK (DARE UK) in its blog ‘Trustworthiness of sensitive data research is about more than just privacy and security’.
While there has been some excellent progress in creating linked datasets and making them available for research, analysis and statistics, data sharing and linkage within the government sector now stands at a crossroads. Despite progress, we know there remain areas of challenge around sharing and linkage, and around wider access to data for researchers outside government; there is also a lack of awareness of, and uncertainty about, the public’s attitudes to and confidence in data sharing and linkage. These and other areas of challenge have been highlighted by organisations both within and beyond government, including the Social Mobility Commission and the Institute for Government respectively.
This report
This report focuses on how we can empower government to prioritise data sharing and linkage for research purposes, enabling greater data sharing and linkage for the public good. It takes stock of the data sharing and linkage being done for research across government and points the way to build on recent successes and confront the more ingrained challenges.
To inform our position we spoke to stakeholders from across the public sector, including government departments, cross-government linkage projects, trusted research environments (TREs), devolved administrations, data partnerships and government researchers. We explored current barriers to data sharing and linkage from their perspectives and sought to understand opportunities and hopes for the future. Full discussion of our methodology is given in Annex A.
In Chapter 1 we discuss the findings of our interviews with stakeholders, which were conducted between September 2022 and January 2023. This includes the barriers and opportunities that exist in this area, examples of success stories and what can be learnt from them, and further areas of interest that we deem important to understanding the landscape of data sharing and linkage. We make 16 recommendations that, if realised, would enable greater data sharing and linkage for the public good.
Chapter 2 looks to the future of data sharing and linkage in government and presents four possible ‘future scenarios’ for data sharing and linkage, set five years from now. We illustrate how our recommendations from Chapter 1 can indicate a pathway to data sharing and linkage for the public good.
Information Box 1: Administrative Data Research UK (ADR UK)
ADR UK is a UK data partnership, funded by UK Research and Innovation (UKRI), with a mission to transform the way researchers access the UK’s wealth of public sector data, to enable better informed policy decisions that improve people’s lives.
The partnership is coordinated by a UK-wide Strategic Hub, which manages a dedicated fund for commissioning research using newly linked administrative data. The Strategic Hub, along with the other ADR partnerships, also promotes the benefits of administrative data research to the public and the wider research community and engages with governments of the UK to secure access to data.
Information Box 2: The Integrated Data Service (IDS)
The IDS is a cross-government project, led by the Office for National Statistics (ONS). It builds on the ONS Secure Research Service, which has been providing secure access to de-identified, unpublished data to accredited researchers for over 15 years. The IDS is a central platform that provides access to data and to analytical and visual tools in a secure multi-cloud infrastructure. It aims to be the single data analysis and dissemination platform within government by providing secure and co-ordinated access to a range of high-quality data for government analysts, devolved administrations and external accredited researchers.
In March 2023, IDS entered its Public Beta phase, which marks a step forward in achieving the vision of bringing together ready-to-use data for the public good by expanding the IDS user base and functionality, and offering additional data for analysis, on a safe and secure platform. Data assets available through the platform are listed on the IDS website.