Data sharing and data linking serve the public good by providing greater insight and helping to tell richer stories. One of the OSR’s ambitions is that in 2025, the statistical system will be based around linked datasets. Sharing and linking datasets, and using them for research and evaluation, will no longer be the exception. It will be the norm.
Over the past year, collaboration and data sharing across the statistical and wider analytical system have been a great strength. Collaboration involves working jointly and is one of the cross-cutting themes of the Code of Practice. The pandemic has highlighted the need to work together to really serve the public good – for example, the need for health data producers to share data to understand the impact of the pandemic on public health and the operations of the NHS. In the wider statistical system, sharing data has helped the public and decision-makers understand the real-time impact of the pandemic on areas such as employment and the economy.
Case study 4: Earnings and employment from Pay As You Earn Real Time Information
During the pandemic, ONS and HMRC accelerated their plans to develop Pay As You Earn (PAYE) Real Time Information (RTI) estimates of employment and earnings. Earnings and employment from PAYE RTI is now a joint monthly experimental release that draws on HMRC’s PAYE RTI system. Because that system covers all payrolled employees, it allows for more detailed estimates of employees than a sample-based approach, as well as providing information on pay, sector, age and geographic location.
Rather than waiting until the development work has been completed, the statistics are being published now to involve potential users in their development. An RTI Labour Market Statistics Steering Group was set up to provide feedback and input into the continuing development of the statistics. This group includes a range of internal and external users.
At the start of the pandemic, all face-to-face interviewing for the Labour Force Survey (LFS) was suspended and the survey switched to telephone-only interviewing. Over time it became clear that these changes had altered the non-response bias of the survey. The PAYE RTI estimates became an important source of information for understanding the impact of the pandemic on the labour market across the UK.
The proactive collaboration and data sharing between ONS and HMRC have continued beyond the initial publication of these statistics. In March 2021, following a decrease in foreign-born workers appearing in the LFS data, ONS published analysis using the HMRC PAYE RTI data to investigate the validity of changes in the LFS data. It found much smaller changes in the number of non-UK nationals in the RTI data than in the LFS data. In July 2021, the LFS will be reweighted, making use of information from the payroll tax system to provide population weights.
Collaboration, data sharing and data linkage across these shared datasets can really provide insight. For example, during the pandemic, linking data from the census with mortality data helped to improve public understanding of the differential impacts of COVID-19 on various population groups (see case study 5). By increasing data sharing and data linkage, ONS has set a good precedent for other producers, and for future work, in considering what can be done to provide greater insight for statistics users.
Case study 5: COVID-19 related deaths by ethnic group for England and Wales
The Office for National Statistics (ONS) published COVID-19 related deaths by ethnic group for England and Wales in May 2020. By linking 2011 Census data to death registration records, ONS was able to analyse deaths by self-reported ethnicity and take account of demographic, social and geographic characteristics also associated with the risk of infection and death, thus providing greater insight into the impacts of the pandemic on different ethnic groups.
In October 2020, ONS extended its analyses to include measures of comorbidity retrieved from hospital records covering the past three years. For this updated publication, ONS used a unique dataset that linked Census 2011 records, death registrations in England and Wales, and Hospital Episode Statistics (HES). It built on knowledge gained from previous research to investigate the possibility that the distribution of certain pre-existing health conditions across ethnic groups might account for the disparities in COVID-19 mortality between ethnic groups that were originally observed, even after adjusting for geographic, demographic and socioeconomic factors.
In an additional update in May 2021, ONS compared deaths in different ethnic groups between the first and second waves of the pandemic. By linking 2011 Census data to the General Practice Extraction Service Data for Pandemic Planning and Research, ONS was able to assess the extent to which the increased risk of COVID-19 mortality in some ethnic groups is explained by differences in the prevalence of certain pre-existing health conditions, which are known to increase the risk of dying from COVID-19.
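The basic idea of linkage described in this case study can be illustrated with a minimal sketch. It is not ONS's method: the records, the person_id key and the field names below are all hypothetical, and the sketch computes only crude rates, omitting the adjustment for demographic, social and geographic characteristics that the published analyses applied.

```python
# Toy illustration only: every record, identifier and field name is made up.
census = [
    {"person_id": 1, "ethnic_group": "A"},
    {"person_id": 2, "ethnic_group": "A"},
    {"person_id": 3, "ethnic_group": "B"},
    {"person_id": 4, "ethnic_group": "B"},
    {"person_id": 5, "ethnic_group": "B"},
    {"person_id": 6, "ethnic_group": "B"},
]
deaths = [
    {"person_id": 2, "covid_related": True},
    {"person_id": 3, "covid_related": False},
    {"person_id": 5, "covid_related": True},
]


def link_and_summarise(census_records, death_records):
    """Link deaths to census records on a shared identifier and return
    the crude COVID-19 related death rate for each ethnic group."""
    # Identifiers of people whose death was registered as COVID-19 related.
    covid_death_ids = {
        d["person_id"] for d in death_records if d["covid_related"]
    }

    totals = {}        # people counted in the census, per group
    covid_deaths = {}  # linked COVID-19 related deaths, per group
    for person in census_records:
        group = person["ethnic_group"]
        totals[group] = totals.get(group, 0) + 1
        if person["person_id"] in covid_death_ids:
            covid_deaths[group] = covid_deaths.get(group, 0) + 1

    return {g: covid_deaths.get(g, 0) / totals[g] for g in totals}


rates = link_and_summarise(census, deaths)
```

The death records alone carry no ethnicity information here, so rates by group can only be produced once the two sources are joined. That is the kind of insight the linked datasets made possible.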
However, this improved collaboration and data sharing has not been without its challenges. For example, public health, social care and hospital administrative systems are often not connected to one another, which makes it time-consuming to collate the data and can result in duplication of work or data gaps. The urgency of a pandemic has shown what is possible, but overcoming these barriers may be more challenging in future without this shared sense of purpose. We will be publishing a report on the lessons learnt from the pandemic for health and social care statistics, which will cover these issues in more detail.
Looking to the future
Updating the IT infrastructure and data governance to make it possible to share information in an even more efficient way will be important. In our report Unlocking the value of data through onward sharing we provide guidance for statistical producers on how to share data and provide access to data in line with the Code of Practice. Our report gives specific examples of how other producers have approached data sharing and provides links to networks to help with this. In the coming year we will be reviewing the impact of data linkage on the statistical system to understand progress since our report in 2019.
Ultimately, we believe that sharing and linking data and research in a secure and ethical way can really add value and deliver public good, and that the future statistical system should build on the momentum of the last year. The past year is a great example of what can be achieved and why data sharing and linkage are so important. The UK statistical system must also seek to collaborate beyond statistical and analytical boundaries, working with digital and data colleagues, to maximise the value of analytical datasets.