Data gaps: no need to be daunted

We are, apparently, living in a world awash with data. Data are available in ever greater quantity and with ever greater frequency. Yet alongside this abundance, there is a realisation that data gaps still exist. “What is the impact of the COVID-19 pandemic on people, households and communities in Great Britain?” This is a question that has been asked a lot over the last 6 months, and the UK’s government statisticians have been answering it. In doing so, they have been filling a data gap. Yet addressing data gaps can perhaps sound a bit daunting to statistical producers.

At OSR, much of our work involves uncovering and helping statisticians focus on addressing data gaps. Whether it’s statistics about adult social care, or post-16 education and skills, our work often highlights ways in which user needs are not being met by current statistical products.

This has long been the case. What’s new is that we are starting to pull together insight from all our activities into the question of data gaps.

It’s early days for this cross-cutting analysis. Our intention in this blog is to start the process of demystifying data gaps, building on our growing understanding of how different statistical producers have addressed them.

Raise your awareness

To make sense of data gaps, first producers must become aware of them. This awareness is not always easy for people who are responsible for collecting data inside a system. It’s easy to focus on the data that you have, and how to make use of it. Yet data gaps are, by definition, about the data that you don’t have.

This is where user feedback is so crucial. We have found that data gaps are largely identified by groups of users or social discourse, rather than by those responsible for compiling the statistics. Gaps in information can also be thrown open by external events. For example, the COVID-19 pandemic and the Grenfell Tower tragedy both stimulated demand for new statistics, creating gaps that didn’t previously exist.

Understand the nature of the gap

The next important step is to understand the nature of the gap. Is it a problem with the way that data are defined, collected, or presented – so that there is an intrinsic weakness in the dataset? If so, it might take substantial commitment and time to address, and so the most important thing for a producer to do immediately is to highlight the gap and be honest about the limitations of the data. For example, in England and Wales, deaths should be registered within five days of the death occurring, but there are some situations that result in the registration of the death being delayed, a limitation that statistics based on registrations need to make clear.

Address the gap

Beyond awareness and understanding, producers have shown that they can address gaps – they can provide data and analysis to fill in the gaps in people’s understanding. ONS has done this repeatedly during the pandemic by publishing statistics from a new household survey to estimate the number of current positive cases of COVID-19 in the community, as have statistics producers in Wales, Scotland, and Northern Ireland.

We are still building up our case examples of what works to address gaps, but one striking early conclusion is the importance of collaboration between different producers. Over recent years, we have been pleased to report repeatedly that departments are innovating in data collection or processing methods to fill gaps, for example by web scraping (ONS using scanner data for estimating CPIH), linking data (ONS developing the UN Sustainable Development Goals) or sharing data (Ministry of Justice developing person-level records). Producers also show continuous innovation to plug data gaps, for example by publishing experimental statistics and management information and voluntarily applying the Code of Practice.

This is just the start of our work on data gaps. We want to build a much richer evidence base on where gaps arise, and the crucial steps taken by those who successfully address them.

But one thing is clear to us: where statistical producers show openness and appetite to improve statistics, listen closely to their users and work together, they have a much better chance of addressing gaps than if they prefer to hunker down into their existing data, analysis and ways of doing things.

With the right leadership approach, and an open mindset, there’s no need to be daunted by data gaps.

Closing data gaps: understanding the impact of COVID-19 on income

In recent weeks, you may have spoken with friends and family who’ve seen their income and living standards impacted in some way by COVID-19. They may have been furloughed and are concerned about whether they will have a job to return to or perhaps they have experienced a reduction in business if they are self-employed. Maybe your own household is receiving less income and you are struggling to juggle household costs with home schooling.

Despite the UK starting to ease the lockdown measures it introduced in response to COVID-19, the impact of this pandemic on the labour market and people’s livelihoods is expected to continue for some time. We are already seeing signs of the scale of the impact on the labour market, from vacancies at a record low in May to new claims to Universal Credit passing 2.5 million between March and June. The Office for National Statistics (ONS) recently brought forward the launch of its online Labour Market Survey to help provide the necessary insight into the impact of COVID-19 on people’s employment and working patterns.

There is a range of data which can help us understand how jobs and employment have been affected but we need better data on income and earnings to fully understand the narrative of how people’s livelihoods and living standards are being affected by the pandemic. A recent Opinions and Lifestyle Survey by the ONS found that half of the self-employed reported a loss of household income, compared with 22% of employees, in the month of April. Last year, we wrote to the ONS, Department for Work and Pensions and HMRC to restate the importance of delivering the insights identified in our work on the Coherence and Accessibility of Official Statistics on Income and Earnings. Whilst some progress has been made since our findings were published in 2014, it has been slow to date and more work needs to be done to help users understand the dynamics of the labour market and to address key data gaps in relation to income and earnings.

We have recently carried out work to look at examples of data gaps being addressed in the statistical system. Our work found three common themes in successful cases of solving data gaps: sufficient resource (whether new or restructured), high user demand and strong statistical leadership. The combination of new user demand for information on income and earnings that has emerged from COVID-19, restructured resource that has been put in place to respond to this demand, and the potential for statistical leadership to shine, could be the catalyst for solving these data gaps.

Improving the storytelling of income and earnings and addressing the data gaps identified by OSR could help users better understand the lived experience of households and different employment types throughout the pandemic. These are difficult times for many people from all walks of life and people are facing lots of unknowns. It is important that we can understand the true scale of the impact so that when the UK begins its recovery from the pandemic, support can be targeted effectively towards the groups most severely affected. There are two areas in particular in which solving data gaps could improve our understanding of COVID-19.

Household-level data is not keeping pace with data on individuals

Household measures of income and earnings have traditionally been less timely than measures for individuals and this formed a key area of our findings in the work highlighted above. With respect to COVID-19, there is interest in understanding how the Government’s income support measures have impacted income for different household types such as those with children or lone parent households. Even in households which are not receiving any income support, people may have had to adapt their working patterns to share the responsibility of childcare which may lead to one or both of the earners in a household working reduced hours on potentially reduced pay. HMRC has published data which shows that 9.1 million jobs had been furloughed by mid-June but we won’t see any contextual data about the impact on households until 2022 in the Family Resources Survey. We hope the relevant statistical teams explore new ways to deliver this insight in the meantime.

There is a lot we don’t know about the world of the self-employed and business owners

It is notoriously difficult to capture information on the income and earnings of the self-employed or those who own businesses. This is because many earn less than the taxable allowance, so are not captured in statistics relating to income tax, and many don’t have predictable earnings, so we don’t know what they’ll earn until well after the year end. The surveys which do manage to collect information on the self-employed are less timely than those for employees. When the Chancellor announced the Self-Employment Income Support Scheme, it quickly emerged that more people would need the support than originally anticipated and that the eligibility criteria would need to be adjusted to reflect the various ways that the self-employed can pay themselves. Improving the timeliness and completeness of information on the income of the self-employed could help identify groups of individuals who currently fall through the gaps of eligibility for the income support schemes in place.