We are, apparently, living in a world awash with data. Data are available in ever greater quantity and with ever greater frequency. Yet alongside this abundance, there is a realisation that data gaps still exist. “What is the impact of the COVID-19 pandemic on people, households and communities in Great Britain?” This is a question that has been asked a lot over the last 6 months, and the UK’s government statisticians have been answering it. In doing so, they have been filling a data gap. Yet addressing data gaps can perhaps sound a bit daunting to statistical producers.

At OSR, much of our work involves uncovering and helping statisticians focus on addressing data gaps. Whether it’s statistics about adult social care, or post-16 education and skills, our work often highlights ways in which user needs are not being met by current statistical products.

This has long been the case. What’s new is that we are starting to pull together insight from all our activities into the question of data gaps.

It’s early days for this cross-cutting analysis. Our intention in this blog is to start the process of demystifying data gaps, building on our growing understanding of how different statistical producers have addressed them.

Raise your awareness

To make sense of data gaps, producers must first become aware of them. This awareness does not always come easily to people who are responsible for collecting data inside a system. It’s easy to focus on the data that you have, and how to make use of it. Yet data gaps are, by definition, about the data that you don’t have.

This is where user feedback is so crucial. We have found that data gaps are largely identified by groups of users or through social discourse, rather than by those responsible for compiling the statistics. Gaps in information can also be thrown open by external events. For example, the COVID-19 pandemic and the Grenfell Tower tragedy have both stimulated demand for new statistics, creating gaps that didn’t previously exist.

Understand the nature of the gap

The next important step is to understand the nature of the gap. Is it a problem with the way that data are defined, collected, or presented – so that there is an intrinsic weakness in the dataset? If so, it might take substantial commitment and time to address, and so the most important thing for a producer to do immediately is to highlight the gap and be honest about the limitations of the data. For example, in England and Wales, deaths should be registered within five days of the death occurring, but there are some situations that result in the registration of the death being delayed[1]. Statistics based on registration date can therefore lag behind the deaths that have actually occurred, a limitation that producers need to explain clearly to users.

Address the gap

Beyond awareness and understanding, producers have shown that they can address gaps – they can provide data and analysis to fill in the gaps in people’s understanding. ONS has done this repeatedly during the pandemic by publishing statistics from a new household survey to estimate the number of current positive cases of COVID-19 in the community[2], as have statistics producers in Wales[3], Scotland[4] and Northern Ireland[5].

We are still building up our case examples of what works to address gaps, but one striking early conclusion is the importance of collaboration between different producers. In recent years, we have been pleased to report repeatedly that departments are innovating in data collection or processing methods to fill gaps, e.g. web scraping (ONS using scanner data for estimating CPIH), linking data (ONS developing UK data for the UN Sustainable Development Goals) or sharing data (Ministry of Justice developing person-level records[6]). Producers also show continuous innovation to plug data gaps, e.g. publishing experimental statistics and management information and voluntarily applying the Code of Practice[7].

This is just the start of our work on data gaps. We want to build a much richer evidence base on where gaps arise, and the crucial steps taken by those who successfully address them.

But one thing is clear to us: where statistical producers show openness and appetite to improve statistics, listen closely to their users and work together, they have a much better chance of addressing gaps than if they hunker down with their existing data, analysis and ways of doing things.

With the right leadership approach, and an open mindset, there’s no need to be daunted by data gaps.



[3] https://gov.wales/nhs-activity-and-capacity-during-coronavirus-covid-19-pandemic-10-september-2020

[4] https://www.publichealthscotland.scot/our-areas-of-work/sharing-our-data-and-intelligence/coronavirus-covid-19-data-and-guidance/

[5] https://www.health-ni.gov.uk/covid-19-statistics

[6] https://osr.statisticsauthority.gov.uk/wouldnt-it-be-cool-if/

[7] https://code.statisticsauthority.gov.uk/list-of-voluntary-adopters/