Thinking about quality when producing statistics

“Quality means doing it right when no one is looking.” – Henry Ford


Official statistics inform government, the media and the public about the issues that matter most in society. To feel confident using official statistics, people must trust them: quality has an important part to play in earning this trust.

In April, we published a review of the quality of HMRC’s official statistics. HMRC invited us to carry out this review after identifying a significant error in one of its published National Statistics. The review provided an independent assessment of HMRC’s quality management approach and identified improvements to strengthen the quality of their official statistics.

We made nine recommendations, which HMRC has welcomed. Many of the recommendations will apply to other producers – not just to strengthen the quality of official statistics, but also to improve the quality of all analytical outputs.

This blog tells the story of the review and its findings, from the perspectives of HMRC and OSR. We hope to inspire other producers to think about how they can build on their own approach to quality, to ensure statistics meet the needs of the people who use them.

Jackie Orme, Programme Lead, HMRC

In 2019 HMRC identified an error in published corporation tax receipt statistics, which led to us having to make substantial revisions. This was a serious concern both internally for HMRC and for external users of HMRC statistics. In response we undertook a number of actions, including initiating an internal audit review and inviting OSR to review the principles and processes underpinning production of our official statistics.

The review by OSR was particularly important to us as statisticians and analysts in HMRC, to draw on expert and independent advice in improving our ways of working. While some of the findings could potentially be uncomfortable, the review would support our desire to take a broad and ambitious approach to improvement and the weight of OSR’s views and advice would give credence to the need for change.

The review was carried out efficiently and we were kept well-informed about progress. The OSR review team devoted lots of time to talking to staff and stakeholders to get their input and views, across all grades and professions. This level of involvement has been helpful to us subsequently in securing initial engagement and agreement to changes across the organisation: for example, in getting active support from senior HMRC leaders to implement recommendations, such as creating a new cross-cutting team as part of our analysis function to build on our existing approach to data quality and assurance.

The review has given us the opportunity to reflect on data quality issues and the importance of having robust data to produce high quality statistics and analysis. We have built a substantial programme of work to implement the recommendations and are starting to recruit people to the new team. Some recommendations will be straightforward to implement. For example, we have already started to review our statistics outputs, in order to make sure analytical resource is being used effectively.

In contrast, other recommendations are more challenging to implement, in particular, mapping the journeys of our data within the department. This will take significant combined effort by analysts, data providers and data processors.

As highlighted in the report, HMRC has some older systems for processing and storing its administrative data and the review has been helpful in emphasising how essential it is for analysts to be involved in discussions and decisions around the design of future systems. These sorts of insights from the report have helped us build a case for increased resource and forge stronger links with data providers, to work together to improve the quality of HMRC’s statistics and analysis.

Helen Miller-Bakewell, Project Manager, OSR

We were really pleased when HMRC asked us to do this review: in doing so, it showed a proactive and open approach to strengthening the quality of its official statistics.

It’s the first time we’ve done a piece of work that looks across all of a producer’s official statistics at once – although we have now done something similar with the Defra Group (the Department for Environment, Food and Rural Affairs and its agencies and public bodies), with a focus on user engagement. Normally, we look at one set of statistics in detail, or we review how statistics on a topic area come together to meet user needs. This was somewhere in the middle!

To inform the review, we spoke with a wide range of people involved in the production of official statistics in HMRC: analysts working on the statistics directly, managers who oversee them and a handful of people indirectly involved in the production process, who own and supply data.

The OSR team spent about an hour with each individual or team we interviewed, during which we asked lots of questions about the production process. This helped us to understand how the quality of statistical outputs was managed in HMRC, and the challenges analysts can face.

It turned out to be a useful process for the producer teams as well, and we were asked for our question list a couple of times, to help them think about the quality of their statistics in the future. We’ve now packaged up this question list in a published guidance document, so that all producers can benefit from it.

The findings of the review highlight the issues that big operational departments working with administrative data can face with respect to quality and will ring true for other Government departments. The recommendations stress the importance of analysts fully understanding the nature and quality of data they are working with, and of building effective working relationships with data providers or managers to facilitate this.

In addition, OSR champions a broad approach to quality assurance of data and statistics, and regular reviews of publications to ensure analytical resource is being used effectively. The report emphasises the importance of having analytical leaders that champion and support changes and innovations that can enhance quality, while recognising that analysts do not operate in isolation and that long-term improvements to quality management rely on understanding, values and responsibility being shared across organisations.

We’re pleased the review has been so helpful to HMRC. We would like to thank everyone who gave their time to speak with us during the review. Their cooperation and openness were key to us arriving at findings that resonate with analysts working in HMRC and recommendations that will have a lasting positive impact on the quality of HMRC statistics.

Rising to the challenge

One thing that has stood out in the Coronavirus pandemic is how quickly the Government Statistical System (GSS) has responded to the need for data to support important decisions made by the government during this time. In just a matter of weeks, statisticians have repurposed existing statistics and have made use of new data sources. Before the crisis, this work might have taken months. HMRC’s Coronavirus Job Retention Scheme (CJRS) statistics and its Self-Employment Income Support Scheme (SEISS) statistics are among these. We recently conducted a rapid review of these statistics and Ed Humpherson, Director General for Regulation, has written to HMRC’s Head of Profession for Statistics, Sean Whellams, supporting the approach taken to produce these statistics.

In March this year, we wrote about how our assessment of employment and jobs statistics in effect captured the statistical world in miniature: a microcosm of the statistical system. The assessment surfaced many of the common issues that statistics producers face, highlighting recurring themes from our other regulatory work. Now we see a further glimpse of our statistical world in miniature, through the lens of our recent review of HMRC’s statistics. HMRC’s response to the need for information about these crucial schemes admirably demonstrates government statisticians rising to the key challenge of the times.

There are two aspects of the statistical system which these statistics exemplify.

First, for us as statistical regulators, whether during a national crisis or otherwise, government statistics should (i) answer society’s questions in a timely way; and (ii) provide insights at a level of detail to be useful to users. Additionally, many questions cannot be answered without sharing and linking data. In the preparation, production and publishing of the CJRS and SEISS statistics, HMRC has displayed all of these elements.

Naturally, society’s questions have been about the take-up of the schemes and the costs of the job protection schemes to the public purse. These interests reflect two essential aspects of the Government’s job protection schemes: speed and simplicity of support. CJRS was launched on 20 April and SEISS on 13 May. Initially HMRC tweeted daily information on both schemes. For CJRS, this was information about the number of unique applicants and the number of furloughed jobs. For SEISS, HMRC gave information about the total number of claims. HMRC also tweeted the value of the claims received in respect of both schemes. As claims to the schemes started to tail off, HMRC moved to tweeting the data on a weekly basis. Releasing these statistics on a timely basis and at intervals that met the needs of users is a good example of the orderly release of statistics essential to building trust in new statistics.

After just a few weeks of tweeting this important data, HMRC linked both the CJRS and the SEISS data with other pre-existing HMRC administrative data to provide further insights into CJRS claims by employer size (for SEISS, breakdowns of claims by age and gender) and breakdowns for both schemes by sector of the economy, and by geography. These new statistical breakdowns were published in statistical bulletins released less than two months after the launch of the CJRS and under a month after the launch of SEISS: quite remarkable achievements.

Second, we found HMRC to be working closely with users of this data to find out what they need to know from the statistics. HMRC is open with users about any aspects of uncertainty in its estimates, labelling the statistics and analysis with a frank description of what the statistics summarise. Consistent and coherent standard geographic definitions are adopted to harmonise with related statistics and data. In explaining the methods used to produce the statistics, HMRC has been proportionate to the complexity of the methods themselves, reflecting the needs of different types of users and uses.

In a further example of a statistical system working well, we as statistical regulators look for statistical producers to go beyond making statistics and data available fast and also to present clear, meaningful, and authoritative insights that serve the public good. There are many examples of how HMRC has done this but, to select just one, in the SEISS bulletin HMRC set out the numbers of self-employed people across the UK who are eligible for support. This information is published not only at the UK level but also in accompanying tables at regional and sub-regional (local authority and parliamentary constituency) levels. As we pointed out in our Monitoring Review of Income and Earnings in 2015, timely data on self-employment income is a key data gap generally. These statistics help us understand just a little bit more about the income and earnings of the self-employed at different locations around the UK in 2020 and are a step towards addressing this data gap.

Normally when statistics producers publish new or innovative statistics, they have a period in which they can develop and improve the statistics. HMRC will only have a restricted period in which to do this for its CJRS and SEISS statistics. By their nature, these schemes are temporary and we will probably only see a small number of releases. Quite how the statistics may change, for example with the advent of the Coronavirus Job Retention Bonus Scheme, is yet to be established. Society will pose further questions for statistics producers to answer.

The times we live in call for ongoing watchfulness and a continuing need to be agile and responsive, to offer even further insight in the months and years ahead. What we found in this case study is replicated throughout the statistical system. With continuing watchfulness there’s little doubt that there will be further statistics, data and insights to help us get to the other side of this pandemic.