Questions to guide thinking about quality
Statistics should be produced to a level of quality that meets users’ needs, and quality assurance (QA) should be proportionate to the nature of the quality issues and the importance of the statistics in serving the public good.
In this guide we provide questions that analysts producing statistics can use to consider quality at each stage of production. The list is adapted from a series of questions we asked teams during our review of the principles and processes underpinning the quality of HMRC’s official statistics, and draws on the QAAD questions, which help producers find out about administrative data. This quality guide is not a checklist: it is designed to be used alongside your own organisation’s guidance, as well as the external resources listed at the end. GSS guidance on best statistical practice for quality assurance is available on the Policy and Guidance Hub.
Quality is one of the three pillars of the Code of Practice for Statistics. It means that statistics fit their intended uses, are based on appropriate data and methods, and are not materially misleading. It requires skilled professional judgement about collecting, preparing, analysing and publishing statistics and data in ways that meet the needs of people who want to use the statistics.
Understanding the production process
- What are the steps in your statistical production process, from acquiring the data to final statistics? Can you map out the “data journey”? Why is it done in this way?
- Where are the highest risk points for errors in the process? What measures do you or could you take to mitigate risk at these points?
- How much time does your statistics output take to produce and how is this time split between data collection, analysis, report preparation and QA? Does the current balance feel effective?
- How do you know that the statistics output is ready? Who is responsible for final sign-off?
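Mapping the “data journey” need not be a diagram alone. One hypothetical way to make it explicit, if your output is produced with code, is to name each step of the pipeline in one place so the process map and the process itself cannot drift apart. The step names and functions below are illustrative assumptions, not a prescribed structure:

```python
# A minimal sketch of a named pipeline. Each step is a small, inspectable
# function; the PIPELINE list doubles as a map of the production process.

def receive(data):   return data                      # acquire input data
def clean(data):     return [d for d in data if d is not None]
def analyse(data):   return sum(data) / len(data)     # placeholder analysis
def quality_check(result):                            # final QA gate
    assert result >= 0, "negative average needs investigation"
    return result

PIPELINE = [receive, clean, analyse, quality_check]

def run(data):
    """Pass the data through every step of the journey, in order."""
    for step in PIPELINE:
        data = step(data)
    return data

print(run([3, None, 5, 4]))  # → 4.0
```

Listing the steps this way also makes it easier to answer the next question: a reviewer can see at a glance where manual intervention or risky transformations sit in the journey.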
Tools used during the production process
- What analytical tools do you use during the production process? Are they the best for the job?
- How many manual steps are there in the process (e.g. updating cells in spreadsheets, moving data between software or copy-paste steps)? Could these be reduced to minimise the risk of error?
- Are any parts of the publication process automated? If so, how do you ensure that these are correct and can be inspected and understood by other staff or new members in the team?
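Where parts of the process are automated, a lightweight way to ensure they remain correct and inspectable is to build simple consistency checks that run after each automated stage. The example below is a hypothetical sketch (the function name, figures and tolerance are illustrative), showing a check that component figures reconcile with a published total:

```python
# Hypothetical automated check: do the component figures sum to the
# published total? A small tolerance allows for rounding in the inputs.

def check_totals(breakdown: dict, published_total: float,
                 tolerance: float = 0.5) -> bool:
    """Return True if the components reconcile with the published total."""
    return abs(sum(breakdown.values()) - published_total) <= tolerance

regional = {"North": 120.0, "Midlands": 95.5, "South": 210.2}
assert check_totals(regional, 425.7)   # components reconcile
```

Checks like this are short enough for another member of the team to read and verify, which helps keep automated steps understandable rather than opaque.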
Receiving and understanding input data
- When and how do you communicate with your data provider(s)?
- Does your data provider have a good understanding of how and why you are using their data?
- Is there a formal agreement in place that specifies when, what and how the data will be received? If not, do you think this would be helpful?
- Do you know what quality checks are carried out on the data before you receive them?
- How do you work with your data provider when your data requirements change?
- How do you know if your data provider makes a change to their systems or processes that could affect the data you receive and/or the statistics you produce?
- What are the strengths and limitations of the data used in your publication? Are these communicated to people using your statistics?
Quality assurance of the analysis
- What do you feel is done well with regard to QA in your team? What could be better?
- How do you ensure that input data are correct and in the expected structure and format?
- How do you assure yourselves that the analysis carried out is correct?
- If you find anomalies or unusual trends in the data, what steps are taken to investigate them?
- Is your code or analysis ever peer reviewed by someone outside your team or organisation?
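One concrete way to check that input data are in the expected structure and format, and to surface anomalies for investigation, is a validation step that runs before any analysis. This is a minimal sketch, assuming data arrive as CSV; the column names and rules are hypothetical, and in practice the checks would reflect what your data provider has agreed to supply:

```python
import csv
import io

# Illustrative validation of incoming data: confirm the expected columns
# are present, values are numeric, and nothing is implausibly negative.

EXPECTED_COLUMNS = ["period", "region", "value"]

def validate_input(csv_text: str) -> list:
    """Return a list of problems found in the input data (empty = passed)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        problems.append(f"unexpected columns: {reader.fieldnames}")
        return problems
    for i, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            value = float(row["value"])
        except ValueError:
            problems.append(f"row {i}: value '{row['value']}' is not numeric")
            continue
        if value < 0:
            problems.append(f"row {i}: negative value {value}")
    return problems

sample = "period,region,value\n2024Q1,North,10.5\n2024Q1,South,-3\n"
print(validate_input(sample))  # flags the negative value in row 3
```

A report of problems, rather than a silent pass/fail, gives the team something to investigate and to feed back to the data provider.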
Version control and documentation
- How do you ensure that analysis is auditable and can be inspected and understood by colleagues?
- Could you reproduce the analysis and output from a previous publication?
- If changes need to be made to any code or analysis, how are these documented? Are changes checked by another member of the team?
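Being able to reproduce a previous publication depends on knowing exactly which inputs, code and environment produced it. A minimal sketch, assuming outputs are written by a Python script, is to record provenance metadata alongside each release; the field names here are illustrative assumptions, not a prescribed standard:

```python
import hashlib
import json
import sys
from datetime import date

# Illustrative provenance record: enough metadata to tie an output back
# to the exact input data and code version that produced it.

def provenance_record(input_bytes: bytes, script_version: str) -> dict:
    """Build a metadata record linking an output to its inputs and code."""
    return {
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "script_version": script_version,  # e.g. a git tag or commit hash
        "python_version": sys.version.split()[0],
        "run_date": date.today().isoformat(),
    }

record = provenance_record(b"period,region,value\n", "v1.2.0")
print(json.dumps(record, indent=2))
```

Publishing or archiving a record like this with each output means a colleague can later fetch the same input file (verified by its hash) and the same code version, and re-run the analysis.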
Issues with the statistics
- What happens if you find a mistake in the data/your publication? How is it rectified? Is your approach consistent with your department’s statistical revisions policy?
- What steps would you take to minimise the chance of a similar error happening again?