Quality and statistics: an OSR perspective

The Office for Statistics Regulation (OSR) is the independent regulator of official statistics produced in the UK. Our vision is that statistics should serve the public good. Ensuring public confidence in the quality of statistics is an essential part of achieving this.

In this paper we set out how we in OSR think about quality, the challenges producers face when communicating quality, and the environment and behaviours within organisations that support quality. Finally, we outline how our thinking feeds into our current regulatory approach.

The context in which OSR considers Quality: TQV

Our understanding of quality is captured in the Code of Practice for Statistics, which sets the standards that producers of official statistics should commit to.

Quality relies on having data and methods that produce assured statistics. This means that statistics fit their intended uses, are based on appropriate data and methods, and are not materially misleading. It requires skilled professional judgement about collecting, preparing, analysing and publishing statistics and data in ways that meet the needs of those who want to use the statistics.

Quality is one of three pillars in the Code: it sits between Trustworthiness, which represents the confidence users can have in the people and organisations that produce data and statistics, and Value, which ensures that statistics support society’s needs for information. All three pillars are essential for achieving statistics that serve the public good, and each provides a particular lens on key areas of statistical practice that complements the others.

Quality is not independent of Trustworthiness and Value. A producer cannot deliver high quality statistics without well-built and functioning systems and skilled staff. It cannot produce statistics that are fit for their intended uses without first understanding those uses and the needs of users. This interface between quality, its institutional context and statistical purpose is also reflected in quality assurance frameworks, including the European Statistical System’s QAF and the International Monetary Fund’s DQAF. The Code of Practice is consistent with these frameworks and with the UN Fundamental Principles of Official Statistics.

The Code sets out the standards that producers should follow to ensure that the public can have confidence in their statistics. It is not a rule book but a guide; considering any situation through the prism of the three pillars provides the basis for answering the challenges that producers and analysts face. It relies on having a mindset open to how the statistics can be wrong and a culture that prioritises quality management.

Quality is not a static characteristic but a dynamic one: producers need to remain active in their monitoring of sources and quality indicators, consciously looking for changes to systems, methods, policies, legislation and other factors that can change over time and affect the nature of the data and statistics.

Breaking down the concept of ‘Quality’

The three principles within the Quality pillar are:

  • Suitable data sources
  • Sound methods
  • Assured quality

The following explanations explore what we look for when we regulate statistics, to ensure both quality and public confidence in quality.

Suitable data sources

Suitable data sources means that statistics should be based on the most appropriate data to meet intended uses. The impact of any data limitations for use should be assessed, minimised and explained.

It can be easy when producing statistics to trust what you are given; but rather than unquestioningly accepting data, selecting and using a data source should be an active decision. The principle of Suitable data sources reminds producers to gain a good understanding of the nature of data they are using (or planning to use), and to establish and maintain good relationships with supply partners where possible. This is a particular challenge for large producer organisations, requiring good metadata systems, collaboration, and cross-team ways of working that allow mutual sharing of insight about data.

Producers should also ensure that the data are an acceptable match for what is required. They need to be clear-sighted and open about any limitations, and work to minimise the impact of these limitations. Suitable data sources emphasises the ongoing need to understand the suitability of the data, rather than viewing the selection as fixed.

Sound methods

Sound methods calls on producers to use the best available methods and recognised standards, and to be open about the reasons for their decisions.

The method can reflect the most advanced local practices or established international agreement. In the absence of these, it calls for methods to have a scientific foundation or, at the very least, established professional consensus. ‘Best’ doesn’t mean ‘perfect’, but it does have to have a sound basis.

The principle emphasises the importance of statistical harmonisation and transparency about methods. Producers should be open on what they are providing and why, being clear on whether or not they are coherent with related statistics and classifications. They should give a steer on how the statistics can be used and, if necessary, how they cannot. They should provide a proportionate explanation of the limitations and uncertainty in the statistics, helping users understand the nature and implications of potential sources of bias. More complex situations and methods will require more explanation than straightforward approaches: there is no one-size-fits-all approach; explanation should be tailored to need and to the nature of the audiences.

Statistical methods should also remain a live choice, and not be seen as immutable. Producers should be actively reviewing method choices and alert to emerging alternatives and possibilities. Working closely with external experts and other producers provides rich opportunities for learning new approaches or identifying potential issues.

Assured quality

Assured quality requires producers to explain clearly how they assure themselves that statistics and data are accurate, reliable, coherent and timely. The emphasis is on ensuring producers’ and users’ confidence in the quality of the statistics, that the statistics are fit for their intended uses.

Quality assurance is not an add-on or a final tick-box check before publication. Instead, it should be seen as an ongoing process throughout the development and production of statistics, building producers’ own understanding and confidence, which can in turn reassure users. Organisations that effectively manage the quality of the data they produce and use have well-designed and managed systems and processes, well-established relationships between partners, and actively promote consistent quality standards and values. The operation and credibility of any statistical organisation is risked when quality management of data is not prioritised.

Assured quality is about identifying, anticipating and avoiding, in an effective and efficient manner, the problems that can arise from data inputs or the processes used to calculate statistics. It should be proportionate to the nature of the quality issues and the importance of the statistics in serving the public good, but all statistics producers need to be curious, and not take data at face value.

It is helpful to consider the relevant DAMA quality dimensions when testing the input data, such as completeness, uniqueness, consistency, timeliness, validity, accuracy, and user needs and trade-offs. It’s also helpful to assure the statistical output against the ESS quality dimensions: relevance, accuracy and reliability, timeliness and punctuality, coherence and comparability, and accessibility and clarity. (The Code of Practice places relevance and accessibility and clarity within its Value pillar as we see these as key to considering user needs and ensuring the statistics are useful and usable.)
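As an illustration only, a few of these input-data checks can be sketched in code. This is a minimal sketch, not OSR guidance: the records, field names and validity range below are all hypothetical examples.

```python
# Minimal sketch of input-data checks against some of the DAMA quality
# dimensions (completeness, uniqueness, validity). All data are invented.

def check_completeness(records, fields):
    """Completeness: every record has a non-missing value for each field."""
    return all(r.get(f) is not None for r in records for f in fields)

def check_uniqueness(records, key):
    """Uniqueness: no duplicate values of the identifying key."""
    keys = [r[key] for r in records]
    return len(keys) == len(set(keys))

def check_validity(records, field, valid_range):
    """Validity: non-missing values fall within an agreed range."""
    lo, hi = valid_range
    return all(lo <= r[field] <= hi for r in records if r[field] is not None)

records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 58},
    {"id": 3, "age": None},  # missing value -> completeness failure
]

print(check_completeness(records, ["id", "age"]))  # False: one age is missing
print(check_uniqueness(records, "id"))             # True: ids are distinct
print(check_validity(records, "age", (0, 120)))    # True for non-missing ages
```

In practice such checks would be proportionate to the statistics, automated within the production pipeline, and accompanied by judgement about the other dimensions (consistency, timeliness, accuracy) that are harder to test mechanically.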

Communicating Quality

Transparency and effective communication are common threads running through the three principles outlined above.

One of the key challenges producers face is how to effectively communicate about the quality of their statistics. Often, producers provide extensive information about methods and sources, but then do not say how good (or not) the statistics are in relation to their expected uses. Users can be left unclear on what the information provided means for using the statistics. Some users have told us during our designation review that they would like prominent, succinct quality information, which helps them decide whether the statistics are suitable for their own use.

There can be a wariness among analysts about being open about quality issues, a belief that this will undermine users’ confidence in the statistics and the producers. Low quality does not equate to poor performance (although that can occur). We would like to see producers being more confident in showing the great work they have done to produce useful statistics in areas that would otherwise miss out.

As highlighted in our recent review of approaches to presenting uncertainty in the statistics system, giving straightforward information about uncertainty, and being open when things go wrong, builds confidence in you as a trustworthy organisation. While there are examples of statistical producers communicating uncertainty in a way that is clear and understandable to non-expert users, our 2021/22 State of the Statistical System report, which looked across findings from all our regulatory work over this period, emphasised that there is more that could be done by many producers to communicate uncertainty around estimates in a way that brings effective insight.

Users often say any data is better than no data. We see value in having more timely data that may be of lower accuracy when there is a clear public interest that can be met. This can be a source of anxiety for analysts who feel the compromise undermines the integrity of the statistics. It is a judgement call for producers to make on how to ensure that quality is sufficient for the appropriate use of the statistics. Balancing timeliness and accuracy relies on good engagement with users. Being clear about your confidence in the quality and value of the statistics and how you have come to your choice can provide reassurance to those with concerns and an opportunity for healthy challenge.

Publishing data and statistics that materially mislead is unacceptable. If the statistics do not bear the weight of the decisions made using them, producers should immediately review whether they should continue to publish them or give clearer guidance to protect against inappropriate use. Statistics can become materially misleading estimates of what they aim to measure when they are based on unsound sources, use inappropriate methods, or where producers don’t have appropriate quality checks.

A culture that supports Quality

The wider organisational context, or ‘quality culture’, in which statistical producers work will always impact their ability to produce statistics that are of appropriate quality.

The Code encourages organisations producing official statistics to be open about their commitment to quality. This means actively promoting appropriate quality standards and values, reflecting their approach to quality management. They should use well-designed and managed systems and processes, appropriate tools, and have well-established relationships between partners.

We recommend producers encourage a mindset that emphasises a focus on quality and is open, without blame, to seeing how their statistics could be wrong. A culture of quality encourages honesty and openness to learn from errors and near misses to strengthen producers’ systems. It supports innovation and creativity in finding new sources and solutions, to produce statistics that are relevant and useful. It builds professional judgement and confidence, to provide the clear quality statements that users need. It focuses on delivering statistics in which users can have confidence.

To meet this principle, and thereby to enable effective management of risks to quality, the roles and responsibilities of all those involved in the production of statistics should be clearly defined. Managers who provide clear expectations with respect to quality, as well as guidance and support, will better enable junior producers to understand and carry out their roles, minimising the risk of error. Senior leaders should prioritise quality management, investing the necessary resources to promote and support developments and innovations that can enhance the quality of official statistics. They also need to act as advocates for official statistics to ensure that they are valued across their organisation.

How does our perspective influence OSR’s regulatory approach?

Our regulatory approach

We use assessments and compliance checks to judge compliance with the Code of Practice for Statistics for individual sets of statistics or small groups of related statistics and data (for example, covering the same topics across the UK). Whether we use an assessment or compliance check will often be determined by balancing the value of investigating a specific issue (through a compliance check) versus the need to cover the full scope of the Code of Practice (through an assessment).

There is no ‘typical’ assessment or compliance check – each project is scoped and designed to reflect its needs. An assessment will always be used for a new National Statistics designation, and will also be used to undertake in-depth reviews of the highest profile, highest value statistics, especially where potentially critical issues have been identified.

There is no absolute level of quality, especially accuracy, that distinguishes National Statistics from other statistics. This depends heavily on the use made of the statistics, the availability of other similar data on the same topic that can be used alongside the official statistics, and possibly other factors. It is therefore not for OSR to judge a level of accuracy that is ‘acceptable’. Instead, our judgements are based on hearing directly from producer teams on how they produce the statistics and their views about the challenges and opportunities they face. We consider evidence from users and other stakeholders, particularly in relation to any views about the suitability of the methods and presentation and how well information needs are met. We are also alert to the primary reasons for the collection and release of the statistics, as well as their wider public benefits. We are currently evolving our regulatory approach with a ‘Quality-heavy’ assessment, which provides an in-depth review against the Quality pillar.

OSR guidance

We have some useful guidance that can assist producers in their quality management. We published a guide, Thinking about quality when producing statistics, following our in-depth review of quality management in HMRC. We released a blog to accompany our uncertainty report that provides a useful collection of answers to questions posed by analysts working in government at our recent webinar on uncertainty. It highlights some important resources, top among them the Data Quality Hub guidance on presenting uncertainty. Our Quality assurance of administrative data (QAAD) framework is a useful tool to reassure users about the quality of the data sources.

New developments

To support statistics leaders in developing a strategic approach to applying the Code pillars and a quality culture, we have developed a maturity model, ‘Improving Practice’. It provides a business tool to evaluate the statistical organisation against the three Code pillars and helps producers identify the current level of practice achievement and their desired level, and to formulate an action plan to address the priority areas for improvement for the year ahead. This tool is currently being piloted by five producer bodies.

We are also currently developing a quality grade tool to support producers in rating the quality of their statistics and in providing a succinct statement for users.

And we are continuing to promote a Code culture that supports producers opening themselves to check and challenge as they embed Trustworthiness, Quality and Value, because in combination, the three pillars provide the most effective means to deliver relevant and robust statistics that the public can use with confidence.

Our understanding about quality continues to evolve as we carry out our regulatory work and we would welcome hearing your thoughts – please share any reflections by emailing regulation@statistics.gov.uk.

Statistical Literacy – it’s all in the communication

In our regulatory work, when people talk to us about statistical literacy it is often in the context of it being something in which the public has a deficit. For example, ‘statistical literacy’ may be cited to us as a factor in a general discussion on why the public has a poor understanding of economic statistics.

But is it a deficit that can or needs to be addressed and – more importantly – what actually is statistical literacy?

To help answer these questions, as well as looking across our regulatory work and talking to other researchers, we commissioned a review of published research on this topic area. The review was carried out by Jessica McMaster, via a Cambridge Grand Challenges Internship, in the period March to September 2022.

Read the full research review

This think-piece and the accompanying research review are our first publications specifically on this topic area. We hope that this evidence base will provide a useful resource for others who are working in this area.

What is statistical literacy?

Though the concept of statistical literacy has been discussed and researched for many decades, we found no consensus on what it means. Instead the term is being applied in different, often unrelated, contexts. The two most common contexts are education (for example, in the context of a school or university setting), and adults as data consumers (for example, in the context of explaining the ability of an adult to form a judgement when presented with a statistic).

Though context is one factor in the different definitions of statistical literacy, we found that there are other common components. These include foundational abilities (such as numeracy and overall general literacy), knowledge of statistical concepts, and an ability to critically evaluate statistical information.

As there is no common understanding of statistical literacy, we considered whether the use of the term is helpful, or whether instead it may be hindering action to address specific issues affecting the public’s ability to understand statistics.

We concluded that statistical literacy can be a useful term, but that it should be used consciously – anyone using the term should define what they mean by statistical literacy, considering the context in which the term is being applied and the factors that they consider important in that context.

What is the general public’s level of statistical literacy?

Given that the concept of statistical literacy varies, it should come as no surprise that when carrying out this review we did not find a definitive measure of the general public’s statistical literacy.

What we did find was wide variability across the general public in the skills and abilities that are linked to statistical literacy. Our review highlights that a substantial proportion of the population display basic levels of foundational skills and statistical knowledge, and that skill level is influenced by demographic factors such as age, gender, education and socioeconomic status.

Given this, we think that it is important that statistical literacy is not viewed as a deficit that needs to be fixed, but instead as something that is varied and dependent on the context of the statistics and the factors that are important in that context.

Therefore, rather than address deficits in skills or abilities, we recommend that producers of statistics focus on how best to publish and communicate statistics that can be understood by audiences with varying skill levels and abilities.

How should statistics be communicated?

Our review identified a number of areas where there is good evidence on how best to communicate statistics to non-specialist audiences. This evidence aligns not only with the principles of the Code of Practice for Statistics, but also with what we understand others have found. We hope the evidence in our review will help to reinforce and support current and future wider work on this topic.

Our review found good evidence to endorse existing features of best practice in communicating statistics, in the following areas:

Target audience: Our evidence endorses the widely recognised importance of understanding audiences. The evidence highlights that the best approach to communicating information (including data visualisations) can vary substantially depending on the characteristics of the audience for the statistics. Considering the target audience’s characteristics is, therefore, an important factor when designing communication materials.

Contextual information: Contextual information helps audiences to understand the significance of the statistics. Our evidence highlights the importance of providing narrative aids, and also that providing statistical context can help to establish trust in the statistics. Again, this supports and reflects existing notions of best practice.

Establishing trust: As well as providing context, we found evidence that highlighting the independent nature of the statistical body and, when needed, providing sufficient information so that the reasons for unexpected results are understood, can increase trust in the statistics. This finding aligns with the Code of Practice for Statistics, which includes Trustworthiness as one of its three pillars.

Language: In the statistical system, statistics producers recognise that they should aim for simple, easy-to-understand language. We found evidence to endorse this recognition – in particular, that, when used, the level of technical language should be dictated by the intended target audience.

Format and framing of statistical information: We found evidence that different formats (e.g. probability, percentage or natural frequency) and/or framing (e.g. positive or negative) in wording can lead to unintended bias or affect perceptions of the statistics and both need to be considered. This finding is probably the one which is least widely recognised in current best practice in official statistics, and we consider it is an area that would benefit from further thinking.
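The effect of format can be made concrete by expressing the same quantity in the alternative formats the evidence mentions. The sketch below is purely illustrative (the probability value and function name are invented, not drawn from the review):

```python
# Illustrative conversion of one probability into the alternative formats
# (percentage and natural frequency) discussed in communication research.

def as_formats(probability, reference=100):
    """Express the same probability as a percentage and a natural frequency."""
    percentage = probability * 100
    natural = f"{round(probability * reference)} in {reference}"
    return percentage, natural

pct, freq = as_formats(0.25)
print(f"probability 0.25 = {pct}% = {freq}")   # probability 0.25 = 25.0% = 25 in 100

# The same risk with a smaller reference class reads differently again:
_, freq4 = as_formats(0.25, reference=4)
print(freq4)                                   # 1 in 4
```

Research on risk communication suggests that although these statements are numerically equivalent, audiences can perceive them quite differently, which is why format deserves deliberate consideration alongside framing.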

Communicating uncertainty: Communicating uncertainty is important and may need to be tailored dependent on the information needs and interest levels of the audience. This topic is a particular focus area for OSR. In December 2022 we published a report that looks at how uncertainty is currently being communicated in official statistics, the current guidance that is available and our views on how communication of it can be improved.

Next steps

We will be continuing our work in 2023/24 under our business priority ‘champion the effective communication of statistics to support society’s key information needs’. Included in this is our continued research focused on understanding what statistics serving the public good means in practice.

We have existing links with a number of researchers and organisations working on how to communicate statistics and are very open to working with others who have an interest in this topic.

Please get in touch with us at regulation@statistics.gov.uk if you would like to discuss any aspect of this think piece or our review with us.

The Public Good of Statistics: What we know so far

To succeed in our aim to develop a better understanding of statistics serving the public good, it is critical to understand what is already known, and what is not known, about this subject.

This review contributes towards that aim. We begin by considering how the public good is defined. This phrase is sometimes used interchangeably with other similar phrases (e.g. public interest), but it is not well understood whether this is appropriate (whether the other phrases really do mean the same thing) or what the public good really means to the public themselves. We then outline four approaches to measuring and understanding the public good of statistics, which are discussed below.

Legislative Approach

The legislative approach provides an overview of two key pieces of legislation which are relevant to statistics serving the public good. The Statistics and Registration Service Act (2007) led to the creation of the UK Statistics Authority and it also created a definition of the public good. The Digital Economy Act (2017) then created mechanisms to promote data sharing and linking which further contributes towards statistics being able to serve the public good.

Empirical Research

Empirical research is relevant to the question of whether statistics are currently serving the public good. Two important themes are highlighted: trust in statistics and statistics producers, and the communication of statistics. Evidence suggests that these two issues may be instrumental in ensuring that statistics can serve the widest possible range of users; further research is therefore needed to better understand these two factors.

Economic Value

The review also considers how the economic value of statistics can serve the public good. This highlighted the need for measurements which can quantify the value of statistics. Being able to quantify the value of statistics would help to demonstrate the need for national statistical offices and may provide further support for the development of high-quality statistics. This section also discusses the need for more timely statistics on economic measures.

Social Value

The review considers the social value of the public good of statistics by discussing the impact of data gaps on statistics serving the public good. Further to this, we consider the difficulties associated with ensuring that there are no gaps in data. We also evaluate whether the approach taken by the BBC to provide a valuable service to the public can offer insights and possible comparisons to OSR’s approach to the public good.

In conclusion, our review highlights several points where further research is needed to shed light on the important issue of statistics serving the public good.

Through looking at this issue across four different approaches, we can build a picture of how this concept operates in various methodologies, disciplines, and organisations. But this is just a starting point for the research programme. We hope to use these insights to guide our future work so we can continue to develop our understanding of what it means for statistics to serve the public good.