The Public Good of Statistics: What we know so far

To succeed in our aim to develop a better understanding of statistics serving the public good, it is critical to understand what is already known, and what is not known, about this subject.

This review contributes towards that aim. We begin by considering how the public good is defined. This phrase is sometimes used interchangeably with other similar phrases (e.g. public interest) but it is not well understood how appropriate this is (if other phrases do mean the same thing) or what the public good really means for the public themselves. We then outline four approaches to measuring and understanding the public good of statistics, which are discussed below.

Legislative Approach

The legislative approach provides an overview of two key pieces of legislation which are relevant to statistics serving the public good. The Statistics and Registration Service Act (2007) led to the creation of the UK Statistics Authority and also created a definition of the public good. The Digital Economy Act (2017) then created mechanisms to promote data sharing and linking, which further contribute towards statistics being able to serve the public good.

Empirical Research

Empirical research is relevant to the question of whether statistics are currently serving the public good. Two important themes are highlighted: trust in statistics and statistics producers, and the communication of statistics. Evidence suggests that these two issues may be instrumental in ensuring that statistics can serve the widest range of users possible; further research is therefore needed to better understand these two factors.

Economic Value

The review also considers how the economic value of statistics can serve the public good. This highlights the need for measurements which can quantify the value of statistics. Being able to quantify the value of statistics would help to demonstrate the need for national statistical offices and may provide further support for the development of high-quality statistics. This section also discusses the need for more timely statistics on economic measures.

Social Value

The review considers the social value of the public good of statistics by discussing the impact of data gaps on statistics serving the public good. Further to this, we consider the difficulties associated with ensuring that there are no gaps in data. We also evaluate whether the approach taken by the BBC to provide a valuable service to the public can offer insights and possible comparisons to OSR’s approach to the public good.

In conclusion, our review highlights several points where further research is needed to shed light on the important issue of statistics serving the public good.

Through looking at this issue across four different approaches, we can build a picture of how this concept operates in various methodologies, disciplines, and organisations. But this is just a starting point for the research programme. We hope to use these insights to guide our future work so we can continue to develop our understanding of what it means for statistics to serve the public good.


Unlocking the value of data through onward sharing

About this guide

We have written this guidance to increase awareness among statistics producers and users that the principles of the Code of Practice extend beyond statistics production to data sharing and access. We outline practices and processes that uphold these principles. Specific guidance about how to meet these expectations is signposted where available.

The central purpose for all official statistics producers is serving the public good through the provision of data and statistics. This obligation is reflected in the principles of the Code of Practice for Statistics which requires statistics producers to commit to, and to promote, the safe onward access to the data used as the basis for producing official statistics. These may include, for example, data from the census, population and business surveys, as well as administrative records.

This guidance is a companion to our guidance on data governance: building confidence in the handling and use of data, which supports data sharing for the public good. It is aimed at Heads of Profession for Statistics and analysts working in producer bodies with an interest in data linkage and sharing.

“Data is more useful when more people can access and use it. It is most useful when it can be joined together. Data that is inaccessible – or where access takes so long it is rendered irrelevant – is of limited utility.” Jeni Tennison, CEO of the Open Data Institute[1]


Guidance for producers when making changes to statistical methods

About this guide

This guide sets out examples of the principles in the Code of Practice that producers need to adhere to in order to remain Code compliant when making changes to statistical methods. It also includes examples of the kinds of materials that can be used to help document this adherence. An important first decision concerns the scale and nature of the changes being considered: you need to determine the materiality of the change so that you can apply these principles proportionately. Identify whether there are fundamental changes to the methods that could affect the statistics; these will need to be understood and explained to users.


Mental Health Statistics in England

Attitudes towards mental health have changed in recent years. Mental health, which was often stigmatised and not discussed openly, is receiving increasing public, media and government attention as an important public health issue. There is a greater awareness that mental health is something we all have and, just like physical health, it can sometimes be good and sometimes be poor.

Our review of mental health statistics in England, carried out before the COVID-19 pandemic, explores why good statistics in this area are important, but is not intended to provide specific guidance on statistics directly related to the effects of the pandemic. We hope, however, that sharing our findings on the strengths and weaknesses of mental health statistics, along with highlighting specific recommendations for improvements, will help inform decisions in the statistical sector both in the immediate term and going forward.

Our research for this review focused on answering the following two questions:

  • is the mental health statistical system publishing the information required to provide individuals, service providers and policy makers with a comprehensive picture on mental health?
  • do the existing statistics help answer the key questions about mental health in society today?

We spoke to a wide range of statistics users across different areas of society. They told us of their need for high quality statistics which are able to answer a broad range of questions. Users told us that the existing statistics did not paint a full enough picture of individuals and their conditions, and that producers should be taking greater steps to maximise the insight from existing statistics. In some areas they wanted to know more than the current statistics were able to tell them.

We heard that there is a need for improved quality across the datasets underlying many mental health statistics. Users told us that mental health statistics should be more accessible, both in terms of finding relevant publications and in relation to producers making publications easy to read and explaining clearly the limitations of the statistics. In addition to this, they spoke of their frustrations that some surveys were not carried out as often as they would like, as well as challenges around obtaining data for secondary analysis purposes.

Our research identified that, although the existing mental health statistics go some way to meeting users’ needs, there is much more that can be done.

Our recommendations:

  1. Statistics producers and organisations should exploit the value of the statistics through better data, greater analysis and linking data.
  2. We want to see continued activity to improve the quality of underlying statistics datasets, as well as clear communication with users about quality issues.
  3. We want to see clearer leadership and greater collaboration across producers of mental health statistics.
  4. Access to NHS Digital data needs to improve.

We understand that addressing these issues may not currently be a priority for statistics producers due to the COVID-19 situation; however, we expect statistics producers to work collaboratively towards delivering these recommendations when they are able to do so.

Office for Statistics Regulation 5-year Strategic Business Plan

Statistics for the public good

This strategic plan was developed during the coronavirus pandemic. The experience of the pandemic has influenced our business plan in two ways.

First, it has shown more clearly than ever the importance to the public of trustworthy, high quality, high value information. It’s not enough for good information to be available to decision-makers: for it to serve the public good, it must also be accessible, clearly explained and fairly presented to the public. Our work in the pandemic has involved stepping in to uphold these principles, and we have evolved our approach to support a fast-changing statistical environment.

Second, the pandemic has shown us what the UK’s statistical system at its best can do: produce new statistics at great speed, using new and existing methods and data sources. We have highlighted the positive way the statistical system has stepped up, including in our July 2020 report on the state of the statistical system.

Embedding these developments, so that they become the norm, is the core ambition of this strategic plan.

We do not know all of the issues that are going to be at the centre of public debate over the next five years. But we do know that we will stand up for the public’s right to access statistics and data that exhibit trustworthiness, quality and value.

This plan accordingly does not map out a detailed set of deliverables for each of the next five years. Instead, it takes the four areas in the UK Statistics Authority’s strategy, and sets out what we are trying to achieve, including near term commitments, and medium term aspirations. And it sets out too what kind of regulator we want to be, using the maturity model set out on pages 12 to 27.

In summary, OSR needs to be agile, to focus on the interests of the public as users of statistics, and to continually develop our role as an independent regulator. This plan sets out how we aim to achieve these ambitions.

Ed Humpherson

Director General for Regulation

Office for Statistics Regulation Annual Report 2019-2020

Director General for Regulation Ed Humpherson’s Report

I look back at 2019/20 with a mixture of pride and an unfulfilled ambition to do more. The pride comes from the achievements of the team at Office for Statistics Regulation (OSR).

This Annual Report outlines delivery of a huge range of activities – the highlights summary on page 6 conveys the range of outputs that the team has delivered: assessment of statistics that inform fundamental public debates like migration; high profile comments on the use of statistics by politicians on health, education, crime and the economy, including during a General Election campaign; and the voluntary adoption of the Code of Practice by a range of organisations.

We’ve not just delivered assessments of individual statistics. We’ve looked systemically too – at whole areas of policy like social care, and at underpinning concepts like the National Statistics designation.

And these activities have impact: throughout this report, you will read about OSR driving improvements in the coherence of statistics; in their quality; and in the publication of new statistics and data to inform public debate (for example, health funding, education funding, police numbers). This drive to ensure the public has the fullest picture of what’s going on has also been at the heart of our work during the COVID-19 pandemic.

And to understand the real source of my pride, as you read this report keep the following figure in mind: all this work is done by a team that numbers no more than 40 people. It’s an extraordinary achievement.

There are of course areas for improvement. The report by the Public Administration and Constitutional Affairs Committee highlighted the need to enhance our visibility and separation. This report outlines how we have addressed the Committee’s recommendations through a clearer public voice, better engagement with Parliament and a clearer relationship with the rest of the UK Statistics Authority.

Beyond these governance changes, we know that there is always more to do to ensure the public have access to the best possible data and statistics. Moving forwards, we will seek to pick up momentum in those areas where we did not fully deliver our plans in 2019/20, in particular, progressing our research programme to understand the public good of statistics and whether statistics reflect people’s lived experience.

Standing up for the public’s right to good statistics and data has of course meant we have been incredibly busy during the COVID-19 pandemic. Public access to trustworthy data has been one of the stories of the pandemic. My team has adapted brilliantly to this challenge. They have continued to deliver regulation while working from home. Their work has secured both improvements in the way data are explained and used, and the publication by Government of new datasets – and demonstrates an independent, dynamic regulator in action.

I hope that as you read this report, you can see that our work really matters. I hope you will see why I’m proud of the team’s achievements and our growing confidence. And I hope you will sense our continued, unrequited appetite to support the best possible statistics that serve the public good.

Ed Humpherson
Director General for Regulation
June 2020

Exploring the public value of statistics about post-16 education and skills – UK report

We have been looking in detail at the value of the current data and statistics on post-16 education and skills. As an independent UK wide regulator, we are in a unique position to take a broader look at issues of importance to society and to make the case for improved statistics, across organisational and Government boundaries. 

This report, our second report in this topic area, explores the public value of post-16 education and skills statistics across the UK with a focus on Scotland, Wales and Northern Ireland and updates on changes since the publication of our first, England only, report in 2019. 

Four key sectors comprise the majority of the post-16 education and skills statistics in the UK: workforce skills, universities and higher education, colleges and further education, and apprenticeships. Each is covered in detail in our report. To our knowledge, this is the first time that the statistics that inform these sectors have been extensively researched at a UK wide level.

Exploring the statistical landscape in this multi sector, multi country way has allowed us not only to identify the current challenges, information gaps and improvements to statistics in each sector, but also to highlight areas of good practice and shared learning opportunities. We have looked in detail at how the current statistics are meeting the needs of users, focusing on the public value that the statistics give. In doing this we have also been able to explore in detail how accessible the current statistics are and whether they are helping to inform a bigger, sector wide, picture.

Post-16 education and skills affect the lives of millions of individuals in the UK. Good quality and accessible statistics are important to support the fair, efficient and effective provision of education and training. Alongside this report we will continue to engage with statistics producers to make the case for improved data and statistics in these sectors.

The state of the UK’s statistical system

This review sets out our view on the current state of government statistics. At their best, statistics and data produced by government are insightful, coherent, and timely. They are of high policy-relevance and public interest. There are good examples of statistics that effectively support decision-making in many areas of everyday life: this has been especially true during the COVID-19 pandemic, when we’re seeing the kind of statistical system that we’ve always wanted to encourage – responsive, agile and focusing on users. However, the statistical system does not consistently perform at this level across all its work.

In this report we address eight key areas where improvements could be made across the system.

  1. Statistical leadership
  2. Voluntary application of the Code, beyond official statistics
  3. Quality assurance of administrative data
  4. Communicating uncertainty
  5. Adopting new tools, methods and data sources
  6. Telling fuller stories with data
  7. Providing authoritative insight
  8. User engagement

In each area, we highlight examples of statistical producers doing things well. These examples illustrate the good work already happening which others can learn from and build on. We have organised our reflections under the three headings of Trustworthiness, Quality and Value, the three essential pillars that provide the framework for the Code of Practice for Statistics.

User engagement in the Defra Group

Why we did this review

Understanding how statistics are used and what users and other stakeholders need is critical to ensuring that statistics remain relevant and provide insight. To achieve this, statistics producers must engage with users.

To explore this aspect of statistics production, we carried out a review of user engagement in the Defra Group. By the Defra Group we mean the Core Department and Executive Agencies, Forestry Commission and those Defra Arm’s Length Bodies that are designated as producers of official statistics: Environment Agency, Joint Nature Conservation Committee, Marine Management Organisation and Natural England.

This is our first departmental review of user engagement and the Defra Group made an ideal candidate for such a review. It has a large and broad portfolio of official statistics and National Statistics, with a varied public profile, public interest and impact and is therefore likely to require different approaches to engaging with users.

We focused our review on a set of 10 National Statistics and official statistics which reflect the diversity of the Defra Group statistics portfolio (see report Annex B). They cover a range of topics, users and uses, and represent the Defra core department as well as Arm’s Length Bodies.

What we hope to achieve

Through this review we aim to develop a better understanding of the range of approaches to user engagement currently adopted within the Defra Group, and to identify the key features of effective and impactful user engagement. We hope this will support the Defra Group in enhancing its user engagement and provide broader learning for other statistics producers.

Related links:

Correspondence: Ed Humpherson to Ken Roy: User engagement in the Defra Group

Blog: What we have learned from the Defra Group about user engagement

Presenting estimates of R by government and allied bodies across the United Kingdom

During the coronavirus (COVID-19) pandemic there has been increasing focus on, and interest in, the reproduction number – R. R is the average number of secondary infections produced by one infected person.
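As a rough illustration of what this definition implies, the sketch below shows how the expected number of new infections changes from one generation of infection to the next for different values of R. It is a deliberately simplified calculation, not one of the models used to estimate R: it assumes R stays constant, ignores immunity and interventions, and the function name and starting figure of 100 cases are hypothetical.

```python
def expected_new_infections(r: float, generations: int, initial_cases: int = 1) -> float:
    """Expected new infections after a number of generations.

    Simplifying assumption (ours, for illustration only): every infected
    person passes the infection to `r` others on average, and `r` does not
    change over time.
    """
    return initial_cases * r ** generations


# Compare a shrinking (R < 1), stable (R = 1) and growing (R > 1) epidemic,
# starting from a hypothetical 100 cases.
for r in (0.8, 1.0, 1.2):
    trajectory = [round(expected_new_infections(r, g, initial_cases=100), 1)
                  for g in range(5)]
    print(f"R = {r}: {trajectory}")
```

Under these assumptions, new infections fall with each generation when R is below 1, stay level when R equals 1, and rise when R is above 1, which is why the value of R relative to 1 attracts so much attention.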

OSR’s observation of recent presentations of R is that generally a good job is being made of explaining both the number itself and its implications for the UK and each of the devolved nations. However, there is room for estimates of R to be presented more clearly and explained more meaningfully. Lessons can be learnt from the approach to publication of R by different nations of the UK.

Decision-makers across the UK have made it clear that decisions about how we come out of lockdown and whether or not any restrictions need to be re-introduced in future are informed by the value of R.

The latest estimates of R have become widely quoted by scientists, government officials and the media. R for the UK is estimated by a range of independent modelling groups based in universities and Public Health England (PHE). Scientific advisers and academic modellers compare different estimates of R from the models and collectively agree a range which R is very likely to be within.
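To make the idea of a range drawn from several models more concrete, the sketch below takes a set of intervals, one per model, and reports the envelope that contains all of them, rounded outwards to one decimal place. This is an assumption made purely for illustration: the agreed range is reached through expert discussion, not through this calculation, and the function name and input figures are hypothetical rather than published estimates.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR


def envelope_range(model_intervals):
    """Return the (lower, upper) envelope of several model intervals.

    Illustrative assumption only: the envelope is the smallest lower bound
    and the largest upper bound, rounded outwards to one decimal place so
    the reported range never appears narrower than the inputs.
    """
    tenth = Decimal("0.1")
    lowest = min(low for low, _ in model_intervals)
    highest = max(high for _, high in model_intervals)
    # Decimal(str(x)) avoids binary floating-point surprises when rounding.
    lower = Decimal(str(lowest)).quantize(tenth, rounding=ROUND_FLOOR)
    upper = Decimal(str(highest)).quantize(tenth, rounding=ROUND_CEILING)
    return float(lower), float(upper)


# Hypothetical intervals from three models (not real estimates).
hypothetical_models = [(0.72, 0.95), (0.68, 0.88), (0.75, 1.02)]
low, high = envelope_range(hypothetical_models)
print(f"R is estimated to be between {low} and {high}")
```

Reporting the result as a range to one decimal place, rather than as a single precise figure, keeps the uncertainty visible to the reader.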

Devolved nations tend to use either those same independent models or one preferred model and apply data about the pandemic in their own countries to arrive at their consensus estimates of R. All devolved nations are publishing or intend to publish estimates for the range of R in their different countries on a regular (mostly weekly) basis. We commend the cooperation taking place between the four nations to bring about a consistent approach to R and where it should be published.

We’ve been impressed that explanations have succeeded in conveying the importance of the R-number and the role the estimates play in advice to ministers. We particularly commend:

The accessibility of the statistics

  • Estimates of R sit within a crowded, and sometimes confusing, landscape of other data and we found that broadly the needs of different types of users and potential users have been taken into account in the presentation and release of the statistics and data.

The presentation of uncertainty

  • For example, presenting R as being within a range clearly demonstrates the uncertainty in the estimate. We particularly liked the presentation of uncertainty in the Welsh Government’s Technical Advisory Cell Monitoring document which uses a fan chart to show the uncertainty. The use of estimates to one decimal place is also commended as it conveys the uncertainty of the estimates.

The narratives about the estimates of R

  • These are particularly helpful when they are simply worded, adopt visually engaging summaries with charts and infographics about the R-number, and are presented alongside data. An example of helpful referencing to source data is the Scottish Government’s presentation Coronavirus: Modelling the epidemic in Scotland: Issue 2.
  • We see the value of these narratives as helping to make sense of the decisions about school closures, social distancing and other measures aimed at reducing the spread of the virus.

We expect that as more data becomes available and more knowledge is gained about the pandemic, there will naturally be improvements in the presentation of R. Our observations suggest that producers can improve the value and quality of their statistics about R by:

Adopting even clearer language and terminology to describe estimates of R

  • For example, describing the estimates of R as ‘a consensus value’ alongside a range is confusing without explaining what is meant by a ‘consensus value’. Producers need to be clear about whether potentially small changes in the range for R are statistically different from the previous week’s consensus.

Linking to clear and easily accessible supporting materials

  • Cited research should demonstrably support the evidence and ideas being put forward.

Clearly explaining the sensitivity of the models to key assumptions

  • Users of these statistics who are more analytical or who want more information about the data before they are confident in the analysis, may wish to understand the sensitivity of the estimates of R to key assumptions in the models.

We advise that people, when speaking publicly or writing about R, adopt due accuracy and provide sufficient context to avoid misleading others. Key learning from the presentation of R for the UK and for devolved nations to date has been:

  • Be careful to help people see R in the context of other data, for example alongside data on the number of people infected, and other relevant factors such as declining or increasing infection rates.
  • Clearly communicate the extent and nature of any uncertainty in the estimates. For example, clearly state the uncertain nature of the estimates and avoid talking about estimates as if they are fact. Also, there is a need for even greater caution when infection rates become very low.
  • Be clear that estimates of R come from modelled assumptions, which is why different models can yield different estimates. Good practice is, where possible, to take account of the results from various models when discussing the range of possible values of R.
  • Be aware that some groups access information on coronavirus only by hearing the narrative about the latest estimates and are unable to see slides or graphical information. This places a responsibility on commentators to be clear and accurate in what they say.