Transparency: bringing the inside out

In our latest blog, Director General for Regulation Ed Humpherson discusses the divergence between internal positivity and external scepticism about analysis in Government, and how transparency is key to serving the public good…

Seen from within Government, these are positive times for analysis. There is an analysis function, headed by Sir Ian Diamond, which continues to support great, high-profile analytical work. There are strong professions, including economists, statisticians, operational researchers and social researchers, each with strong methods and clear professional standards. There is an Evaluation Task Force, which is doing great things to raise the profile of evaluation of policy. And data and analysis are emphasised by Ministers and civil service leaders like never before – exemplified by the 2023 One Big Thing training event focused on use of data and analysis in Government.

Yet the perspective from outside Government is quite different. The Public Administration and Constitutional Affairs Select Committee has been undertaking an inquiry into Transforming the UK’s Statistical Evidence Base. Several witnesses from outside Government who’ve given evidence, and some of the written evidence that has been provided, highlight concerns about the availability of analysis and how it’s used. In particular, witnesses questioned whether it’s clear what evidence sources inform policy decisions.

What explains this divergence between internal positivity and external scepticism?

In my view, and as I said in my own evidence before the Committee, it all comes down to transparency. By this I mean: the way in which analysis, undertaken by officials to inform Ministers, is made available to external users.

This is highly relevant to the Committee’s inquiry. A key question within the inquiry is the way in which external users can access analysis undertaken within Government.

These questions are very relevant to us in OSR. We have developed the principle of Intelligent Transparency. You can read more here, but in essence, Intelligent Transparency is about ensuring that, when Government makes statements using numbers to explain a policy and its implementation, it should make the underlying analysis available for all to see.

As I explained to the Committee, we make interventions when we see this principle not being upheld – for example, here and here. When we step in, departments always respond positively, and the analysts work with policy and communications colleagues to make the evidence available.

My basic proposition to the Committee was that the more Government can comply with this principle, the more the gap between the internal insight (there’s lots of good analysis) and the external perception (the analysis isn’t used or made available) will close. This commitment to transparency should be accompanied by openness – a willingness to answer questions raised by users, and a willingness to acknowledge the inherent limitations and uncertainties within a dataset.

In terms of what we do at OSR, I wouldn’t see any point, or value, in us going upstream to consider the quality of all the analysis that circulates within Government.

Our role is about public accessibility and public confidence – not about an internal quality assurance mechanism for economics, operational research, social research and other types of analysis undertaken in Government. We are not auditors of specific numbers (ie a particular figure from within a statistical series) – something we have to reiterate from time to time when a specific number becomes the focus of political debate. Nor do we have the resources or remit to do that. But we DO have both the capacity and framework to be able to support the appropriate, transparent release and communication of quantitative information.

This is the heartland of our work on statistics, and it’s completely applicable to, say, economic analysis of policy impacts, or evaluations of the impact of Government policy. There are good arrangements for the quality of economic analyses through the Government Economic Service (GES), and the quality of evaluations through the Evaluation Task Force (ETF); and similarly for the other disciplines that make up the Analysis Function. The ETF is a new kid on this particular block, and it is a great innovation, a new force for driving up the standards and openness of Government evaluations.

Where we add value is not in duplicating the GES, or ETF, or similar professional support structure within Government. Indeed, we already work in partnership with these sources of support and professional standards. Our expertise is in how this quantitative information is communicated in a way that can command public confidence.

In short, then, it really does come down to a question of transparency. As I said to the Committee, it’s like a garden in the early morning. Some of it is in the sunlight already, and some of it is still in shade. Gradually, we are seeing more and more of the lawn come into the sunlight – as the reach of transparency grows to the benefit of the public.

The success and potential evolution of the 5 Safes model of data access

In our latest blog, Ed Humpherson, Director General for Regulation, discusses the 5 Safes model as a key feature supporting data sharing and linkage…


In OSR’s data linkage report, we highlighted the key features of the data landscape that support data sharing and linkage. The 5 Safes model is one of those. Yet we also recommended that the 5 Safes model be reviewed. In this blog, I want to focus on one aspect of the model and set out the case for a subtle but important change.

The 5 Safes model is an approach to data use that has been adopted widely across the UK research community, and has also been used internationally. It is well-known and well-supported and has had a significant impact on data governance. It is, in short, a huge success story. (And for a short history, and really interesting analysis, see this journal article by Felix Ritchie and Elizabeth Green).

The 5 Safes are:

  • Safe data: data is treated to protect any confidentiality concerns.
  • Safe projects: research projects are approved by data owners for the public good.
  • Safe people: researchers are trained and authorised to use data safely.
  • Safe settings: a SecureLab environment prevents unauthorised use.
  • Safe outputs: outputs are screened and approved to ensure they are non-disclosive.

Any project that aims to use public sector administrative data for research purposes should be considered against the 5 Safes. The model therefore provides a criteria-based framework for assuring the appropriateness of a particular project.
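To make the criteria-based idea concrete, here is a minimal sketch – not drawn from the 5 Safes literature; the SafesAssessment structure and the example project are invented for illustration – of how a proposed project might be checked against the five dimensions:

```python
from dataclasses import dataclass, fields

@dataclass
class SafesAssessment:
    """One judgement per dimension of the 5 Safes (illustrative only)."""
    safe_data: bool      # data treated to protect confidentiality
    safe_projects: bool  # project approved by data owners for the public good
    safe_people: bool    # researchers trained and authorised
    safe_settings: bool  # secure environment prevents unauthorised use
    safe_outputs: bool   # outputs screened and non-disclosive

def unmet_criteria(assessment: SafesAssessment) -> list[str]:
    """Return the dimensions a proposed project has not yet satisfied."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]

# A hypothetical project with everything in place except output checking.
proposal = SafesAssessment(
    safe_data=True, safe_projects=True, safe_people=True,
    safe_settings=True, safe_outputs=False,
)
print(unmet_criteria(proposal))  # ['safe_outputs']
```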

OSR’s recommendations relevant to the 5 Safes:

In July 2023, OSR published our report on data sharing and linkage in government. We had a range of findings. I won’t spell them out here, but in short, we found a good deal of progress across Government, but some remaining barriers to data sharing and linkage. We argued that these barriers must be addressed to ensure that the good progress is maintained.

We made two recommendations relevant to the 5 Safes:

  • Recommendation 3: The Five Safes Framework – Since the Five Safes Framework was developed twenty years ago, new technologies to share and link data have been introduced and data linkage of increased complexity is occurring. As the Five Safes Framework is so widely used across data access platforms, we recommend that the UK Statistics Authority review the framework to consider whether there are any elements or supporting material that could be usefully updated.
  • Recommendation 10: Broader use cases for data – To support re-use of data where appropriate, those creating data sharing agreements should consider whether restricting data access to a specific use case is essential or whether researchers could be allowed to explore other beneficial use cases, aiming to broaden the use case where possible.

We made the recommendation about reviewing the framework because a range of stakeholders mentioned to us the potential for updating the 5 Safes model, in the light of an environment of ever-increasing data availability and ever-more powerful data processing and analysis tools.

And we made the recommendation about broader use cases because this was raised with us as an area of potential improvement.

The use of 5 Safes in research projects

What brings the two recommendations together is the 5 Safes idea of “safe projects”. This aspect of the model requires research projects to be approved by data owners (essentially, the organisations that collect and process the data) for the public good.

For many research activities, this project focus is absolutely ideal. It can identify how a project serves the public good, what benefits it is aiming to bring, and any risks it may entail. It will require the researcher to set out the variables in the data they wish to explore, and the relationships between those variables they want to test.

For some types of research, however, the strictures of focusing on a specific project can be limiting. For example, a researcher who wants to establish a link between wealth and some aspects of health may not know in advance which of the variables in a wealth dataset, and which of the variables in a health dataset, they wish to examine. Using the “safe project” framing, they might have to set out specific variables, only to discover that they are not the most relevant for their research. And then they might have to go back to the drawing board, seeking “safe project” approval for a different set of variables.

Our tentative suggestion is that a small change in focus might resolve these problems. If the approval processes focused on safe programmes, this would allow approval of a broad area of research – health and wealth data sets – without the painstaking need to renew applications for different variables within those datasets.

What I have set out here is, of course, very high level. It would need quite a lot of refinement.

Other expert views on the 5 Safes

Recognising this, I shared the idea with several people who’ve spent longer than me thinking about these issues. The points they made included:

  • Be careful about placing too much emphasis on the semantic difference between programmes and projects. What is a programme for one organisation or research group might be a project for another. More important is to establish clearly that broader research questions can be “safe”. Indeed, in the pandemic, projects on Covid analysis and on Local Spaces did go ahead with a broader-based question at their heart.
  • This approach could be enhanced if Data Owners and Controllers are proactive in setting out what they consider to be safe and unsafe uses of data. For example, they could publish any hard-line restrictions (“we won’t approve programmes unless they have the following criteria…”). Setting out hard lines might also help Data Owners and Controllers think about programmes of research rather than individual projects by focusing their attention on broader topics rather than specifics.
  • In addition, broadening the Safe Project criterion is not the only way to make it easier for researchers to develop their projects. Better metadata (which describe the characteristics of the data) and synthetic data (which provide artificial replicas of a dataset) can also help researchers clarify their research focus without needing to go through the approvals process – see the sketch after this list. There have already been some innovations in this area – for example, the Secure Research Service developed an exploratory route that allows researchers to access data before putting in a full research proposal – although it’s not clear to me how widely this option is taken up.
  • Another expert pointed out the importance of organisations that hold data being clear about what’s available. The MoJ Data First programme provides a good example of what can be achieved in this space – if you go to the Ministry of Justice: Data First – GOV.UK (www.gov.uk) you can see the data available in the Datasets section, including detailed information about what is in the data.
  • Professor Felix Ritchie of the University of the West of England, who has written extensively about data governance and the 5 Safes, highlighted for me that he sees increasing “well-intentioned, but poorly thought-through” pressure to prescribe research as tightly as possible. His work for the ESRC Future Data Services project sees a shift away from micro-managed projects as highly beneficial – after all, under the current model “the time risk to a researcher of needing a project variation strongly incentivises them to maximise the data request”.
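To illustrate the synthetic data point above – and nothing more – here is a deliberately naive sketch. The column names and values are invented, and real services such as the Secure Research Service use far more careful, disclosure-aware methods; resampling each column independently gives a researcher something with the right structure and rough marginal distributions, but no real records or cross-variable relationships.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# Stand-in for a confidential dataset held by a data owner (invented values).
real = pd.DataFrame({
    "age_band": rng.choice(["16-24", "25-49", "50-64", "65+"], size=1000),
    "net_wealth": rng.lognormal(mean=11, sigma=1, size=1000),
})

def naive_synthetic(df: pd.DataFrame, n: int, rng: np.random.Generator) -> pd.DataFrame:
    """Resample each column independently: marginal distributions are roughly
    preserved, but no real record (or cross-column relationship) is reproduced.
    Useful for exploring structure and drafting code, not for analysis."""
    return pd.DataFrame({
        col: rng.choice(df[col].to_numpy(), size=n, replace=True)
        for col in df.columns
    })

synthetic = naive_synthetic(real, n=1000, rng=rng)
print(synthetic.head())                                    # same columns, plausible values
print(synthetic["age_band"].value_counts(normalize=True))  # similar marginal shares
```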

More broadly, the senior leaders who are driving the ONS’s Integrated Data Service pointed out that the 5 Safes should not be seen as separate minimum standards. To a large extent, they should be seen as a set of controls that work in combination – the image of a graphic equaliser to balance the sound quality in a sound system is often given. Any shift to Safe Programmes should be seen in this context – as part of a comprehensive approach to data governance.

Let us know your thoughts

In short, there seems to be scope for exploring this idea further. Indeed, when I floated this idea as part of my keynote speech at the ADR UK conference in November, I got – well, not quite a rapturous reception, but at least some positive feedback.

And even if it’s a small change, of just one word, it is nevertheless a significant step to amend such a well-known and effective framework. So I offer up this suggestion as a starter for debate, as opposed to a concrete proposal for consultation.

Let me know what you think by contacting DG.Regulation@Statistics.gov.uk.

Producing, reviewing, and always evolving: UKHSA statistics

In our latest guest blog Helen Barugh, Head of Statistics Policy and User Engagement, discusses transforming the statistics produced by the UK Health Security Agency…

 

What is the Health Security Agency?

The UK Health Security Agency (UKHSA) is responsible for protecting every member of every community from the impact of infectious diseases, chemical, biological, radiological and nuclear incidents and other health threats. We are an executive agency of the Department for Health and Social Care.

We collect a wide range of surveillance data about diseases, from influenza and COVID-19 to E. coli and measles. We publish statistics related to planning, preventing and responding to external health threats. You can find our statistics here.

UKHSA was born in October 2021 during the COVID-19 pandemic. The pandemic had a significant impact on the organisation, including statistical production, and the repercussions of that are still being felt.

Producing and reviewing official statistics

I joined UKHSA in August 2022, one of four recruits to a new division supporting the statistics head of profession. Our division, which has a mix of statisticians at different grades, aims to transform UKHSA statistical production and dissemination. We have all produced statistics in other government departments, and we use that expertise to provide advice, guidance and practical support to all aspects of statistics production and dissemination. Our division also includes two content designers who actually publish UKHSA’s official statistics.

One of the most important parts of our work is a programme of reviews looking in-depth at each UKHSA official statistics publication. This is a big programme of work, encompassing around 35 statistical series covering a range of topics and including weekly reports right through to annual reports. The reviews aim to:

  • bring consistency to our statistics production and outputs.
  • improve efficiency and quality assurance through the adoption of reproducible analytical pipelines (RAP) in line with our RAP implementation plan.
  • improve compliance with the Code of Practice for Statistics.
  • embed user engagement as a regular and standard activity.

We are part of the way through this programme, having reviewed around 20 series and with another 15 to go. We expect to finish the reviews by late summer 2024.

How do we review our statistics?

Our reviews have three main phases.

  • Desk-based research includes assessing products against the Code of Practice for Statistics, assessing publications for accessibility and clarity, reviewing desk notes and analysing Google Analytics data to draw out insights about users.
  • Discovery work with the team explores the journey from data acquisition through to publication, understanding the processes used and the quality assurance in place. Sometimes we shadow a production cycle to really understand how the process works and how it can be improved. We also discuss user engagement with the team to investigate what they know about their users and how they assess any changing or emerging needs.
  • Once we have all the information we need, we write a report to summarise our findings and agree recommendations for improvement with the production team.

How do we make changes in practice?

We work with the production team, providing practical support, training and guidance as they implement the recommendations. We are aiming for incremental improvement, and the review provides a baseline against which we can measure success.

The reviews have given a terrific insight into the good practice within our organisation, as well as the challenges of producing some of our statistical products and the legacy processes that now need updating. Despite all the challenges of producing statistics during the pandemic, UKHSA statistics teams have been putting out very detailed and thorough statistics, in some cases on a weekly basis and with very short turnaround times between receiving data and publishing. Google Analytics indicates that, in general, readership is high, and user engagement so far has shown that products are highly valued and appreciated by users working in health protection and healthcare settings.

Areas for improvement are often similar across different statistical series. For example, production methods are not as reproducible as they could be. There are opportunities to introduce reproducible analytical pipelines (RAP) and build in automated quality assurance that will improve the efficiency and accuracy of production. We’ve also found that most outputs are aimed at a technical and clinical audience, which limits their impact for the general public and their contribution to the public good. As an organisation, we need to do more to understand the wider uses of our statistical products and adjust their presentation accordingly.

What difference are the reviews making?

One of the key benefits of the reviews has been building relationships between our division and UKHSA statistics teams. Our work only really has impact where teams get on board and are enthused to make positive changes. So I’m delighted about the impact our work is having, with statistics teams working hard to improve their products. For example, charts are being redesigned to conform with best practice, RAP is being implemented in some publications, our first quality and methodology information report has been published and most publications are now in HTML. It all feels very positive!

So, what next?

We have more reviews to do to make sure we have an accurate picture of all our official statistics, and lots of opportunities to support production teams to make improvements. We’re planning more user engagement and also participating in the cross-government user consultation on health and social care statistics, which will give us some valuable user feedback to help shape our statistics in the future.

We need to decide whether any publications designated on GOV.UK as ‘research and analysis’ should really be designated as official statistics. And as we bring about improvement right across the UKHSA statistics publications, we will be aiming for more products to be accredited by OSR to provide that stamp of approval that we’ve done the right things and are now meeting the highest standards of trustworthiness, quality and value.

If you’re interested in talking to us about our programme of reviews please do get in touch: UKHSA_HOPSTATS@ukhsa.gov.uk. We’re happy to share more about what we’ve learnt as well as the materials we’ve developed to support the review process.

You’re planning to do what? Statistics, resource constraints and user engagement

In our latest blog, Mark Pont, OSR’s Assessment Programme Lead, and Philip Wales, NISRA’s Chief Executive, discuss engaging with statistics users and how user input can help decision making…

In his recent blog post about keeping a statistical portfolio (and a garden) sustainable, Rob Kent-Smith described some principles to consider when balancing scarce resources across a portfolio of statistics. In this post, Mark Pont, head of our compliance programme, brings this to life a little, drawing on the recent experiences of NISRA – the Northern Ireland Statistics and Research Agency. Faced with some budgetary pressures, NISRA launched a consultation at the end of August 2023 and published a response just a few weeks ago.

The Code of Practice for Statistics talks about the need to ensure that statistics remain relevant to users. The need for statistics producers to engage with users to understand their evolving needs is an important element of providing value.  

A myth that we sometimes hear at OSR is that accredited official statistics (the new name for what the Statistics and Registration Service Act 2007 calls National Statistics) can’t be stopped. We also hear that statistical outputs can only be added to. But it is important to recognise that just because a set of statistics has been produced in the past, this doesn’t mean that it must continue to be produced in the same way, with the same periodicity, for evermore. Nor does the Code of Practice mandate a particular form of presentation or require extensive commentary, as long as users’ needs are being met. To carry on the gardening metaphor, sometimes plants need pruning or even removing to enable a garden to flourish.

It’s therefore right that all options – reducing scope or frequency, or ceasing altogether – be considered.  

It’s also really important to recognise that a formal public consultation can form an important part of gathering users’ views. But this is best done within the context of more proactive, ongoing engagement, particularly with key decision-makers.

It was therefore really good that Philip Wales, NISRA’s Chief Executive, contacted us to tell us about NISRA’s consultation. In the rest of this post, we ask Philip for some perspectives on how the consultation went, about how he went about engaging with users, and how their input helped his decision making.   

 

Mark: So Philip, first of all congratulations on the new role, which perhaps isn’t so new any more. How did it feel to be thrown straight into needing to make some tough decisions in the light of tight budgets? 

Philip: Thanks Mark – it’s been a challenging first ten months at NISRA, but I’ve really enjoyed it, and the time has flown by.  

You’re right to say that we faced – and continue to face – budgetary pressures at NISRA. Funding from our parent department will be around £1.9m lower in nominal terms this year compared to 2022-23. Because of inflation, that amounts to a real terms cut of close to 20% for our suite of economic, population and social statistics, not to mention our survey and data collection activities. 

To resolve this financial pressure, we’ve worked hard to find new sources of income, to move people into posts with dedicated funding, to manage our resources well, and to think hard about our suite of outputs – which brings us to the consultation exercise we ran.


Mark: How did you feel the consultation went? Was there anything that particularly pleased you about it, or its findings? 

Philip: The consultation we ran on our statistical outputs was an important part of managing our budgetary pressures. It gave us a chance to explain the financial context and communicate the pressures which NISRA is under to our users, and to talk about how we would manage them. It also encouraged us to think critically about the work we do and where we add the greatest value.

The consultation proposed changes to some of our planned outputs – either delaying them, scaling them back or suspending them – and enabled us to get feedback directly from our users.  

And on these terms, I think the consultation was a success. We had a large number of responses – from individuals, institutions, businesses and other organisations, as well as government departments – all of whom took the time to tell us that they really value the outputs we produce.  

From the feedback we got, we learned about where and how our releases are used, and we secured a better understanding of the potential impact of our proposed changes. Importantly, that feedback has helped to guide the changes we’re now making to our outputs.  


Mark: How has the consultation helped you to decide which activities and outputs to prioritise? And did you end up cutting back in the areas that you expected? 

Philip: The consultation helped us to work out how to minimise the impact of our proposals on our stakeholders.  

In lots of cases, users agreed that the changes we were proposing were the ‘least worst’ option available. Where we were combining outputs, or scaling them back to focus on the core headlines, users were understanding. Feedback also indicated that, in general, the outputs we were suspending were adding less value than our other activities: a sign we were focussing on the right things.  

Where we did meet real concern and resistance – particularly on some of our hospital infections releases and elements of our trade data suite – we listened. In these cases, we sought new and less resource-intensive ways of meeting these needs.

For me, this is the hallmark of a good consultation: asking people for feedback, listening, and then adjusting to account for their views.  


Mark: What were the most difficult parts of running the consultation? 

Philip: Well, it’ll be obvious that running a consultation like this – against a challenging financial background – isn’t a lot of fun!  

But I think the most difficult part of this process was the beginning. Sometimes, as producers, we can be a bit reluctant to ask the question ‘should I keep doing this?’ A bit like Rob said in his recent blog post, it’s easy to think that an output should continue simply ‘because it has for a long time’, or ‘because it’s an Accredited Statistic’. People are protective of their outputs and can often be anxious when changes are discussed.

I think this is a natural reaction, but it’s often an obstacle that we put in front of ourselves. The truth is that the skills of statisticians and analysts more broadly can be deployed in all kinds of really important ways across government, and we need to be thinking about how best to use those capabilities all the time. New datasets, new systems or new activities all mean there are new ways for skilled data analysts of all kinds to add value. In place of a guarded, defensive discussion, this lens really helped to promote the right kind of open discussion about outputs at NISRA.  


Mark: In conclusion, would you have any tips for others in a similar position? 

Philip: I think I’d give three short pieces of advice to someone undertaking an exercise like this one.  

Firstly, always keep in mind that making changes to an output isn’t a reflection on the people doing the work. Try and have an open, respectful conversation which captures the value that they can provide, recognising that sometimes less can be more.   

Second, trust and listen to one another. Changes like these are more likely to stick and less likely to have long term impacts on morale if it’s clear that you are doing this as part of a group.  

And third, trust your users. Listen to what they have to say and leave room to adjust your plans if you get unexpected feedback.  

If you want more advice about engaging with users as part of prioritisation exercises, please do contact us. 

NHS England guest blog: Mental health data quality

In our latest guest blog Gary Childs, Head of Analytical Delivery at NHS England, discusses steps NHS England has been taking in recent years to improve the quality of mental health data…

Context 

Within NHS England we have been striving to improve the quality of mental health data for a number of years. In this blog we will focus on the Mental Health Services Dataset (MHSDS), but the methodologies and principles apply equally to datasets reporting on topics such as NHS Talking Therapies or Learning Disabilities and Autism (LDA).  

Clinical Data in National Collections

Charles Babbage once said: “Errors using inadequate data are much less than those using no data at all”. However, there must be a threshold, as poor-quality data can lead to inaccurate analytics and bad decisions, and in health, can have an impact on patient care. The Government Data Quality Hub states that: “Good quality data is data that is fit for purpose. That means the data needs to be good enough to support the outcomes it is being used for”.

In health, one would automatically assume that data would be of the highest quality as it is captured in clinical and operational systems for the purpose of direct patient care, and that assumption is probably true. The problem comes when you need to use this data to get a national picture of performance and for secondary uses. 

This requires data to be extracted from these clinical and operational systems in a standardised format for aggregation within a national setting. Taking mental health as an example, there are probably over 500 providers (many being relatively small) of mental health services (excluding NHS Talking Therapies); the actual number is unknown. Of those providers, we have identified at least 30 distinct IT systems in use (such as SystmOne, Epic and RiO), as well as many in-house systems. The data within these systems is held in differing structures and formats and key information will be captured as free text. 

Creating the national collection for mental health (the MHSDS) requires providers to make monthly record-level submissions to NHS England. The provider must use the technical output specification (TOS) and user guidance to understand the scope and definition of each data item to be submitted. In addition, they have to familiarise themselves with the MHSDS intermediate database to understand how data items are grouped for the data submission file. To achieve this, providers have to carry out a ‘data mapping exercise’ to understand how well their existing systems align to the MHSDS TOS and take appropriate action to ensure that the standard is fully met. As mental health is a multifaceted service covering many policy areas, the data is complex and the submission process can be arduous, especially for smaller providers.
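As a rough illustration of the kind of mapping exercise described above – the data item codes, names and local field names below are invented, and the real TOS is far larger – a provider might compare the fields its local system captures against the required items and flag the gaps:

```python
# Hypothetical subset of data items a technical output specification might require.
required_items = {
    "MHS001": "NHS number",
    "MHS002": "Person birth date",
    "MHS101": "Referral received date",
    "MHS501": "Care contact date",
}

# Hypothetical mapping from a provider's local system fields to those items.
local_field_mapping = {
    "patient_nhs_no": "MHS001",
    "dob": "MHS002",
    "referral_received": "MHS101",
    # nothing currently maps to MHS501
}

mapped = set(local_field_mapping.values())
gaps = {code: name for code, name in required_items.items() if code not in mapped}
print("Data items with no local source:", gaps)
# {'MHS501': 'Care contact date'} -> action needed before the standard is fully met
```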

Supporting Providers to Submit Data

A key focus has been on increasing the number of mental health service providers submitting to the MHSDS, which now stands at over 370 providers submitting each month, covering an estimated 99%+ of all NHS-funded activity. This improvement from 85 providers in 2016 has been achieved through a variety of initiatives.

Understanding who should be submitting is key; once they are submitting, so is knowing what services they provide and therefore what data they should be submitting. A Master Provider List is maintained that identifies all known providers in scope of the MHSDS together with their submission behaviours. Data submissions are tracked throughout the submission window, and this tracking, together with historic submission behaviours, results in tailored communications being sent to providers to encourage more positive behaviours.

Ensuring All Data is Submitted

In collaboration with the CQC, regional leads and providers, it has been possible to identify the services that are being delivered by most providers. This has allowed us to assess whether providers are submitting all relevant data. This has been a particular problem with Independent Sector Service Providers (where they are delivering NHS funded services) and has required the intervention of DHSC and Health Ministers. 

Improving the Quality of Data

Once providers are submitting data across the service lines that they provide, we can assess the quality of that data and support providers to improve it. This starts with self-service tools: at the point of submission, providers receive a line-by-line data quality assessment. This can be a daunting report, hence a Validation and Rejection Submission Tool was developed that converts the record-level submission report into an easy-to-understand summary of the issues, with instructions on how to fix them.
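The rule codes, messages and guidance below are invented, but a sketch of that roll-up idea – turning a record-level report into a per-issue summary with fix instructions – might look something like this:

```python
from collections import Counter

# Hypothetical record-level validation output: one entry per failing record.
record_level_report = [
    {"record_id": 101, "rule": "DOB_MISSING", "message": "Person birth date absent"},
    {"record_id": 102, "rule": "DOB_MISSING", "message": "Person birth date absent"},
    {"record_id": 103, "rule": "INVALID_SNOMED", "message": "Code not in reference set"},
    {"record_id": 104, "rule": "DOB_MISSING", "message": "Person birth date absent"},
]

# Invented guidance text keyed by rule, shown alongside the counts.
fix_guidance = {
    "DOB_MISSING": "Check the demographic extract includes date of birth.",
    "INVALID_SNOMED": "Map local codes to the published SNOMED reference set.",
}

summary = Counter(entry["rule"] for entry in record_level_report)
for rule, count in summary.most_common():
    print(f"{rule}: {count} records affected - {fix_guidance.get(rule, '')}")
```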

Providers can resubmit the data as many times as they want within the submission window, improving the data each time. However, there are occasions where data issues are identified after the submission window has closed; these can affect the quality of the data for that month but also have a knock-on effect on future monitoring, such as for 3- or 12-month rolling metrics. To address this issue, a multiple submission window model (MSWM) was implemented to allow providers to address data quality issues throughout the financial year. Use of the MSWM is closely monitored and reported upon to avoid abuse of the facility, as it should be a last resort.
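A toy example, with invented monthly counts, shows why those late corrections matter for rolling measures: changing one month’s figure changes every rolling value whose window includes that month.

```python
import pandas as pd

# Invented monthly referral counts from one provider.
months = pd.period_range("2023-01", periods=6, freq="M")
counts = pd.Series([100, 110, 95, 105, 120, 115], index=months)

rolling_3m = counts.rolling(window=3).sum()

# A correction to March discovered after the submission window has closed
# changes not just March's figure but every rolling value that includes March.
corrected = counts.copy()
corrected[pd.Period("2023-03", freq="M")] = 130
print(corrected.rolling(window=3).sum() - rolling_3m)  # non-zero for Mar, Apr and May
```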

To illustrate the quality and compliance of the data, it is surfaced within a data quality dashboard that reports on the quality of each data item submitted by each provider, allowing for comparisons at a provider, regional and system supplier level. In addition, to promote the compliant use of SNOMED (a structured clinical vocabulary for use in electronic health records), relevant data items are reported upon within a SNOMED dashboard. The dashboard assesses how much SNOMED is flowing to MHSDS, to which data items and tables, and by which providers; there is also a focus on correctly identifying procedures and assessments.

To reinforce these tools, providers receive an automated data quality report by e-mail each month. These reports summarise the key issues with a provider’s data and strategies to fix them. Providers can also access policy specific guidance in the form of workshops, webinars and documentation. This has previously focussed on topics such as eating disorders, restrictive interventions, problem gambling, perinatal and maternal services and memory clinics. In addition, providers have received questionnaires to better understand where they need more support. 

Talking About Data Quality

While all these tools facilitate better data, it is the direct engagement with providers by the data liaison team that can have the biggest impact. The data quality analysis identifies the providers experiencing the biggest challenges, and the data liaison team uses this to provide tailored support on a one-to-one basis. This was particularly successful during the recent Advanced cyber incident, which impacted the data of a variety of mental health service providers for almost 9 months.

Next Steps in Improving Mental Health Data

At first sight all these solutions may seem excessive. However, the data is the foundation for decisions relating to commissioning, service improvement and service design; it supports research and innovation, and it helps us understand the impact of mental health care on patient outcomes and experiences. Through improvements in data quality, we have been able to close several duplicate collections and are now able to move to a single window for data submission. This will soon allow insights to be delivered a whole month earlier, making decisions more relevant and timely.

At the start of this blog, I stated that the data we are referring to is secondary uses data, but that data did originally come from clinical and operational systems that are used for direct patient care. As we know that this data is of a higher quality than that within MHSDS, we must find a way to mitigate the degradation in quality that we are currently seeing. Initiatives of this nature are currently being explored within NHS England in the hope that we can improve data quality further, make the data even timelier, as well as easier to collect. 

The public good of statistics – narratives from around the world

In our latest guest blog, Ken Roy, independent researcher and former Head of Profession, reflects on the narratives from international statistical systems, linking them to the public good that statistics can serve…

I really welcome the decision of the Office for Statistics Regulation (OSR) to pursue research and engagement work on the ‘public good’ of statistics. It feels sensible and timely to explore this concept that is (legislatively) at the heart of the UK’s Official Statistics system.

Last year’s report by the OSR in collaboration with Administrative Data Research UK (ADR UK) captured views on this topic from members of the public – exploring the potential public good arising from the use of administrative data in research. This provided some really helpful insights – with participants framing ‘public good’ both in terms of processes (e.g. appropriate data protection) and of tangible outcomes (for people and for communities).

That focus on tangible outcomes links to my fascination with the narratives used to describe the positive impacts of statistics, specifically of Official Statistics – and to my research interest in the future choices that Official Statistics systems might need to make (including about what and who they are for).

Narratives from official statistics systems

I have been looking at some of the narratives used by bodies producing Official Statistics – specifically those in a sample of recent strategies and business plans from different National Statistical Offices. Inevitably these documents focus on planned programmes of work – the key statistical outputs, the technical and methodological investments etc – and occasionally on interesting things like budgets.

When these documents touch on the rationale for (or purpose of) Official Statistics, one approach is to present Official Statistics as a ‘right’ of citizens or as essential national infrastructure. For example Statistics Finland frame Official Statistics as “our shared national capital”. A further common approach is to reference the broad purpose of improved decision making – Statistics Canada has the aim that “Canadians have the key information they need to make evidence-based decisions.”

Looking beyond these high-level statements, I was keen to find any further, more specific, expressions of real-world impacts. The following sets out some initial groups of ideas and some representative quotes.

In terms of direct impacts for citizens, some strategies have a headline aim that citizens are knowledgeable about their world – Statistics Iceland aims to enable an “informed society”. A slightly different ambition is that different groups of citizens are represented or ‘seen’ by Official Statistics. The UK Statistics Authority aims to “reflect the experiences of everyone in our society so that everyone counts, and is counted, and no one is forgotten”. There are also references to the role of Official Statistics (and data more broadly) in empowering citizens – most commonly through giving them the means to hold government to account. One of the headline aims of New Zealand’s Data Investment Plan is that “government is held to account through a robust and transparent data system”.

Also relevant to citizens is the ambition for Official Statistics to enable healthy, informed public debate – one aim of the Australian Bureau of Statistics is that their work will “provide reliable information on a range of matters critical to public debate”.

Some narratives hint at the contribution of Official Statistics systems to national economic success. Stats NZ notes that “the integrity of official data can have wide-ranging implications … such as the interest charged on government borrowing.” The Papua New Guinea statistics office references a focus on “private sector investors who want to use data and statistics to aid investment decisions”.

Finally, we come to governments. Official Statistics are regularly presented as essential to a better, more effective, government process – through establishing understanding of the circumstances and needs of citizens, businesses and places and hence supporting the development and implementation of better policies, programmes and services in response. The National Bureau of Statistics (Tanzania) sees Official Statistics as enabling “evidence-based formulation, planning, monitoring and evaluation which are key in the realization of development aspirations.” A related theme is the contribution to good governance – the United Nations presents Official Statistics as “an essential element of the accountability of governments and public bodies to the public in a democratic society.”

Some reflections

It has been illuminating and enjoyable (honest!) to scan a small sample of corporate documents for ideas about the impacts of Official Statistics, recognising (as you will find if you click any links) that this is a bit of a mining exercise. A more rigorous exercise would be able to better account for the various factors (including administrative and cultural norms, and language) that shape what is included (and what is not) in these sources.

There are clearly common themes across these documents – I have not attempted to create any ranking of key phrases but I suspect that ‘evidence-based decision-making’ might come top. Where documents do go beyond universal purposes, there are some interesting ideas to build on. For example, can we better articulate (or quantify) the public good that might come from different groups of citizens being better represented within Official Statistics or from public debate being better informed?

One challenge, however, is that when more detailed impacts are considered, we start to enter the world of priorities and trade-offs. Hence public bodies producing Official Statistics seem to have to find a balance between setting out high-level outcomes and stating more specific ambitions.

From the documents that I have looked at, no National Statistical Office has been brave enough to commit to delivering a full outcome set of knowledgeable, represented, empowered citizens, healthy public debate, economic success and effective government (demonstrating good governance).

All of which suggests that expressing the public good of Official Statistics is complicated and that there are different approaches that might be adopted or evolved. In a UK context, it is therefore encouraging that gathering further evidence on this topic remains a priority area within the OSR’s Areas of Research Interest.

If tangible outcomes are to be part of that continuing conversation, then it might be worth trying to make some sort of connection between the sorts of narratives referenced above and the views of members of the public. It would be interesting to know whether the participants in the original OSR and ADR UK research work had any of these Strategy or Corporate Plan derived ideas in mind when they said that they wanted to see tangible impacts from the use of statistics. Perhaps we should ask them.

Keeping your statistics (and your garden) sustainable

In our latest blog, Rob Kent-Smith, Deputy Head of OSR, discusses what statistics producers should think about when reviewing their outputs…

As Autumn has arrived, the keen gardeners amongst us will be pruning and thinning out fruit bushes and perennials to make sure that our gardens remain sustainable and to promote fresh growth in the seasons ahead.

There are some parallels in the world of statistical production: it is important that the range of statistics produced is sustainable, meets the needs of users and has room for growth to respond to emerging topical needs. The Code of Practice for Statistics provides a framework to do this and gives flexibility and support to make the right decisions in the right way.

Our state of the statistical system report found many good examples of producers developing new statistics and analysis to respond to the topics of the day, with recent topics including the cost of living and data on those seeking sanctuary in the UK from the war in Ukraine.

But additional work means additional effort and resources, and we are in a time where many statistics producers are facing tight budgets. To maintain this level of responsiveness it is more important than ever to review how statistics best meet user needs.

Here are five things producers should think about when reviewing their outputs:

1. The needs of users should be front and centre.

Producers regularly engage with their users; this means they know what matters to users and can make informed decisions on what statistics meet their needs. Producers are sometimes required to reprioritise and make decisions about their outputs at pace, so it is important to ensure evidence and documentation on user needs are captured and continually updated. For some decisions, this may be sufficient. However, we know the range of users some producers engage with could be wider, so producers should supplement their existing intelligence with metrics they have on usage and impact. They should also consider whether additional engagement and consultation is appropriate, particularly when proposed changes are significant or there is a wider user base whose needs are not fully understood. Involving users in the process can help secure buy-in and support. It also enables users to understand the pressures facing producers to meet wide-ranging needs, assist with making hard choices and identify options to mitigate any data gaps that may result. It is not always possible to meet the needs of all users; in these cases, it is important to explain which users’ needs are being prioritised, which needs cannot be met and why. Our guidance on user engagement will be helpful to support you.

2. The same principles should apply to different types of statistics.

OSR introduced the term ‘Accredited Official Statistics’ to describe National Statistics in September 2023 – you can find out more about this change. Some producers have told us that they interpret accredited official statistics as being more important than official statistics or official statistics in development, and that accredited statistics should be prioritised when considering changes to outputs. However, decisions on prioritising statistics should be based on maximising the value of the statistics to users and in turn the public good, regardless of their badging. It might be that a new statistic in development could be a higher priority and more valuable to users than a long-established accredited official statistic.

3. Increasing efficiency or reducing content can be a good alternative to stopping outputs.

The Government Analysis Function Reproducible Analytical Pipeline (RAP) strategy outlines an approach to improve efficiency in the production and quality assurance of statistics and analysis, alongside other benefits. Alternatively, reducing the frequency, detail or supporting material of a statistical release can enable savings while better meeting user needs; this could be a good alternative to stopping a statistical release. In making these decisions, ensuring the continued quality of the output is fundamental.
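As a minimal, illustrative sketch only – the file names, column names and checks are invented, and the RAP strategy itself covers much more (version control, peer review, dependency management) – a RAP-style production step with automated quality assurance built in might look like this:

```python
import pandas as pd

def load_inputs(path: str) -> pd.DataFrame:
    """Read the raw extract with code rather than manual spreadsheet steps,
    so the same run can be repeated exactly next cycle."""
    return pd.read_csv(path)

def quality_checks(df: pd.DataFrame) -> None:
    """Automated QA that runs on every production cycle; failures stop the run."""
    assert not df["period"].isna().any(), "Missing reporting periods"
    assert (df["value"] >= 0).all(), "Negative values in a count series"
    assert not df.duplicated(["period", "breakdown"]).any(), "Duplicate rows"

def build_output(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the published table from the validated input."""
    return df.groupby("period", as_index=False)["value"].sum()

if __name__ == "__main__":
    data = load_inputs("raw_extract.csv")  # hypothetical input file
    quality_checks(data)
    build_output(data).to_csv("published_table.csv", index=False)
```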

4. Communication of any changes should be clear.

It is important that decisions on the future production of statistics are clearly communicated to users in advance. These communications should set out the changes users can expect, the rationale for those changes, how the decisions have been arrived at and how users can engage with producers to feed back on the approach. Producers should include information on relevant statistical releases and inform known users through existing communication channels. For larger or more significant changes, producers should consider a public statement that sets out their plans.

5. Get in touch with us.

Regularly reviewing your range of outputs to ensure they best meet user needs is an important part of serving the public good, and we are here to provide guidance and support on how these changes can best be made in line with the Code of Practice for Statistics.

In line with our State of the Statistical System report, we will be doing more work in this area, aiming to improve the accessibility of our work and guidance on reducing outputs. But in the meantime, don’t forget to prune the garden!

The sociotechnical nature of data sharing and linkage

In our latest blog, Head of Development and Impact Helen Miller-Bakewell, and Director General for Regulation Ed Humpherson discuss the state and sociotechnical elements of data sharing and linkage across government… 

We published our report on data sharing and linkage in July. The report highlighted that data sharing and linkage in government is at a crossroads. Not all the routes from the crossroads lead to positive outcomes. The report set out 16 recommendations to ensure that the positive path – data sharing for the public good – is the one that is taken.

In highlighting this position, we were not taking a purely technical approach. We were not looking just at the technical and legal underpinnings of data sharing and linkage and assessing progress against them. We were not just looking at data standards and architecture. 

For sure, technical and legal issues do appear in the report, and getting them right is a key enabler. 

But the crossroads analogy comes from the broader perspective the report brings to data sharing and linkage. In the report, we consider four possible ‘future scenarios’ for data sharing and linkage, set five years from now. They are not predictions but stylised versions of possible futures, which help to bring out the impact on public good of acting on (or not acting on) the current barriers that exist to data sharing and linkage. 

Because we wanted to look at data sharing and linkage not just in a technical way but in its broader social context, each scenario explicitly considers public understanding and buy-in to the benefits of data being shared and linked. And our future scenarios feature personas – people who work with data inside and outside government – and how the different scenarios impact on them and the things they are interested in. 

In doing so, we consciously adopted a sociotechnical approach. It has helped us to demonstrate that public engagement and social licence are intrinsically linked to the ability for linked data to serve the public good. Without public support, data are less likely to fulfil their potential.  

This term – sociotechnical – first came up in our work through the Ada Lovelace Institute. When we published our report on exam algorithms, Carly Kind of the ALI tweeted that she liked OSR’s ‘sociotechnical’ approach. At that time, it was not something we incorporated into our work. 

However, being generally curious by nature, in OSR we started to wonder what lay behind the idea of the sociotechnical.

We learnt that the sociotechnical denotes an interest in how technology, and technical issues more broadly, both influence and then in turn are influenced by how people behave, respond and think about them. I picture it as a sort of dance between the social – the human and community aspect of something – and the technical – which is the abstraction of something into a measurement or calculation structure of some kind. 

The more we looked into this, the more we realised that we have always worked in a sociotechnical way. We care about how people use statistics – that is what confers value. We care about the trustworthiness of the organisational processes, which is about human issues like the role of the head of profession. And we care about quality, not in an abstract computational sense, but in terms of whether the statistics provide users with a good estimate of what they are aiming to measure.

For the data sharing and linkage work, we established a sociotechnical panel of advisors – people who we’ve met or worked with who have a good handle on this interface of the technical and the social. They were Nick Pearce, Brian Rappert, Rachel Coldicutt, Jessica Davies and Sabina Leonelli.

The panel was invaluable. They challenged the way we thought about public engagement. Our early thinking was too fixed. We talked of “engaging with the public”, as if there was a single public and a single best way to track public approval. The panel reminded us that the public is not homogenous and counselled us to advocate for public engagement that is targeted, based on the intended outcomes of projects and the demographics of the population that are most likely to be affected. And they helped expand our thinking on the connection between public willingness to share data and their trust in different types of organisation. The more subtle focus on the ever-evolving, context-dependent social licence for data sharing in the report came from the advice of this panel. 

Finally, it is not just the social context beyond government that can impact data sharing and linkage. Culture and people within government can be key determinants of progress. During our review we heard that, at every step of the pathway to share and link data, the people involved are instrumental in determining whether projects succeed or fail. We heard examples of departmental barriers becoming unblocked when new people arrive, showing how many barriers can be overcome simply with new motivation, knowledge or skills.

Reflecting on the report, we set out to write about data sharing and linkage – a potentially dry and technical topic – and it ended up being more about people than technical architecture. And, in fact, we find this is a common pattern across almost all our work.  

The final report, then, is an important diagnosis of the state of data sharing and linkage in government. It is also the fullest expression yet of the sociotechnical angle to our work.

 

Related Links:

Data Sharing and Linkage for the Public Good

Shining the spotlight on the quality of economic statistics

In our latest blog, Statistics Regulator Emily Carless discusses the importance of quality in economic statistics…

It’s an exciting time to be a regulator of economic statistics in the UK. The economic statistics landscape is changing, with more new and innovative data available than ever before. The challenges presented by increases in the cost of living have also put greater attention on economic data.  The regulatory landscape has also changed: the UK’s departure from the EU means the European statistical office will no longer have a role in verifying the quality of UK statistics. This has created an opportunity for us, as the independent regulator of official statistics in the UK, to develop a new programme to provide continued assurance to users on the quality of economic statistics.

Since March this year I have been leading this programme, which we are calling the ‘Spotlight on Quality: Assuring Confidence in Economic Statistics’.  As part of this programme, we are carrying out a series of quality-focused assessments. To help us do this, we are developing a quality assessment framework. We started with the Quality pillar of our Code of Practice for Statistics, and have adapted it by taking relevant practices from other international frameworks to enable a deeper dive on the many elements of quality in economic statistics.

Earlier this summer we published the report of the first pilot assessment on Producer Price Inflation statistics, which are produced by the Office for National Statistics (ONS). We gathered feedback on the assessment and the programme more generally to get insight into how we can maximise the benefits of the programme. My colleague Job led the first pilot assessment and has this to say:

“ONS’s Producer Price Indices (PPIs) were an excellent candidate for the first quality-focused assessment. The new quality assessment framework that we developed helped us investigate all aspects of quality in greater detail than we normally would in an assessment. We’re continuing to refine the framework to make it an even more effective tool. One thing we trialled with this assessment, which worked particularly well, is speaking to the international statistical community about the UK’s statistics. Conversations with price statistics experts in other National Statistical Institutes gave us an insight into the extent to which ONS is following international best practice in producing PPIs. We’ll look to replicate this approach in other quality-focused assessments.”

We received positive feedback about our quality-focused assessments from the experts we spoke to in other NSIs. Espen Kristiansen of Statistics Norway commented on the benefits of our programme to the international community:

“It is very valuable to have an external review of statistics. It reminds us how we can frame a discussion about quality, which for statistics is a surprisingly complex and many-faceted concept. Not only can the review detect flaws, it also makes us better at continuously improving our statistics in the future.”

Users of economic statistics in the UK have also found our quality-focused assessment on the PPIs useful. Cheryl Blake from the Ministry of Defence said:

“We welcome the findings of the OSR report on the quality of the PPIs and the opportunity to provide user feedback, particularly the recommendations to publish weights and standard errors which will greatly help our team better understand the underlying drivers of the indices and their quality. This is of growing importance as there is growing interest across Government in using industry-specific PPIs for indexing contracts to better track prices and provide a fairer return to industry. Further information on sample size and resultant quality will help inform index selection and we hope to work with ONS regarding the rationalising on index publication to ensure the provision of critical indices for indexing Government contracts.”

In carrying out these assessments we aim to be supportive of the producers of economic statistics: to champion where they have improved the quality of their statistics, as well as to challenge where further improvement is needed. Chris Jenkins, Assistant Deputy Director for Prices division in ONS, highlighted the benefits of these assessments to the producer team:

“The regular churn of our monthly statistics sometimes means we don’t get the opportunity to review the methods and metadata that supports our production process as frequently as we would like. This assessment provided us with the perfect opportunity to take stock of what we do, and the constructive support from OSR highlighted areas where we are doing really well, but also areas where we need to make quality improvements. Having the assessment now gives us a clear set of actions we can take to improve the quality of PPI, which is a key economic indicator.”

The outcomes from our quality-focused assessments also provide useful information on where the quality of other statistics could be improved. We will work with statistics producers and the Statistical Heads of Professions to share good practice and shine a light on where the quality of statistics more broadly can be improved. Rachel Skentelbery, ONS Deputy Head of Profession for Statistics, said:

“We welcome OSR’s new quality-focused assessment programme which complements work we are leading to support ONS colleagues in assessing and improving the quality of statistics. Many of the findings identified in these assessments will be relevant beyond the specific output being reviewed, for example in areas such as RAP [reproducible analytical pipelines], quality assurance, user needs and transparency, and we look forward to working together to share lessons and promote best practice to drive improvements to quality across the organisation.”

Over the next few months we will publish the report of our second pilot quality-focused assessment, on the Profitability of UK companies and Gross Operating Surplus of private non-financial corporations, along with the quality assessment framework we have developed for use in these assessments.

 

If you are interested in learning more about this programme or have any feedback on our first report then please get in touch by emailing regulation@statistics.gov.uk.

How do people use official statistics to make decisions?

Sofi Nickson, Head of Research at OSR, shares why OSR is interested in the role statistics play in decision making by members of the public, along with what we know about this so far, and invites others to share any evidence they have on the topic.

When I first heard about the Office for Statistics Regulation (OSR), I assumed it simply checked whether statistics are accurate or not. It wasn’t until last year, when I happened upon a job advert for OSR, that I looked into what it really does. It turned out that I was somewhat off the mark in my assumption – OSR’s vision is not, as it happens, limited to ‘accurate statistics’, but the far more inspiring ‘statistics that serve the public good’. For the past few years, colleagues across OSR have deepened their understanding of what this may look like through their regulatory work and supporting research programme, which I am now lucky enough to lead. Part of the research programme is understanding the role official statistics can play in decision making by members of the public. In this blog post, I explain why we are interested in this and what we know so far, and then invite you to share your thoughts on the topic.

Statistics serving the public good

The Statistics and Registration Service Act states that serving the public good includes assisting in the development and evaluation of public policy, and informing the public about social and economic matters. ‘The public’ here could be anyone outside of government. In fact, a report from workshops on whether scientists understand the public states that ‘there’s a thousand publics out there that one could address, any of whom has to be understood… in order to know how to deal with them, how to work with them, engage them, try to benefit them’. We have begun to understand how some publics play a role in statistics serving the public good: non-governmental organisations may use statistics to provide services, businesses can use them to adapt their practices and better meet needs, and evidence from analysing applications to access public data suggests that academics see themselves as providing an evidence base for decision making. Even the media plays a part, with ESCoE research finding that journalists help translate statistics for public consumption.

The point here is that for statistics to serve the public good, they must be a tool both for government and for those beyond it. Looking at statistics use outside of government, we have heard through our regulatory activities from a wide range of civil society organisations about how they use statistics on behalf of the public. This is one way for statistics to serve the wider public indirectly, without individuals using statistics themselves. However, we are currently missing an important piece of the puzzle – what use looks like for individual citizens outside of such organisations. This family- or individual-level use of statistics is far less visible, but may be no less valuable in serving the public good. In OSR we want to shine a light on these hidden uses of statistics by exploring how individual citizens use statistics in their professional and personal lives, and what they value in statistics.

The more we can understand, the better we can ensure our regulatory decisions and recommendations support members of the public who use statistics. This topic is vast, so to narrow it down we are starting with how individual citizens use official statistics to make decisions, focussing on three areas:

  1. Whether members of the public find value in using statistics to make decisions (including whether they do use statistics at all)
  2. Whether members of the public feel equipped to use statistics to make decisions in the way they are currently presented
  3. How statistics inform decisions (including how much influence they have alongside other factors).

Do members of the public find value in using statistics to make decisions (and do they use statistics at all)?

A quick search on using statistics to make decisions turns up lots of potential examples – using statistics on school performance to inform where you want to educate your child, for instance, or using statistics on crime to decide where to live. But we currently lack evidence about whether and how these potential uses play out in real life: do people actually use statistics like this, or do we just think they could? Even more specifically, do people use official statistics, or are their information needs being met by other sources?

In the Public Confidence in Official Statistics 2021 survey, just over half of respondents (52%) agreed to some degree that statistics helped them to make a decision about their lives. However, we don’t know from this what sort of statistics or decisions respondents were thinking about. We also don’t know what might help the other 48% of respondents to get value from using statistics, or even whether this 48% want to use statistics at all. It may be that individuals are already satisfied with organisations in civil society using statistics on their behalf.

Do members of the public feel equipped to use statistics to make decisions in the way they are currently presented?

Our commissioned research into statistical literacy showed great variability amongst the general public in the skills linked to statistical literacy, and we have concluded that responding to this is all about communication. We have evidence on how to communicate statistics to non-specialists, for example recommendations from a programme of work by ESCoE which explores communicating economic statistics to the public. Despite these strong recommendations, there is more that could be done to improve the communication of statistics to non-expert users, which is why in our business plan for 2023/24 we commit to championing the effective communication of statistics to support society’s key information needs. We don’t profess to know everything in this area, though, and are always interested in learning more.

How do statistics inform decisions (and how much influence do they have among other factors)?

We have uncovered an abundance of literature about human decision making, including how heuristics and biases sit alongside ‘rational’ evidence-based choices. From this we recognise that it is unlikely anyone bases their decisions on statistics alone, but we still don’t know how influential official statistics are and where they sit alongside other evidence. Are they seen as compelling and trustworthy? What factors influence this?

Can you help?

If you have read this far, it will be clear that we have a lot of questions about how statistics can serve the public good. In OSR, asking questions is in our nature – as a regulator, our judgements and decisions are informed by the evidence we have, so we are always seeking to learn more. If you know of any research, examples or information that you think could inform our understanding of the role official statistics play in how members of the public make decisions, then we would love to hear from you – please get in touch with us at research.function@statistics.gov.uk.