In our latest blog, our Head of Data and Methods discusses the benefits and risks of AI in official statistics, and outlines OSR’s strategy for AI in the year to come…


Artificial Intelligence (AI) has quickly become an area of interest and concern, catalysed by the launch of user-friendly models such as ChatGPT. While AI appears to offer great opportunity, increased interest and adoption by different sectors has highlighted issues that emphasise the need for caution.

OSR is interested in actual and potential application of AI for production and use of official statistics, and to our own regulatory work, in the context of wider government use of AI. All Civil Service work must abide by the Civil Service Code to demonstrate integrity, honesty, objectivity and impartiality. Additionally, statistical outputs should follow the Code of Practice for Statistics by offering public value, being of high quality and from trustworthy sources. While AI models offer opportunities worth exploring, these need to be considered alongside risks, to inform an approach to use and regulation of use that is in line with government standards and supports public confidence.

This blog post outlines our current position on AI and our plans for monitoring and acting on AI opportunities and risks in 2024.


The benefits of AI

AI models can quickly analyse large volumes of data and return results in a variety of formats, although, at the time of publishing this post, we are not aware of any examples of AI being used to produce official statistical outputs. There are, however, feasibility studies being undertaken by the Office for National Statistics (ONS) and other Government Departments to support statistical publications, and some examples of AI use in operational research, such as:

  • improving searchability of statistics on their respective websites,
  • production of non-technical summaries,
  • recoding occupational classification based on job descriptions and tasks, and
  • automatically generating code to replace legacy statistical methods.

Risks of AI

While the potential benefits for AI use in official statistics are high, there are risks that warrant application of the Code of Practice for Statistics pillars of trustworthiness, quality and value.

Trustworthiness

There is concern around how AI might be used by malicious external agents to undermine public trust in statistics and government. Concerns include:

  • Promoting misinformation campaigns ranging from targeted advertising to generated blog posts and articles, up to generated video and audio content depicting senior leaders such as Rishi Sunak and Volodymyr Zelenskiy.
  • Flooding social media and causing confusion around political issues such as general elections. AI could also be used to generate more Freedom of Information (FOI) or regulation requests than a department can feasibly handle, thus causing backlogs or losing legitimate requests in the chaos.
  • AI-generated hallucinations (which occur when a generative AI tool produces outputs that are nonsensical or inaccurate) presenting incorrect information or advice that might, at best, raise questions about how public sector organisations use personal data and, at worst, open public sector bodies up to legal action.

Quality

Significant concerns have been raised regarding AI model accuracy and potential biases introduced via their training data, as well as data protection of open, cloud-based models. The Government Digital Service found that their GOV.UK Chat had issues with hallucinations and accuracy that were unacceptable for public sector work. Given that most AI models operate as a “black box”, where the exact processes and methods are unknown and cannot be traced, it is difficult for producers to be completely transparent about how these systems produce their outputs. Close monitoring of developments in the field of AI and continual communication with statistics producers will be vital to understand the different ways AI systems may be used in both statistical production and statistical communication.

Value

The concerns around trustworthiness and quality of AI-generated statistical outputs and communications affect their perceived value, both to organisations and to the public. The latest wave of the Public Attitudes to Data and AI Survey suggests that public sentiment towards AI remains largely negative, despite the perceived impact of AI being reported as neutral to positive. The potential value will emerge over time as more AI products make their way into widespread use.


OSR’s strategy for AI in 2024

We are considering AI and our response through two lenses:

  • Use of AI systems, such as Large Language Models (LLMs), in the production and communication of official statistics, and how OSR regulates this; and,
  • Responding to use of AI to generate misinformation.

Regulating use of AI systems in the production and communication of official statistics

There are many organisations developing guidance for how AI should be used and regulated. OSR is following these conversations. So far, we have contributed to the Pro-innovation AI Regulation policy paper from the Department for Science, Innovation and Technology, a white paper on Large Language Models in Official Statistics published by the United Nations Economic Commission for Europe, and to the Generative AI Framework for His Majesty’s Government, published by the Central Digital and Data Office. We endorse the direction and advice offered in these frameworks and consider that they provide solid principles that apply to regulation of use of AI in official statistics.

Responses to our recent review of the Code suggested people think that the Code does indirectly address issues around AI use for official statistics, both in terms of encouraging exploration of potential benefits and controlling quality risks. Going forward, providing guidance relating to specific issues around AI alongside the Code could allow OSR to provide relevant support in a dynamic way. We already have our Guidance for Models, which explains how the pillars in the Code help in designing, developing and using statistical models and is very relevant in this space. More widely, the Analysis Function will also be undertaking work to ensure that analytical guidance reflects the use of AI within analysis in future.

OSR will continue to discuss potential and planned use of AI in official statistics production with producers, to stay aware of use cases as they develop, which will inform our onward thinking.

Responding to use of AI to generate misinformation

With a UK election to be held this year, it is vital to understand how AI systems may be used to compromise the quality of statistical information available to the public, and how the same technology may be used to empower producers and regulators of statistics to ensure statistics serve the public good. We will continue to be involved in several cross-Government networks that deal with AI. These include the Public Sector Text Data Subcommunity, a large network to develop best practice guidance on text-based data across the public sector, as well as other Government Departments and regulatory bodies thinking about use of information during an election.


Next steps

There will be many more unforeseen uses for this versatile group of technologies. As AI developments are occurring at speed, we will be regularly reviewing the situation and our response to ensure compliance with the Code. If you would like to speak to us about our thinking and position on AI, please get in touch via regulation@statistics.gov.uk. We are particularly keen to hear of any potential or actual examples of AI being used to produce official statistics.