Introduction

The Office for Statistics Regulation (OSR) provides independent regulation of all official statistics produced in the UK. Statistics are an essential public asset: we aim to enhance public confidence in statistics produced by government by setting the standards they must meet in the Code of Practice and the pillars of Trustworthiness, Quality, and Value (TQV). In doing this, we have found the pillars to be useful in areas beyond official statistics production: for example, in the production and use of administrative data. This guidance takes this idea and highlights the use of TQV in the design and production of models both within and beyond the statistics landscape.

Here, a model is defined as a tool used to create statistics, or to extract meaning from data for decision making. In this guidance, it primarily implies a data science or statistical model but could also be used to refer to an analytical model, conceptual model, mathematical model or data-driven model.

The term algorithm may also be appropriate in places. An explanation of terms can be found in Annex A.

As a model encompasses data, systems, and techniques, it is important that these components are not considered in isolation from one another. That is, considerations around data cannot exist without considerations of techniques, and neither exists independently of systems. It is this holistic perspective that the pillars of TQV enhance.

What do we mean by TQV of models?

Trustworthiness

Having confidence in those who produce models that are to be used in the public domain

Quality

Data and methods that produce assured model outputs

Value

Models that support society’s needs for information

Who is this guidance for?

This guide is for anyone designing, developing, changing, reviewing or using models who wishes to uphold the principles of Trustworthiness, Quality and Value in their work. We also present some considerations for those thinking about using or changing models in the future. Users of this guidance may be those who are involved in:

  • the design, creation or use of models to generate statistics/data used to inform decision making
  • producing or using models to create new, exploratory statistics or testing new models for current statistics
  • informing public policy using outputs from a model
  • referencing data/statistics generated by a model in the public domain
  • reviewing and validating models used to generate statistics or data used to inform decision making

What is the aim of this guidance?

The three Code of Practice pillars – Trustworthiness, Quality and Value (TQV) – provide an excellent framework for models used to produce official statistics and beyond. We have sought to highlight areas where TQV can be useful when building models. If you are interested in making a public commitment to TQV please consider Voluntary Adoption of the Code of Practice. Voluntary application (VA) of the Code is for any producer of data, statistics and analysis which are not official statistics, whether inside government or beyond, to help them produce analytical outputs that are high quality, useful for supporting decisions, and well respected.

Why are we producing this now?

Traditional statistical techniques, such as linear regressions, have long been used to create statistics or generate data used to inform decisions. Recently, however, newer techniques including machine learning have been used to inform statistics production (for example, Using traffic camera images to derive an indicator of busyness: experimental research, ONS) and to inform decisions (for example, How the Ministry of Justice used AI to compare prison reports, MoJ). These statistics and decisions have impacts on society, and this has led us to think about public confidence in this space.

There have also been high-profile cases of models being used in decision-making within the public sector with mixed public acceptability. This was highlighted in our report on awarding GCSE, AS, A level, advanced extension awards and extended project qualifications in summer 2020. In this report we identified 41 lessons, aimed at public bodies, to consider when building models in the future. We tested these lessons on a model that was more widely accepted by the public: the model used to predict who should be added to the Covid shielding list, also known as QCOVID. The lessons held up well and have been incorporated into this version of the models guidance (beta version).

What does this guidance not cover?

This is not a regulatory document for data science, machine learning or artificial intelligence (AI)

The aim of this document is not to say that we, at OSR, will be regulating all AI tools. Effective and appropriate regulation of these areas requires cross-sector agreement, which OSR is part of and which is still in progress. This guidance is a framework for how to think about and instil Trustworthiness, Quality and Value in your work on models.

This is not technical guidance for designing models

We do not provide technical or methodological guidance: we are not a technical organisation, and providing such guidance is not part of our role. If you require technical support, you should contact relevant organisations, some of which are listed in Annex B.

How to use this guidance

The best time to use this guidance is before you plan to develop or use a model: it focuses on instilling public confidence and trust, which need to be considered from the very start to ensure success. This guidance can also be used retrospectively to understand how to strengthen TQV in your model. The guidance is split into two parts: planning a model that serves the public good, and developing and using a model that serves the public good. To help you think about how your model can meet the highest standards of TQV, we have created tick box statements throughout this guidance which can also be found in a separate, downloadable guidance sheet.

This is the final version of this guidance. It supersedes version one, which was published in October 2021. This new guidance reflects feedback we received from external stakeholders on version one, as well as OSR’s recent work on Securing public confidence in algorithms and our QCOVID case study. We would, however, still welcome feedback so please get in touch with regulation@statistics.gov.uk if you would like to share your thoughts and use cases.
