Introduction

This guidance is in its alpha phase, meaning that we welcome comments and feedback from everyone. Please get in touch at regulation@statistics.gov.uk. We aim to release an updated version of this guidance in early 2022 based on a review of this work.

The Office for Statistics Regulation published version two of the Code of Practice for Statistics in February 2018. It sets the standards that producers of official statistics should commit to through a series of principles and practices within each pillar of trustworthiness, quality, and value. 

However, as the foreword to the Code of Practice highlights, it “is not concerned only with official statistics. It provides a framework that can apply to a much wider range of data that have not traditionally been described as official statistics. Providers of these other types of data can draw on the Code of Practice as they judge appropriate to help support public confidence”.

This document provides guidance on how the principles in the Code of Practice can support good practice in designing, developing and using models.  

Here, a model is defined as a tool used to create statistics, or to extract meaning from data for decision making. In this guidance, it primarily implies a data science or statistical model but could also be used to refer to an analytical model, conceptual model, mathematical model or data-driven model.  

The term algorithm may also be appropriate in places. An explanation of terms can be found in Annex A. 

As a model encompasses data, systems, and techniques, it is important that these components are not considered in isolation from one another. That is, considerations around data cannot exist without considerations of techniques; and neither exist independently of systems. It is this holistic perspective that the pillars of the Code of Practice provide. 

Who is this guidance for?

This guide is for anyone designing, developing, changing, reviewing or using models who wishes to instil the principles of the Code of Practice. We also present some considerations for those thinking about using or changing models in the future. Users of this guidance may be those who are involved in:

  • the design, creation or use of models to generate official statistics 
  • producing or using models to create new, exploratory statistics or testing new models for current statistics 
  • producing or using models to generate data used to inform decision making 
  • informing public policy using outputs from a model 
  • referencing statistics or data driven decisions generated by a model in the public domain 
  • reviewing and validating models used to generate statistics or data used to inform decision making 

What is the aim of this guidance?

The three Code of Practice pillars of trustworthiness, quality and value provide an excellent framework for using models to produce official statistics and beyond. All the principles of the Code of Practice are relevant, but we have sought to highlight areas that need more clarity. Like QAAD (Quality Assurance of Administrative Data), this guidance aims to clarify Office for Statistics Regulation (OSR) expectations, in this case for those creating and using models who wish to instil the principles of the Code of Practice. It applies to the use of models across the statistical system and beyond.

What does this guidance not cover?

This is not a regulatory document for data science, machine learning or artificial intelligence (AI) 

Effective and appropriate regulation requires cross-sector agreement. OSR engages with a broad range of interested organisations in areas such as AI. In addition, data science, machine learning and AI are fast-moving, dynamic and complex fields. This guidance has been created with the aim of being relevant to any current or future technique.

This is not best practice technical guidance for creating models

We do not provide technical guidance since we are not technical experts, nor is this part of our role as an organisation. If you require technical support, you should contact relevant organisations, some of which are listed in Annex B. In the future, we aim to release separate guidance to focus in greater detail on model data and methods. 

Why are we producing this now?

As stated in the Code of Practice: “Data need to be processed into useful information using statistical techniques, and then that information can, through its application to specific policy areas, provide insight and form the evidence base for key decisions”. Traditional statistical techniques, such as linear regression, have long been used to create statistics or generate data used to inform decisions. More recently, however, newer techniques, including machine learning, have been used in the production of statistics (for example, Using traffic camera images to derive an indicator of busyness: experimental research, ONS) and to inform decisions (for example, How the Ministry of Justice used AI to compare prison reports, MoJ). This has led us to think about public confidence in models in general.

There have also been high-profile cases of models being used in decision-making within the public sector with mixed public acceptability. This was highlighted in our report on awarding GCSE, AS, A level, advanced extension awards and extended project qualifications in summer 2020. Misuse of models and a lack of transparency about them can undermine public confidence, especially in the statistics they produce and the decisions they inform. As such, we have decided to show how the Code of Practice can be used to increase public confidence in models and improve the use of models for the public good.

How to use this guidance

The best time to use this guidance is during the planning and design of a model. This guidance can also be used retrospectively to understand whether a model meets the expectations of the Code of Practice. The guidance is split into two parts: planning and designing a model that serves the public good, and developing and using a model that serves the public good. Subsections of each part of the guidance relate to the Code of Practice and other relevant material. To help you understand whether your use of a model meets the expectations of the Code of Practice, we have created check box statements throughout this guidance. These check box statements can also be found in a separate, downloadable guidance sheet. We believe that this is valuable guidance on how to make and use models that command public confidence and acceptability.
