Last March, schools and colleges across the UK were closed and the qualification regulators (Ofqual, Qualifications Wales, SQA and CCEA) were directed by the respective governments of England, Wales, Scotland and Northern Ireland to oversee the development of an approach to awarding grades in the absence of exams. While each regulator developed its own approach, all of them involved statistical models or algorithms.
When grades were awarded to students last summer, there was intense media coverage across the UK. We read headlines such as “Futures sacrificed for the sake of statistics” and statements implying that a “mutant algorithm” was to blame. The decisions made on the basis of the calculated grades had a significant impact on many children’s lives. Until last summer, most people had probably never felt personally affected by a government algorithm.
Statistical models and algorithms, though, are increasingly becoming a part of normal life. As technology advances and more data become available, models developed in the public sector can play a significant role in improving services for society.
We are concerned that public confidence in the use of statistical models has been damaged by last year’s exam processes. As Ed Humpherson, our Director General, said recently, “Public debate about algorithms veers between blind optimism in their potential and complete rejection. Both extremes have been very evident, at different times, in the discussion of the use of models to award exam grades.”
Our review
In the post-2020 exams world, the risk of a public backlash when public sector decisions are supported by an algorithm feels much more real – regardless of how ‘right’ the algorithm may be.
As the UK regulator of official statistics, it is our role to uphold public confidence in statistics. We believe that statistical models have an important role to play in supporting decision making, and that they can also command public confidence. That is why we reviewed the approaches taken last year to award grades: to better understand what happened and, most importantly, what others can learn from the experience. To us it was striking that, though the approaches and models in England, Wales, Scotland and Northern Ireland had both similarities and differences, all four failed to command public confidence.
Our recently published report explores what we found. We acknowledge the unique situation that the regulators were working in, which was far removed from their normal roles. We conclude that many of the decisions made supported public confidence. To support learning for others, we have also clearly highlighted in the report the areas where we feel different choices could have been made.
We found that achieving public confidence is about much more than just the technical design of the model. Nor is it just about doing one or two key things really well. It stems from considering public confidence as part of an end-to-end process, from the decision to use a statistical model through to its deployment. It is influenced by a number of factors, including the confidence placed in the models, the extent of public understanding of them, the limitations of the models and of the process they were replacing, and the approach to quality assuring the results at an individual level.
Lessons for those developing statistical models
We’ve identified 40 lessons to help model developers support public confidence. These fall under the following high-level principles:
- Be open and trustworthy – ensure transparency about the aims of the model and about the model itself (including its limitations), be open to and act on feedback, and ensure the use of the model is ethical and legal.
- Be rigorous and ensure quality throughout – establish clear governance and accountability, involve the full range of subject-matter and technical experts when developing the model, and ensure the data and outputs of the model are fully quality assured.
- Meet the need and provide public value – engage with commissioners of the model throughout, fully consider whether a model is the right approach, test the acceptability of the model with all affected groups, and be clear on the timing of and grounds for appeal against decisions supported by the model.
Lessons for policy makers who commission statistical models
Model developers should not work in isolation. They should collaborate with others throughout the end-to-end development process. It is important that policy makers who commission models are part of this wider collaboration.
A statistical model might not always be the best approach to meet your need. Model commissioners need to be clear about what the model is aiming to achieve and whether it meets the intended need, understanding the model’s strengths and limitations and remaining open to alternative approaches.
Statistical models used to support decisions are more than just automated processes. All models are built on a set of assumptions. Commissioners of models should ensure that they understand these assumptions and advise on their acceptability, and on the other key decisions made during model development.
The development of a statistical model should be regarded as more than just a technical exercise. Commissioners need to work with model developers throughout the end-to-end process and hold regular reviews to check the model will meet the policy objective.
Lessons and recommendations for the centre of government
We also looked at the bigger picture: the wider infrastructure in place to support public bodies working with statistical models. Here we found that public bodies developing models need more guidance and support, and that this should be easier to access.
There are lots of different organisations in the statistical modelling, algorithm and AI space. As a result, it is not always clear which guidance is relevant to whom, or where public bodies can go for support. Some of this is down to inconsistencies in the terminology used to describe models, and some is due to it simply being hard to find out who is doing what in this fast-moving area.
At the highest level, we feel that clearer leadership is needed from government. We are calling on the Analysis Function and Digital Function, working with the administrations in Scotland, Wales and Northern Ireland, to ensure that they provide consistent and joined up leadership on the use of models.
To support this, we recommend that those working in this area collaborate to develop a central directory of guidance. We see the Centre for Data Ethics and Innovation playing a key role here. In addition, we recommend that more guidance is developed, particularly for those wanting to test the public acceptability of their models.
Lastly, we recommend that any public body developing advanced statistical models with high public value should consult the National Statistician for advice and guidance.
Our role going forward
This review supports our Automation and Technology work programme. Through this programme we will be clarifying our regulatory role when statistical models and algorithms are used by public bodies. We are working on guidance about models and AI in the context of the Code of Practice for Statistics, and advocating for the use of automation within statistical production. We will continue to work with other organisations to support the implementation of the review findings.
We are happy to discuss the findings of our review further. Please get in touch with us.