Don’t Ask for Trust in Statistics. Earn It.

In our latest guest blog, leading British statistician, David Spiegelhalter explores why trustworthiness—not trust—should be at the heart of statistical communication. Drawing on the influence of Baroness Onora O’Neill and reflecting on the updated Code of Practice for Statistics 3.0, he argues for intelligent transparency, honest communication, and a commitment to helping the public genuinely understand evidence. He also shares why it’s not enough for statistics to be trustworthy—they must be engaging too.

I give many talks to all sorts of audiences, from health professionals to business executives to attendees at book festivals. And, perhaps surprisingly, the Code of Practice for Statistics features in almost all of them (I do exclude school students from my propaganda).

This all comes from my obsession with the ideas of Baroness Onora O’Neill. She is a top philosopher, specialising in Kant, and she presented the Reith Lectures on A Question of Trust in 2002 – I still value the excellent book of her lectures. She was brilliant at distilling years of thought into short and clear statements, and one of these has had a huge influence on me, both professionally and personally.

In this age of misinformation and scepticism of authority, a repeated question is ‘how can we improve trust in science/institutions/public health etc.?’. To which O’Neill replies that this is the wrong question. Rather than trying to manipulate people into trusting us, we should be earning that trust by demonstrating trustworthiness. This simple idea, presumably rooted in Kant’s duty ethics (although I’ve never read any Kant), places the responsibility firmly on the authority.

When I introduce this idea in a talk, many people in the audience take pictures of the slide, so I know it must be good. I then go on to show the Trustworthiness – Quality – Value (TQV) framework of the Code of Practice, showing Trustworthiness as the first pillar, although emphasising how important the Q and V are too. I feel I am channelling Baroness O’Neill.

I have recently had to update my slides for the new Code of Practice for Statistics 3.0. This rightly keeps to the core principles of TQV, which continue to form the basis of the standards for official statistics. But I have been delighted to see the introduction of Standards for the Public Use of Statistics, Data and Wider Analysis. These focus on the way that statistics are communicated and used in public life, and are rooted in the idea of ‘intelligent transparency’ – incidentally, another term introduced by Onora O’Neill. This includes equality of access and independence, but also enhancing understanding, which is my main interest.

Back in the pandemic in 2020, a group of us became very frustrated at the frankly untrustworthy numbers being bandied around by both politicians and commentators, so we tried to list what we thought were the vital components of trustworthy communication of evidence. Nature published our rant as a commentary, with our five points being essentially:

  1. Inform and not persuade
  2. Balance (but not false balance)
  3. Acknowledge uncertainty
  4. Be upfront about the quality of the evidence
  5. Pre-empt misunderstandings

These were later incorporated into the Government Communication Service RESIST 2 Counter-Disinformation toolkit.

The Code of Practice 3.0 contains the essence of these principles for trustworthy communication, for example saying:

  • Do present and use data and statistics objectively, being impartial and professional.
  • Do clearly describe the quality of data and statistics, including uncertainty and bias in estimates, and the impacts on appropriate interpretation and use.
  • Do not use statistics, data or wider analysis in a misleading way. This includes not cherry-picking figures, taking figures out of context or placing undue certainty on them.
  • Take proactive steps to prevent or minimise the risk of misinterpretation or misuse.

I feel particularly strongly about the last point. It’s not enough to suggest what the statistics mean; it is also vital to say what they do not. This could be thought of as pre-empting misunderstanding, but also as pre-bunking misinformation – getting in there early, before false claims start circulating.

There is one final issue that is not in the Code. When communicating, I believe there is little point in being trustworthy if you are dull. While the information should not try to persuade people to think or do anything, I do feel it is fine to try to persuade audiences to be interested – to engage with the evidence so that they can be better informed.

So I have a small suggestion for Code of Practice 4.0: don’t be dull.



Lessons in communicating uncertainty from the Infected Blood Inquiry: What to say when statistics don’t have the answers

In this guest blog, Professor Sir David Spiegelhalter, Emeritus Professor of Statistics at the University of Cambridge, reflects on his experiences in the Infected Blood Inquiry and the importance of transparency around statistical uncertainty.

In my latest book, The Art of Uncertainty, I discuss the UK Infected Blood Inquiry as a case study in communicating statistical uncertainty. In the 1970s and 1980s, tens of thousands of people who received contaminated blood products contracted diseases including HIV/AIDS and hepatitis. Many died as a result. This crisis, with its catastrophic consequences, was referred to as ‘the worst treatment disaster in the history of our NHS’.

The Infected Blood Inquiry was set up in 2018 after much campaigning by victims and their families. I was involved in the Statistics Expert Group established as part of the Inquiry.

Building a model for complex calculations

Our group was tasked with answering a number of questions surrounding the events, such as how many people had been infected with hepatitis C through contaminated blood transfusions.

Some conclusions were reached relatively easily. Where the data and its verification were sound, we could be reasonably confident: for example, that around 1,250 people with bleeding disorders were diagnosed with HIV from 1979 onwards.

Other figures proved much more difficult to estimate, such as the number of people who were infected with hepatitis C through ordinary blood transfusions before testing became available. We needed a more sophisticated approach that did not involve counting specific (anonymous) individuals but instead looked at the process as a whole. Consequently, we built a complex statistical model to derive the various estimates. However, some parts of the model lacked data, so expert judgement was at times needed to fill the gaps, and we had to account for multiple sources of uncertainty.
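
To give a flavour of what such a model involves, here is a minimal sketch in Python of the general technique of propagating uncertainty by simulation. Every quantity and distribution below is a made-up illustration of the approach, not a figure from the Inquiry’s actual model, which was far more elaborate.

  import numpy as np

  rng = np.random.default_rng(42)
  n_sims = 100_000

  # Each uncertain input is a distribution rather than a point value.
  # Number of transfusion recipients in the relevant period
  # (hypothetical: informed by patchy records, hence a wide spread).
  recipients = rng.normal(loc=400_000, scale=50_000, size=n_sims)

  # Proportion of donations carrying the virus (hypothetical expert
  # judgement, expressed as a Beta distribution with mean about 0.5%).
  prevalence = rng.beta(2, 398, size=n_sims)

  # Probability that an infected transfusion leads to chronic infection
  # (again a hypothetical elicited judgement, mean about 80%).
  p_chronic = rng.beta(8, 2, size=n_sims)

  # Push all the uncertain inputs through the model together.
  infections = recipients * prevalence * p_chronic

  lo, mid, hi = np.percentile(infections, [5, 50, 95])
  print(f"Estimated infections: {mid:,.0f} (90% interval {lo:,.0f} to {hi:,.0f})")

The width of the resulting interval reflects every source of uncertainty fed into the model – which is exactly why some of the answers were so imprecise.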

Using this model, we were able to produce numbers that went some way towards answering the questions we had been charged with. However, some figures came with very large uncertainty, due to the inherent complexity involved in their calculation, and so we could not be confident of their accuracy.

A scale for communicating uncertainty

To prevent people from placing undue trust in our findings, we wanted to convey the considerable caution with which our analysis should be treated. For this, we found the scale used in scientific advice during the COVID-19 pandemic to be a helpful model, in which confidence is expressed on a scale from low through to high.

This scale was liberating; it allowed us to clearly convey our level of confidence in a way that accurately reflected the reality of the numbers. So, we could say that we only had moderate confidence that the available data could answer some of the questions we had been asked. And for others – for example, how many people had been infected with hepatitis B – we refused to provide any numbers, on account of having low confidence in being able to answer the question.
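
Purely as an illustration of that reporting discipline (the function, wording and figures below are my own invention, not the Inquiry’s), one might condition what is published on the confidence level, withholding numbers entirely when confidence is low:

  def public_statement(question, estimate, confidence):
      """Format a finding for publication, giving no figure at low confidence."""
      if confidence == "low" or estimate is None:
          return (f"{question}: we have low confidence that the available "
                  "data can answer this question, so we give no figure.")
      return f"{question}: approximately {estimate:,.0f} ({confidence} confidence)."

  # Hypothetical examples, not Inquiry results:
  print(public_statement("People infected with hepatitis C via transfusion",
                         25_000, "moderate"))
  print(public_statement("People infected with hepatitis B", None, "low"))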

Lessons for the statistical community about communicating uncertainty

It can be difficult to admit to substantial uncertainty in data when dealing with a tragedy such as this. In the case of the Infected Blood Inquiry, that uncertainty meant that the victims and their families could not get precise answers to various questions for which they deserved some kind of closure.

It is also undeniably important, however, that those producing statistics are open about how confident they are in their numbers, so that people understand when statistics can reliably answer their questions, and when they cannot. Indeed, being transparent about any uncertainty in published data is one of the principles that the Office for Statistics Regulation (OSR) promotes in its intelligent transparency campaign and its championing of analytical leadership to support public understanding of, and confidence in, the use of numbers by government.

Intelligent transparency demands that statistical claims and statements are based on data to which everyone has equal access, are clearly and transparently defined, and for which there is appropriate acknowledgement of any uncertainties and relevant context. This concept helps us understand how to communicate our findings when we are asked to answer questions regardless of the quality of available evidence. And it acknowledges that publishing numbers without appropriate context, clarifications and warnings is counterproductive to providing real public value.

So, when it comes to communicating statistics to the public, honesty – or transparency, as we call it here – really is the best policy. I am delighted to see OSR placing more emphasis on intelligent transparency, and how statistics are communicated more generally, in its proposals for a refreshed Code of Practice. Ed Humpherson has also written an excellent blog on why communicating uncertainty is a constant challenge for statisticians.