How Alan Turing’s legacy is inspiring our work today

To coincide with his birthday, on 23 June 2021, the UK honoured the life and work of Alan Turing, one of its most famous mathematicians, by featuring his image on the design of its latest £50 note.

Although Turing is best known for his codebreaking work at Bletchley Park, his legacy goes far beyond his contributions during the war. Recognised by many as an early pioneer of modern computing, his work on algorithms, computing machinery and artificial intelligence has changed the way we live today.

The early 1900s was the start of a data revolution in the UK. Punch cards were being used to input data into early computers, and statistics and science were opening the door to a world of technological possibility. This time of rapid discovery and progress was of immense inspiration to Turing, who foresaw that automation and ‘computing machinery and intelligence’ would have a huge impact on the world and the way we live in it. In fact, the new £50 note features a quote from Turing saying, ‘This is only a foretaste of what is to come, and only the shadow of what is going to be.’

We at OSR draw on modern advances in Turing’s field by applying machine learning methods to gather data on the use of statistics from sources such as Twitter, government websites, parliamentary reports and the media, supporting our regulatory work. This automation helps us to gather large amounts of important data from a multitude of sources that we could not have captured before. These techniques will allow us to find where and how statistics are being used in the public domain and help shape our future work.
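To make this concrete, here is a minimal sketch, in Python with scikit-learn, of the kind of text classification such automation might involve. This is not OSR’s actual pipeline: the training snippets, labels and documents below are invented for illustration, and a real system would run over text gathered from Twitter, government websites, parliamentary reports and the media.

```python
# Minimal illustrative sketch (not OSR's actual pipeline): a toy classifier that
# flags documents which appear to discuss official statistics.
# The training examples and labels below are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-labelled snippets: 1 = mentions official statistics, 0 = does not
train_texts = [
    "ONS figures show unemployment fell by 50,000 in the last quarter",
    "The minister cited official statistics on hospital waiting times",
    "Ticket sales for the festival open next week",
    "The committee debated the weekly deaths data published by ONS",
    "A new cafe has opened on the high street",
]
train_labels = [1, 1, 0, 1, 0]

# A simple bag-of-words model: TF-IDF features fed into a logistic regression
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# New documents gathered from (hypothetical) scraped sources
new_docs = [
    "MPs questioned the claim that crime statistics are falling",
    "Local residents enjoyed the summer fair despite the rain",
]
for doc, flag in zip(new_docs, model.predict(new_docs)):
    print(flag, doc)
```

The value of an approach like this is scale: once trained on a modest set of hand-labelled examples, even a simple model can sift far more material than a human reader could.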

But although we can see direct impacts of Turing’s legacy in how we conduct some of our own work, perhaps it is the attitude he had to his work that should inspire us more.

Turing was a pioneer in his field, not just because of his keen mind, but because of the way he approached problems with both intellectual bravery and pragmatism. He was not afraid of huge, radical ideas, but he was also able to think about how they could be used practically, for the betterment of humanity. The idea of public good was at the forefront of his work, helping him to approach old problems from a fresh perspective.

Making the world a better place has always been a driver of mathematical and scientific discovery, and Turing was a great proponent of applying maths and statistics to solve real-world problems. At Bletchley Park and beyond, he was able to show the value of accumulating data and how important statistics are to making informed decisions. This type of thinking – calculating probabilities and using plausible reasoning to make decisions – has been a major influence on the way governments around the world have used data to tackle the coronavirus pandemic.

He spent a lot of his time theorising about the concept of intelligence and how it applied to both humans and machines, but even he knew his own intellectual limitations. In his pursuit of knowledge and answers, he often spoke with people from fields different from his own, discussing problems with philosophers, for example. He knew that collaboration only added strength to problem solving and that working together with others would lead to better outcomes.

It is impossible to discuss the life of Alan Turing without remembering that he was persecuted for being gay. Although applauded for his intelligence and work during the war, he was arrested because of his sexuality and forced to take experimental medications. He ended his own life soon after his conviction.

It would not be a huge stretch of the imagination to think that, had Alan Turing’s life not ended prematurely, he would have continued to make intellectual discoveries that would have further positively impacted the world today. It is also not hard to imagine that there are many other diverse, intelligent minds that should have equal chance to contribute to solving the world’s problems. Turing’s life should inspire us not only to new intellectual heights, but to stronger commitments to equality and diversity as well.

As a regulator of statistics, the links from our work to Turing’s are many. Not only do we use automation to help us gather and analyse large data sources, but we question the methodology and fairness behind algorithms and are also passionate collaborators, seeking input from others as an integral part of our processes rather than just an afterthought.

Just as Turing was driven to use mathematics to tackle real-world problems and benefit humanity, the accurate use of data and statistics to make decisions for the public good is at the heart of everything we do. Going forward, not only will we continue to explore new ways in which automation can aid our work, but we will strive to collaborate with diverse minds, continue to teach the importance of transparency, quality and value, and above all protect the public interest by making sure people have statistics they can trust.

 

Thank you to the Alan Turing Institute, whose talk, Breaking the code: Alan Turing’s legacy in 2021, helped inform the content of this blog.

An analyst’s job is never done

‘Don’t trust the data. If you’ve found something interesting, something has probably gone wrong!’ Maybe you’ve been there too? It was a key lesson I learnt as a junior researcher. It partly reflected my skills as an analyst at the time – the mistakes could well have been mine! But, not entirely.

You see, I was working with cancer registration and deaths data, which on occasion could show odd patterns due to changes in disease classifications, diagnosis developments or reporting practices. Take a close look and you could spot the step changes when a classification change occurred. Harder to spot might be the impact of a new treatment or screening programme. But sometimes there were errors too – including the very human error of using the wrong population base for rates.

I was reminded of this experience when Sir Ian Diamond, the National Statistician, spoke to the Health and Social Care Select Committee in May. He said (Q34):

“One of the things about good statisticians is that they are always just a little sceptical of the data. I was privileged to teach many great people in my life as an academic and I always said, ‘Do not trust the data. Look for errors.’”

Sage advice from an advisor to SAGE!

The thing with quality is that the analyst’s job is never done. It is a moving target. In our Quality Assurance of Administrative Data guidance, we emphasise the importance of understanding where the data come from, and how and why they were collected. But this information isn’t static – systems and policies may alter, and data sources will change as a result.

Being alert to this variation is an ongoing, everyday task. It includes building relationships with others in the data journey, to share insight and understanding about the data and to keep a current view of the data source. As Sir Ian went on to point out in his evidence, it should involve triangulating against other sources of data.
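As a rough illustration of what being alert to variation and triangulating against another source can look like in practice, here is a small sketch in Python with pandas. The figures, thresholds and source names are invented; real checks would be tailored to the data source and agreed with those who supply it.

```python
# Illustrative sketch only: two toy checks an analyst might automate while
# monitoring an administrative data source. All figures are invented.
import pandas as pd

# Annual counts from the main (administrative) source and a second source
# used for triangulation, e.g. a survey or another register (hypothetical values)
data = pd.DataFrame({
    "year": [2016, 2017, 2018, 2019, 2020],
    "admin_count": [10250, 10410, 10380, 13900, 14050],   # note the jump in 2019
    "other_source": [10100, 10300, 10500, 10600, 10800],
}).set_index("year")

# Check 1: flag suspicious year-on-year step changes in the admin source
pct_change = data["admin_count"].pct_change() * 100
print("Possible step changes (>15% year-on-year):")
print(pct_change[pct_change.abs() > 15])

# Check 2: triangulate against the second source and flag large divergence
divergence = (data["admin_count"] - data["other_source"]) / data["other_source"] * 100
print("\nYears where the sources diverge by more than 10%:")
print(divergence[divergence.abs() > 10])
```

Automating simple checks like these does not replace the conversations with data suppliers described above, but it does make unexpected step changes much harder to miss.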

OSR recently completed a review of quality assurance in HMRC, at the agency’s invitation. It was a fascinating insight into the operation of the organisation and the challenges it faces. We used a range of questions to help inform our understanding through meetings with analytical teams. They told us that they found the questions helpful and asked if we would share them to help with their own quality assurance. So, we produced an annex in the report with those questions.

And we have now reproduced the questions in a guide, as prompts to help all statistics producers think about their data and about quality under these headings:

  • Understanding the production process
  • Tools used during the production process
  • Receiving and understanding input data
  • Quality assurance
  • Version control and documentation
  • Issues with the statistics

The guide also signposts to a wealth of excellent guidance on quality on the GSS website. The GSS Best Practice and Impact Division (BPI) supports everyone in the Government Statistical Service in meeting the quality requirements of the Code and improving government statistics. BPI provides a range of helpful guidance and training.

  • Quality Statistics in Government guidance is primarily intended for producers of statistics who need to ensure that their products meet expectations for statistical quality. It is an introduction to quality and brings together the principles of statistical quality with practical advice in one place. You will find helpful information about quality assurance of methods and data and how to design processes that are efficient, transparent and reduce the risk of mistakes. Reproducible Analytical Pipelines (RAP) and the benefits of making analysis reproducible are also discussed. The guidance complements the Quality Statistics in Government training offered by the GSS Quality Centre.
  • Communicating quality, uncertainty and change guidance is intended for producers of official statistics who need to write about and effectively communicate information about quality, uncertainty and change. It can be applied to all sources of statistics, including surveys, censuses, administrative and commercial data, as well as estimates derived from a combination of these. There is also Communicating quality, uncertainty and change training.
  • The GSS Quality Centre has developed guidance which includes top tips to improve the QA of ad hoc analysis across the GSS. The team also runs the Quality Assurance of Administrative Data (QAAD) workshop, in which users can get an overview of the QAAD toolkit and how to apply it to administrative sources.
  • There is also a GSS Quality strategy in place which aims to improve statistical quality across the Government Statistical Service (GSS) to produce statistics that serve the public good.

Check out our quality question guide and let us know how you get on by emailing me at penny.babb@statistics.gov.uk – we would welcome hearing about your experiences. We are always on the look-out for some good examples of practice that we can feature on the Online Code.

Covid-19: The amazing things that statisticians are doing

Stories of extraordinary human feats abound in this pandemic. They include the efforts of health and care professionals and personal commitments to support others in the community, perhaps best shown by Captain Tom Moore.

Statisticians are not on the front line of dealing with the impacts of Covid-19. Yet it is clear that one of the battlefields on which the fight against the pandemic is being fought is a statistical one. Slowly and painfully, data about the virus and its behaviour are accumulating, and, sometimes working through the night, statisticians are making sense of that data. By creating models for what would happen under different policies, statisticians have provided real-time insight to political decision makers on the pandemic and its social and economic impacts.

More importantly, the progress of the pandemic has been communicated to the public through data and statistics. The value of trustworthy information is emerging as one of the stories of this convulsive experience.

Statisticians in the health sector have built dashboards for the UK, England, Scotland, Wales and Northern Ireland to provide daily updates to the public. Their colleagues who work on population statistics have provided weekly updates on deaths, which represent the most complete measure of the mortality impact of Covid-19 (published for England and Wales, Scotland and Northern Ireland). These weekly statistics have developed at an unprecedented pace to provide more detailed insight, for example on deaths in care homes.

Beyond that, statisticians at the Office for National Statistics (ONS) have produced rapid insights into the population’s behavioural responses and into the impact on the economy (through its faster economic indicators weekly publication). ONS has also published in-depth analysis, such as its striking findings on the relationship between mortality and deprivation.

Similar efforts are being made by statisticians in other Government departments across the UK, highlighting impacts on areas like transport, and education in England and Wales. These outputs require new, often daily, data collection, and would have seemed incredibly radical only a couple of months ago. And researchers outside Government have also worked at amazing speed, using data published by Government statisticians to highlight emerging issues within weeks – for example the Institute for Fiscal Studies research on ethnicity.

Perhaps most impressive, ONS is now in the field with a household survey that tests for whether people have had the virus already. This testing holds one of the keys to understanding the pandemic. The ONS, working with partners at the Department of Health and Social Care, the University of Oxford and IQVIA, has used its expertise in household surveys to develop the survey.

What have we been doing at the Office for Statistics Regulation? We set out our aims here: we committed to support producers of statistics as they provide the best possible information to the public. We have:

  • granted a number of exemptions to the Code of Practice so that producers can reach audiences effectively;
  • conducted a series of reviews to provide endorsements to the approach adopted for new outputs;
  • held discussions on the core Covid-19 data: the daily dashboards and weekly deaths. We have particularly focused on making sure the differences between the two are clear: the daily dashboards provide a leading indicator, while the weekly deaths provide a more complete picture, albeit with a time lag. There is still a need to provide a coherent overview, though, and at OSR we will continue to press for improvements in coherence.

And we have also maintained our commitment to standing up for the need to publish data. We have written to the Department for Work and Pensions (DWP) about publishing information on Universal Credit claims in the pandemic. And we have written to the Department of Health in Northern Ireland (DoHNI) calling for the resumption of daily dashboards. In both cases the Departments responded well: DWP has committed to publish on Universal Credit, and DoHNI has resumed the daily dashboard.

The efforts of statisticians are in some ways quieter, and less visible, than the work of health and care professionals and people in the food and retail sectors. But the work of statisticians to inform the public is crucial. I hope this blog represents a quiet form of celebration.

Joining Up Data

Jeni Tennison, CEO of the Open Data Institute, responds to our Joining Up Data for Better Statistics report.

Data is moving from being scarce and difficult to process to being abundant and easy to use. But harnessing its value for economic and social benefit – in ways that support innovation and deliver social justice – is not straightforward.

At the Open Data Institute (ODI), we would like to see a future where people, organisations and communities use data to make better decisions, more quickly. This would help our economies and societies to thrive. Using data and statistics well underpins research; enables us to innovate; informs the creation of more effective products, services and policies; and fuels discovery, economic growth and productivity.

In the future we would like to see, people can trust organisations to manage data ethically and benefits arising from data are distributed fairly. Data is used to meet the needs of individuals, communities and societies.

The Joining Up Data for Better Statistics review from the Office for Statistics Regulation (OSR) focuses on an essential part of this open, trustworthy data ecosystem: how to safely link together and share data from across different data stewards for analysis, research and generating statistics.

Data as roads

At the ODI, we often use the analogy of data being like roads. Where we use roads to navigate to a location, we use data to navigate to a decision.

The road analogy highlights the importance of joining up data. A single road only takes us between two locations; roads’ real value comes from being part of a network. Data works in the same way: it is not just having more data that unlocks its value, but linking it together. Data is not a collection of individual datasets; it is a network: a data infrastructure.

We can apply the ‘data as roads’ analogy to the Code of Practice for Statistics’ three pillars:

  • Roads are valuable when they go to places people want to go to; similarly, data and statistics add value when they help answer society’s questions.
  • Well-paved roads help us travel more quickly, but even rough tracks can be useful if you have the right vehicle – you need to know what to expect when you’re planning a journey; similarly, high-quality data is best, but lower quality data can be useful if you are aware of its limitations when drawing conclusions.
  • To avoid danger, we rely on engineers to use good practices to build and maintain roads, bridges and tunnels and on road users obeying the rules of the road; similarly, we rely on data custodians and data users to collect, maintain, use and share data in trustworthy ways.

Open and trustworthy

Like our road infrastructure, for our data infrastructure to generate value it has to be both as open as possible and trustworthy.

Data is more useful when more people can access and use it. It is most useful when it can be joined together. Data that is inaccessible – or where access takes so long it is rendered irrelevant – is of limited utility.

At the same time, greater access and linkage – particularly with personal data – can increase the potential for harmful impacts. The result of unethical, inequitable and opaque use of data goes beyond direct impacts on affected individuals: it can undermine trust more widely, causing people to withdraw consent.

This ultimately affects the quality and representativeness of the data we have, the data we need to understand our populations, to meet their needs, and to innovate.

As the OSR’s review highlights, there is still much to do to increase both data’s openness and its trustworthiness. We need better technical guidance and approaches, through data trusts perhaps, but we also need to upskill data stewards so they can understand and weigh risks and benefits, quickly and well.

We are still learning how to share and join up data in open and trustworthy ways. Being open and transparent about the decisions we make as we use and share data can build trust and speed up this learning, so we can all benefit from data.

Joining Up Data for Better Statistics

To speak to people involved in linking Government datasets is to enter a world that at times seems so ludicrous as to be Kafkaesque. Stories abound of Departments putting up arcane barriers to sharing their data with other parts of Government; of a request from one public sector body being treated as a Freedom of Information request by another; and of researchers who have to wait so long to get access to data that their research funding runs out before they can even start work.

Our report, Joining Up Data for Better Statistics, published today, was informed by these experiences and more.

The tragedy is that it doesn’t have to be this way. We encountered excellent cases where data are shared to provide new and powerful insights – for example, on where to put defibrillators to save most lives; how to target energy efficiency programmes to reduce fuel poverty; which university courses lead to higher earnings after graduation. These sorts of insight are only possible through joining up data from different sources. The examples show the value that comes from linking up data sets.

This points to a gap between what’s possible in terms of valuable insights, especially now the Digital Economy Act creates new legal gateways for sharing and linking data, and the patchy results on the ground.

It leads us to conclude that value is being squandered because data linkage is too hard and too rare.

We want to turn this on its head, and make data linkage much less frustrating. We point to six outcomes that we see as essential to support high quality linkage and analysis, with robust safeguards to maintain privacy, carried out by trustworthy organisations including the Office for National Statistics (ONS) and government Departments. The six outcomes are that:

  • Government demonstrates its trustworthiness to share and link data through robust data safeguarding and clear public communication
  • Data sharing and linkage help to answer society’s important questions
  • Data sharing decisions are ethical, timely, proportionate and transparent
  • Project proposal assessments are robust, efficient and transparent
  • Data are documented adequately, quality assessed and continuously improved
  • Analysts have the skills and resources needed to carry out high-quality data linkage and analysis

The report seeks to make things better. The six outcomes are the underpinnings of this. The report supports them with recommendations designed to help foster this new, better environment for trustworthy data linkage. The good news is that there is a strong coalition of organisations and leaders wanting to take this forward both inside and outside Government. This includes the National Statistician and his team at ONS, strong data linkage networks in Scotland, Wales and Northern Ireland, and new bodies like the Centre for Data Ethics and Innovation, UK Research and Innovation and the Ada Lovelace Institute. Alongside this blog we’re publishing a blog from Jeni Tennison, CEO of the Open Data Institute, which shows the strong support for this agenda outside Government.

We want statistical experts in Government, and those who lead their organisations, to achieve the six outcomes. When they do so, they will ensure that opportunities are no longer squandered. And the brilliant and valuable examples we highlight will no longer be the exception: analysts will be empowered to see data linkage as a core part of their toolkit for delivering insights.

Improving and innovating: enhancing the value of statistics and data

Lessons from statisticians producing Children, Education and Skills statistics

Statistics are of value when they support society’s need for information; they should be useful, easy to access, remain relevant, and support understanding of important issues. To help deliver this, producers of statistics should commit to continuously improve their service to users.

I have been part of the team working on the refresh of the Code of Practice for Statistics. There have been various changes within the Code, but without a doubt the area which I am most excited to see enhanced is the new Innovation and improvement principle. At the Office for Statistics Regulation we have always expected producers of statistics to adapt so that statistics can better serve the public, but now this expectation is crystallised in the Code.

During conversations about the development of the Code, I received several questions about this area and I felt there was a sense of nervousness about how it might be applied; this is understandable with anything new. The new principle is about having a positive mindset to change and improvement across all statistics production and dissemination processes. However, the practices which sit beneath the principle are not a prescriptive list of what must be done; instead, they should be applied proportionately depending on the statistics in question. As a result, how producers respond to this principle will differ in scale and approach. What is most important is producers’ motivation to improve their statistics.

I was keen to undertake a small project to help producers of statistics get a better handle on what the Innovation and improvement principle meant for them. My colleague Louisa and I both focus on Children, Education and Skills (CES) statistics. This thematic way of working gives us the opportunity to better understand policy and statistics issues, and to develop relationships with a range of users and producers of CES statistics. From our ongoing conversations we were aware of several innovations in this area of statistics, such as the relatively well-known work to develop the Longitudinal Education Outcomes data. However, we wanted to find out more about other projects – less well publicised developments or smaller-scale projects which nonetheless reflect an ambition to improve the value of the statistics.

We started by asking producers of CES data and statistics across the UK to send us information on the projects they had been working on. We were pleased by the range of responses we received. The projects, whether completed or still in development, varied in scale and covered everything from producing statistics using Reproducible Analytical Pipelines to improving the accessibility of data. It was clear to us that improvement, whether big or small, was embedded in much of what producers do – and it was great to hear just how enthusiastic producers were about their projects. We also spoke with users to get their feedback on some of the development work, to find out how they have benefited from the improvements being made. Here is a link to a summary of the innovation and improvement projects producers told us about.

Over the coming weeks we want to share with you some of the common themes that became apparent from talking with producers and users linked to these projects. Firstly, we want to look at the importance of collaborative working when developing statistics, then at the development of alternative statistical outputs, and finally at some of the common challenges producers face when improving their statistics. While these themes have come from examples in Children, Education and Skills statistics, the series is intended to give all producers of statistics a better sense of what the new Innovation and improvement principle might mean for them, and to highlight elements of good practice we might expect to see when assessing statistics.

As this review is considering innovation in statistics, we ourselves wanted to be more creative in thinking about how we would share our findings. Instead of a more traditional report, we are going to publish our work across a series of web posts.  We will also be exploring, with the Government Statistical Service’s Good Practice Team, how else we might support producers undertaking innovation and improvement work.

For now, keep an eye out for our forthcoming posts, and if you want to get in touch about this review or about CES statistics more generally, please do email.

Marie

The (almost) mental health ‘data revolution’

Mental health has been in the forefront of news and debate for a while now. This continued focus can only be a good thing for those needing better care, the workforce in the field, and the services provided.

In 2016, the NHS England ‘Five Year Forward View for Mental Health’ called for a data and transparency revolution. I loved this call to arms – it simply and passionately implied that without data to monitor the health service system and hold it to account, mental health patients would be done a disservice.

So, that was then. Where are we now? A bit further along Revolution Road but not there yet.

We have more data collected and published providing insight on mental health services and prevalence of mental health disorders, which is great. For example, in 2015 it was difficult to know how much NHS England spent on mental health services, but in 2018 we can find a figure for this in the Mental Health Five Year Forward View Dashboard. Statisticians have also been working hard to create a Mental Health Services Dataset, which can provide greater insight into the services provided for adults and children accessing mental health care. And I look forward to the publication of the National Study of Health and Wellbeing, providing the much needed updated prevalence estimates of mental health disorders for children and young people in England.

What more can be done to deliver the revolution? Well, we know that being transparent helps demonstrate trustworthiness. Trust in numbers isn’t built by simply publishing data. More is needed. The data need to be easy to find, contextual information is needed so that everyone can assess the data in terms of accuracy, and everyone should be able to understand and easily use the data.

But, when looking at mental health data in England it is difficult to make sense of them.

This point was demonstrated after a Twitter debate between the Secretary of State for Health and Social Care and the actor Ralf Little. Some claims were contradictory and it was difficult to determine why. Other claims couldn’t be verified because data weren’t published – either in the breakdown needed or at all. These difficulties were all brought to light by Full Fact’s work to communicate to the public a fair and accurate verification of the data used. But these difficulties indicate to me that statisticians need to do more to help people find, understand and use these important data. And by people, I mean everyone, because often it is the needs of experts that are catered for, and others’ needs are not.

Sir David Norgrove and I have had to intervene on several occasions in the last three months asking statisticians to make their data more accessible (as listed at the end of my blog), assessable, and useable (in the words of Baroness Onora O’Neill). Statisticians have responded well to our interventions, for example by providing new insight into the mental health workforce and NHS spending on mental health services. And these changes will hopefully enhance future public debates and discussions about mental health.

But, I would rather that improvements like these were made without our intervention. A data and transparency revolution requires much more to be done by the different organisations that publish mental health data to enable people to find, understand, and use such important data.

LIST OF OUR INTERVENTIONS

A robot by any name?

My big problem with my favourite innovation of the year – the Reproducible Analytical Pipeline (RAP) – is this: what should I call it for my end of year blog? The full name is a mouthful, yet its acronym doesn’t give much of a clue to what it does.

I wanted to name it Stat-bot. I imagine a cute little droid about 3 feet tall, soppy and warm-hearted, buzzing around the Government Statistical Service dispensing help and advice wherever humans need it. But the Office for Statistics Regulation due diligence department* reviewed the blog and pointed out the existence of a commercial product with this name (also, I’m probably overly influenced by my children, who find anything including the word ‘bot’ automatically hilarious). I therefore edited this blog and used a variety of alternative imaginary names for the product.

I heard about, er, Auto-stat from Steve Ellerd-Elliot, the excellent Head of Profession for Statistics at the Ministry of Justice (MoJ). He was describing their new approach to producing statistical releases and their associated commentary using the Reproducible Analytical Pipeline (RAP).

This new approach, developed in partnership with the Government Digital Service, involves automating the process of creating some of the narrative, the highlights, the graphs and so on. It’s based on algorithms that work the basic data up into a statistical release. To find out more about how RAP works, read the Data in Government blog and this follow-up post. And to be clear, it’s not just Steve and his MoJ team that are using this approach – it was developed in the Department for Digital, Culture, Media & Sport and has been picked up by the Department for Education, amongst others. The Information Services Division in Scotland have developed a similar tool.
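To give a flavour of what working the basic data up into a statistical release means, here is a toy sketch of the idea. It is not the MoJ or GDS implementation – GSS pipelines of this kind are typically written in R, often with R Markdown – and the dataset and wording are invented; the point is simply that the numbers, the narrative and the chart all come out of one scripted, repeatable process.

```python
# Toy sketch of the RAP idea (invented data, not the MoJ/GDS implementation)
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # write the chart to a file rather than a screen
import matplotlib.pyplot as plt

# 1. Read the input data (hypothetical quarterly figures)
data = pd.DataFrame({
    "quarter": ["2017 Q1", "2017 Q2", "2017 Q3", "2017 Q4"],
    "cases": [4120, 4385, 4290, 4600],
})

# 2. Derive the headline figures
latest, previous = data["cases"].iloc[-1], data["cases"].iloc[-2]
change = 100 * (latest - previous) / previous

# 3. Generate the narrative automatically, so wording and numbers cannot drift apart
direction = "rose" if change > 0 else "fell"
headline = (
    f"In {data['quarter'].iloc[-1]} there were {latest:,} cases, "
    f"which {direction} by {abs(change):.1f}% on the previous quarter."
)

# 4. Produce the chart and write out the release text
data.plot(x="quarter", y="cases", kind="bar", legend=False, title="Cases by quarter")
plt.tight_layout()
plt.savefig("cases_chart.png")

with open("release.txt", "w") as f:
    f.write(headline + "\n")
print(headline)
```

Because the same script regenerates everything each time, a correction to the input data flows automatically through to the commentary and the chart.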

Like the statistical R2D2 of my imagination, this approach helps human statisticians, and in two really important ways. Firstly, Stat-O reduces the potential for human error – transposition and drafting mistakes and so on. But more significantly, robostat (?) frees up a massive amount of time for higher level input by statisticians – the kind of quality assurance that spots anomalous features in the data, narrative that links up to other data and topics, and adds human interest to the automated release.

The other thing about … statomatic? … is that it is just the most eye-catching of a broader range of innovations Steve and his colleagues at MoJ have brought to statistics in recent months. They include:

  • a new portal for prisons data, with embedded data visualisations that radically extends what the existing gov.uk web platform can host;
  • an associated suite of Justice data visualisation tools that are freely available to users; and
  • new developments within the Justice Data Lab to allow a wider range of analysis, with the pilot of a Justice MicroData Lab to open up access to the data.

When we launched the Office for Statistics Regulation, we aimed to stand up for the public value of statistics: to set the standards for producing and publishing statistics, to challenge when these standards are not met, and to celebrate when they are. I hope we’ve balanced challenge and celebration in a sensible way through the year and through our Annual Review.

But it’s often the way of things that the challenge attracts the most attention. So I think it’s appropriate for me to make my final blog of 2017, in what is after all a season of celebration, something of a toast to — er — I mean, a toast to — um– oh well, a toast to RAP.

 

*the due diligence department doesn’t actually exist; it was a colleague in the pub who told me about the commercial product.

Data, quality and the Code

Having spent much of the past two years talking about how to approach quality assuring administrative data, I found my thinking about the Quality pillar of the refreshed Code firmly grounded in our Quality Assurance of Administrative Data (QAAD) framework (check it out here for pointers and case examples).

But the Quality pillar is more than that, and respondents to our consultation rightly pointed out that our draft Code had not gone far enough in extending the practices to other data types. So we have revisited these principles, to make them more widely applicable and also simpler.

The Code and Quality following the Consultation

The structure of the Quality pillar is essentially the same as in the draft Code, based around a basic statistics process model: Suitable Data Sources, Sound Methods and Assured Data Quality. But we are thinking carefully about how the principle of coherence fits into that model. We absolutely agree with the importance of the practices covering coherence, consistency and comparability – we tend to think, though, that they will be clearer when integrated into the relevant principles.

So, for example, the practice about internally coherent and consistent data fits with the principle on Suitable Data Sources.  And a practice around using harmonised standards, classifications and definitions fits with the Sound Methods principle.

In fact, we are considering different aspects of coherence across the three pillars:

  • We are adding an emphasis on promoting coherence and harmonisation to the Heads of Profession for Statistics role in Trustworthiness
  • In Value, the Insightful principle promotes explaining consistency and comparability with other related statistics
  • We are emphasising the use of consistent and harmonised standards when collecting data in the Efficient Data Collection and Use principle in Value, as these support data integration and the more efficient use of data

The European Statistical System Five Dimensions of Quality

Another area that received a lot of comment in the consultation concerned our definition of quality when compared with the Quality Assurance Framework of the European Statistical System (QAF).

QAF presents five quality dimensions: relevance; accuracy and reliability; timeliness and punctuality; coherence and comparability; and accessibility and clarity.

We completely agree with the importance of these dimensions but our structure of Trustworthiness, Quality and Value frames them in a way that helps relate the practice to the outcome we are seeking:

  • We see ‘relevance’ and ‘accessibility and clarity’ as central to our Value pillar. They are critical aspects of providing information to support decision making.
  • We see ‘timeliness and punctuality’ and ‘coherence and comparability’ as cross-cutting each of the pillars – they speak to organisational processes and policies, and to meeting the needs of users for timely and comparable information, as well as relating directly to the quality of the statistics.
  • We see ‘accuracy and reliability’ as central to our Quality pillar; they inform each of the principles. We have revised the principle ‘Assured Data Quality’ to reflect the need for quality indicators to cover timeliness and coherence, as well as accuracy.
  • Producers should also monitor user satisfaction with each of the five quality dimensions under the Value principle ‘Reflecting the Range of Users and Uses’.

By regularly monitoring the quality indicators for the five quality dimensions and reporting them transparently, statistics producers can reassure users of the suitability of the statistics to meet their intended uses.

Health statistics

In the last few weeks, we’ve made three comments on health statistics – one in England, about leaks of accident and emergency data; one in Scotland, on statistics on delayed discharges; and one on analysis at the UK level. They all show the importance of improving the public value of statistics.

On accident and emergency statistics, I wrote to the heads of key NHS bodies in England to express concern about recent leaks of data on performance.

Leaks of management information are the antithesis of what the Office for Statistics Regulation stands for: public confidence in trustworthy, high quality and high value information.

It’s really hard to be confident about the quality of leaked information because it almost always lacks context, description, or any guidance for users. On value, leaked information usually relates to a question of public interest, but it’s not in itself valuable, in the sense that it’s not clear how it relates to other information on the same topic. Its separate, isolated nature undermines its value. And it’s hard for leaked information to demonstrate that it is trustworthy, because the anonymous nature of the “producer” of the information (the person who leaked it) means that motives can be ambiguous.

But leaks can highlight areas where there is concern about the public availability of information. And that was the constructive point of my letter: the NHS bodies could look into reducing the risk of leaks. One way of doing this would be to reduce the time lag between the collection of the information on accident and emergency performance, and its publication as official statistics. This lag is currently around 6 weeks – 6 weeks during which the performance information circulates around the health system but is not available publicly. Shorten this lag, I argue, and the risk of disorderly release of information may also reduce.

The comments on Scotland relate to the comparability of statistics across the UK. When NHS Scotland’s Information Services Division published its statistics on delayed discharge from NHS hospitals for February, the Cabinet Secretary for Health and Sport in the Scottish Government noted that these figures compared positively to the equivalent statistics in England.

This is of course an entirely reasonable thing for an elected representative to do – to comment on comparative performance. The problem was that ISD’s publication did not give users the information needed to interpret the Scottish statistics in a UK context – it wasn’t clear that the Scotland figures are compiled on a different basis from the England figures, so the comparison is not on a like-for-like basis. Nor was the difference stated alongside the equivalent statistics for England. This clarification has now been provided by ISD, and NHS England have agreed to make clearer the differences between the figures in their own publication.

For us, it’s really important that there is better comparability of statistics across the UK. While there are differences in health policy that will lead to different metrics and areas of focus, it’s quite clear that there is public interest in looking at some issues – like delayed discharge – across the four UK health systems.

In this situation, good statistics should help people make sound comparisons. Yet, with health and care being a devolved matter, there are some constraints on the comparability of statistics across England, Wales, Scotland and Northern Ireland. And it is difficult for the untrained eye to know what is or is not comparable – with delayed discharge data a prime example. This is why we really welcome the recently published comparative work, led by the Scottish Government, in which statisticians have created a much more accessible picture of health care quality across the UK, pulling together data on acute care, avoidable hospital admissions, patient safety, and life expectancy/healthy life expectancy across all four UK countries.

Both these cases – the leaks and comparability – illustrate a broader point.

Health statistics in the UK should be much better. They should be more valuable; more coherent; in some cases more timely; and more comparable. If statistics do not allow society to get a clear picture in good time of what is going on, then they are failing to provide public value.