A reason to be optimistic: sharing and linking data on road traffic collisions

In our latest blog, Head of OSR Ed Humpherson discusses how data sharing and linkage can provide vital insight into the problems and potential solutions when looking at road traffic collision data.

At the start of 2025, OSR published a rather optimistic piece on the potential for data sharing and linkage. Data sharing and linkage can yield new insights, identify previously hidden problems, and highlight what works and what doesn’t. It has huge potential to serve the public good.

But it’s also difficult to achieve, and there are still lots of frustrated researchers who have not been able to progress their work because they can’t access the data that they need.

So why are we optimistic? Partly, it’s a top-down perspective: we’ve seen progress through the increasing maturity of the UK-wide facilitation of data sharing and linkage provided by the excellent Administrative Data Research UK, reflected in programmes like the Ministry of Justice’s Data First.

But it’s also because, in some specific policy areas, there is a growing bottom-up drive to make better use of datasets by linking them to others, and enhancing the insight that they can provide.

Developments in data on road traffic collisions provide the best grounds for my optimism. The Department for Transport (DfT) publishes a long-standing data set on road traffic fatalities. The statistics show that the UK does well in international comparisons of road traffic fatalities per capita. They are based on a consistent set of categories for recording traffic collisions by police forces in England, Wales and Scotland, using a system called STATS19. They are well presented and clearly explained.

But the STATS19 data set has some limitations. The data series does not capture all traffic collisions, nor does it record all injuries. And as with all data based on police recording, the incidents recorded are those that come to the police’s attention – and not all do. To its credit, DfT is clear about these limitations in its annual statistical release.

Moreover, the picture painted by the traffic fatalities statistics can hardly be described as positive. Every fatality is a personal tragedy, impacting the families and friends of those involved in a deep and difficult way. And the long-term declines in fatalities seem to have stalled over the last decade, as shown in Chart 1 in the annual report here:

Figure 1: All road users killed in traffic collisions in Great Britain, 1979 to 2023

The chart shows a decline in road users killed in traffic collisions in Great Britain from 1979, with the decline slowing between 2013 and 2023. The chart was originally published on the Department for Transport website. The data can be found here.

So, we should welcome anything that can give us more insight into the problems and potential solutions. This is where linked data comes in. By linking STATS19 data to ambulance data and hospital records, we can get a much richer picture of collisions – where they happen; who is affected and, just as importantly, the full extent of their injuries; how the victims are treated by the health care system; and the outcomes of their treatment. And this information can in turn help answer important questions, like why it is that the reductions in fatalities appear to have stalled, and whether there are practices and interventions that can reduce collisions and increase people’s survival chances.

The potential for the linkage of STATS19, ambulance and hospital data is the basis of an excellent paper by Seema Yalamanchili of Imperial College (PDF download), which in turn was the starting point for a round table I attended in January. The meeting was convened by the RAC Foundation and took place at the Royal Automobile Club. Seema presented her paper, setting out the case for this data linkage, describing the barriers to linking the STATS19 data – technical, legal and cultural alike – and, crucially, laying out a clear plan for addressing these barriers.

The meeting at the RAC Foundation was one of the most constructive, positive meetings that I’ve attended on data sharing and linkage. It was chaired by the RAC Foundation, and included people who produce the official statistics for the Department for Transport and the Department of Health and Social Care; NHS England; policy and scientific leaders from those departments; surgeons who work in trauma care; transport and health researchers; and data governance experts.

A lot of the meetings I’ve attended on data sharing and linkage have focused on setting out all the barriers and constraints. And there are indeed a number of challenges. First of all, in any endeavour of this kind, the project should test whether what it is proposing is publicly acceptable. This needs to be done through a process of public involvement that listens to how people feel about linking sensitive pieces of information.

Then there is the legal authorisation – is what is proposed lawful, and who needs to approve it? This element can be complex and time-consuming, as any researcher who has proposed working with healthcare data can attest.

And beyond these ethico-legal considerations, how technically feasible is the linkage? Do the datasets have enough common identifiers for records to be linked with a reasonable degree of confidence? How easy is it to link a record of a road traffic injury to the trauma centre where the patient is treated?
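
To make the linkability question concrete, here is a minimal sketch of rule-based record matching in Python. Everything in it is an assumption for illustration: the `Record` fields, the tolerances and the sample data are hypothetical and do not reflect the real STATS19, ambulance or hospital schemas, nor the method proposed in the paper. It simply counts how many shared fields agree between a police record and a hospital record, and accepts a link only when exactly one candidate clears the threshold.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified record; not a real STATS19 or hospital schema."""
    record_id: str
    event_date: date
    age: int
    sex: str
    area_code: str


def agreement_score(a: Record, b: Record) -> int:
    """Count how many of the four shared fields agree (0 to 4)."""
    return sum([
        abs((a.event_date - b.event_date).days) <= 1,  # dates within one day
        abs(a.age - b.age) <= 1,                       # ages within one year
        a.sex == b.sex,
        a.area_code == b.area_code,
    ])


def link(police, hospital, threshold=4):
    """Return (police_id, hospital_id) pairs for unambiguous matches only."""
    links = []
    for p in police:
        candidates = [h for h in hospital if agreement_score(p, h) >= threshold]
        if len(candidates) == 1:  # skip records with no candidate or several
            links.append((p.record_id, candidates[0].record_id))
    return links


# Made-up sample data, purely for illustration.
police = [
    Record("P1", date(2023, 5, 1), 34, "F", "LS1"),
    Record("P2", date(2023, 5, 1), 60, "M", "LS2"),
]
hospital = [
    Record("H1", date(2023, 5, 2), 35, "F", "LS1"),
    Record("H2", date(2023, 6, 10), 60, "M", "LS2"),
]

strict_links = link(police, hospital)               # all four fields must agree
loose_links = link(police, hospital, threshold=3)   # any three fields suffice
```

With the strict threshold only the first pair is linked; loosening it to three fields also links the second pair, even though the dates are weeks apart. That trade-off between missed links and false links is exactly why the number and quality of common identifiers matters.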

All these issues – public perception, legal context, technical data quality and linkability – are complex in their own right. It can take a lot of time to work through each of them. But underlying these substantive issues lurks a deeper issue: it seems as though the culture of data-owning organisations is not always conducive to data linking. This could be for a range of reasons, including risk aversion or a lack of incentives. Whatever the cause, the result is that organisations are less supportive of data linking than their leaders claim to be.

The RAC Foundation meeting was different from many others I’ve attended on linkage. Led by Seema’s presentation, and drawing on her paper, it focused less on the barriers themselves, and more on what attendees can do collectively to address them. We all focused on what can be done, not what can’t be done. For example, the DfT lead statistician said that, if the linkage took place, he would be keen to include insights from the linked dataset in the annual publication.

The meeting ended with a clear commitment to take the work forward: to enrich the official statistics on road traffic collisions; to link data for more insight into trauma care; and to make a difference to a societal problem that continues to devastate victims and loved ones. Within a couple of weeks of the RAC Foundation meeting, a working group involving all the key players had sprung up. All this points to a building momentum for change.

Of course, it may be that there are further challenges ahead. But this project shows that, with creativity, ambition and focus, progress is possible – and that cultural barriers to data linkage are by no means fixed. I hope this approach becomes the norm when people seek to use data to serve the public good.

So, why am I optimistic? Because of initiatives like this.

Next stop: National Statistics status

In 2020 we assessed estimates of station usage produced by the Office of Rail and Road (ORR) and designated them as National Statistics in December. In this blog, Lyndsey Melbourne, Head of Profession for Statistics at ORR, and Anna Price, the lead regulator for the assessment, talk about their experience and why assessments of official statistics are so valuable.

Where it all began…

Lyndsey: Most of our statistics were designated as National Statistics in 2012. In 2019 OSR carried out a compliance check and confirmed they continued to uphold the high standards expected. At the time we also discussed future assessments – in particular our most popular set of statistics, estimates of station usage, had never been assessed. These statistics provide unique information about each of the 2,500+ mainline rail stations in Great Britain. The granularity of data is one of the main reasons that these statistics are of interest to a very broad range of users: they are relevant to anyone no matter where they live. We were keen to further promote the quality and value of the statistics by gaining National Statistics status.

Anna: This assessment was a bit different to others. Usually we do our review, publish our findings and requirements, and then give producers a few months to meet these requirements. But when we first met with Lyndsey and Jay, the lead statistician for estimates of station usage, in April 2020 they told us they were keen to get National Statistics status in time for the next statistical release in December.

The assessment process

Anna: To support ORR to achieve this ambition, we adapted our usual process, for example sharing our findings and requirements as we developed them. This let the statisticians at ORR start on improvements while we worked on more complex findings and wrote our report, instead of waiting until the end. We had lots of meetings with the team during the project and were really impressed with the ideas they came up with each time we raised an area for improvement. I think the flexibility and enthusiasm of both teams were the reason that the project was so successful.

Lyndsey: Throughout the assessment, OSR were flexible and happy to work with us to agree timescales to fit in with our publication plans and around our day jobs. We were keen to work towards achieving National Statistics designation of the statistics in time for our next annual publication planned for December 2020. Otherwise, it would be up to 20 months before we could publish designated statistics!

OSR were very accommodating to this request and we worked closely during the following eight months to review and improve our statistics. OSR’s flexible approach allowed emerging requirements from their assessment to be addressed during the production process of the next set of statistics.

It’s fair to say that producing the annual publication at the same time as addressing OSR requirements was a challenge, but being able to confirm to users that our statistics had been successfully designated as National Statistics on publication day was very satisfying.

The value of this assessment

Lyndsey: During the assessment OSR spoke to a range of users and the feedback they obtained was extremely valuable. We have continued to speak to these users to understand their use of our statistics and how they could be improved further.

The improvement plan we developed to address OSR requirements and other feedback from users was a really useful tool for us. Sharing ideas and drafts with OSR along the way and getting their feedback was another valuable part of the process. We published this improvement plan on the user engagement page of our data portal to keep users up to date on the changes we were making.

Anna: The users of these statistics are passionate about them. So it was a lot of fun to hear about how they use the statistics, what they like and what would make using them even better. Seeing the variety of people who use the statistics, for a variety of purposes, was really motivating – it made it even more satisfying when we saw changes to the statistics in December which met the user needs we had identified.

At OSR we like to champion good practice, as well as areas for improvement. So it was nice to highlight the great work that ORR were already doing on these statistics – like the Twitter Q&A that ORR host on publication day, this year accompanied by a launch video and a live YouTube Q&A. It’s great to see statisticians putting themselves out there to talk about their statistics directly with users.

To keep up to date with our latest work, you can follow us on Twitter and sign up to our monthly newsletter.

The people behind the Office for Statistics Regulation in 2020

This year I’ve written 9 blogs, ranging from an exploration of data gaps to a celebration of the armchair epidemiologists. I was thinking of making it to double figures by setting out my reflections across a tumultuous year and describing my pride in what the Office for Statistics Regulation team has delivered. But, as so often in OSR, the team is way ahead of me. They’ve pulled together their own year-end reflections into a short summary. Their pride in their work, and their commitment to the public good of statistics, really say far more than anything I could write; it’s just a much better summary.

So here it is (merry Christmas)

Ed Humpherson

Donna Livesey – Business Manager

2020 has been a hard year for everyone, with many very personally affected by the pandemic. Moving from a bustling office environment to living and working at home alone had the potential to make for a pretty lonely existence, but I’ve been very lucky.

This year has only confirmed what a special group of people I work with in OSR. Everyone has been working very hard but we have taken time to support each other, to continue to work collaboratively to find creative solutions to new challenges, and to generously share our lives, be it our families or our menagerie of pets, albeit virtually.

I am so proud to work with a team that have such a passion for ensuring the public get the statistics and data they need to make sense of the world around them, while showing empathy for the pressures producers of statistics are under at this time.

We all know that the public will continue to look to us beyond the pandemic, as the independent regulator, to ensure statistics honestly and transparently answer the important questions about the longer-term impacts on all aspects of our lives, and our children’s lives. I know we are all ready for that challenge, as we are all ready for that day when we can all get together in person.

Caroline Jones – Statistics Regulator, Health and Social Care Lead

2020 started off under lockdown, with the nation gripped by the COVID-19 pandemic and avidly perusing the daily number of deaths, number of tests, volume of hospitalisations and number of vaccines. This level of anxiety has pushed more people into contacting OSR to ask for better statistics, and it has been a privilege to work at the vanguard of the improvement to the statistics.

To manage the workload, the Health domain met daily with Mary (Deputy Director for Regulation) and Katy, who manages our casework, so we could coordinate the volume of health-related casework we were getting in. We felt it important to deal sympathetically with statistics producers, who have been under immense pressure this year, to ensure they changed their outputs so that they were producing the best statistics possible. It’s been rewarding to be part of that improvement and change, but we still have a lot of work to do in 2021 to continue to advocate for better social and community care statistics.

Leah Skinner – Digital Communications Officer

As a communications professional who loves words, I very often stop and wonder how I ended up working in an environment with so many numbers. But if 2020 has taught me anything, it’s that the communication of those numbers, in a way that the public can understand, is crucial to make sure that the public have trust in statistics.

This has made me reflect on my own work, and I am more determined than ever to make our work, complex as it can be, as accessible and as understandable to our audiences as possible. For me, the highlight of this year has been watching our audience grow as we have improved our Twitter outputs and launched our own website. I really enjoy seeing people who have never reached out to us before contacting us to work with us, whether it be to do with Voluntary Application of the Code, or to highlight casework.

As truly awful as 2020 has been, it is clear now that the public are far more aware of how statistics affect our everyday lives, and this empowers us to ask more questions about the quality and trustworthiness of data and hold organisations to account when the data isn’t good enough.

Mark Pont – Assessment Programme Lead

For me, through the challenges of 2020, it’s been great to see the OSR team show itself as a supportive regulator. Of course we’ve made some strong interventions where these have been needed to champion the public good of statistics and data. But much of our influence comes through the support and challenge we offer to statistics producers.

We published some of our findings in the form of rapid regulatory review letters. However, much of our support and challenge was behind the scenes, which is just as valuable.

During the early days of the pandemic we had countless chats with teams across the statistical system as they wrestled with how to generate the important insights that many of us needed, all in the absence of the usual long-standing data sources and while protecting often restricted and vulnerable workforces who were adapting to new ways of working. It was fantastic to walk through those exciting developments with statistics producers, seeing first-hand the rapid exploitation of new data sources.

2021 will still be challenging for many of us. Hopefully many aspects of life will start to return to something closer to what we were used to. But I think the statistical system, including us as regulators, will start 2021 from a much higher base than 2020 and I look forward to seeing many more exciting developments in the world of official statistics.

Emily Carless – Statistics Regulator, Children, Education and Skills Lead

2020 has been a challenging year for producers and users of children, education and skills statistics, and one that has had a life-changing impact on the people the statistics are about. We started the year polishing the report of our review of post-16 education and skills statistics and are finishing it polishing the report of our review of the approach to developing the statistical models designed for awarding grades. These statistical models had a profound impact on young people’s lives and on public confidence in statistics and statistical models.

As in other domains, statistics have needed to be developed quickly to meet the need for data on the impact of the pandemic on children and the education system, and to inform decisions such as those around re-opening schools. The demand for statistics in this area continues to grow to ensure that the impact of the pandemic on this generation can be fully understood.