Listen to your enthusiasts: Implementing RAP at Public Health Scotland

This is a guest blog from Scott Heald following the launch of our new report: Reproducible Analytical Pipelines (RAP): Overcoming barriers to adoption.

Firstly, let me introduce myself. I’m Scott, the Head of Profession for Statistics at Public Health Scotland (PHS). PHS is a new body, formed in April 2020 to lead on tackling Scotland’s public health challenges, although our first year has been dominated by the COVID-19 pandemic. Our RAP journey started when we were in NHS Scotland’s Information Services Division (ISD), the health and care statistics body which now forms part of PHS.

PHS, and ISD before it, has been a big fan of RAP from the beginning. I wanted to share our story, from a bunch of enthusiastic statisticians who convinced me it was the right thing to do (I didn’t need much convincing!), to embedding it within our organisation in our reporting on COVID-19.

Our RAP journey began with a programme of work to transform how we published our statistics. It quickly became clear that the programme had to be as much about our processes for producing statistics as about the final published output. More automation was key – to speed up processes, eliminate manual errors and release capacity to add value to our statistics. Greater job satisfaction for our statisticians was a welcome impact too.

“We don’t use that here”

Up till this point, our software of choice was proprietary. More and more graduates were joining our organisation and wondering why we weren’t using open-source software like R, having been taught it at university. I guess, in those early days, I was probably part of the “we don’t use that here” camp (partly out of fear, as I had not personally used any of the new software they were talking about).

However, I was persuaded (and willing) to let a group of our statisticians show us what could be done using R. To cut a long story short, our group of enthusiasts showed the power of what could be done, and PHS is now making the strategic shift to follow RAP principles as a way of working.

The art of persuasion

I’d describe our RAP journey as “bottom up”, with support from the “top down”. When we started seeing the results, we didn’t need much convincing. OSR’s report showcases our work on Hospital Standardised Mortality Ratios – a quarterly publication which used to take five days to run (lots of detached processes, lots of room for error). I remember vividly the first time the team ran it the RAP way. Five minutes after the process started, the finished report was in my inbox. We couldn’t believe it! And, to be sure, we spent the next five days running it the old way to make sure we got the same answers (we more or less did; the RAP way was more accurate, highlighting a few errors we fixed along the way!).
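
For anyone curious what that kind of side-by-side check can look like in code, here is a minimal sketch. It is an illustration only, not the code PHS used: it is written in Python rather than R, the function name compare_outputs is hypothetical, and the figures are invented for the example.

```python
import math


def compare_outputs(old, new, rel_tol=1e-9):
    """Return (key, old_value, new_value) for every figure that disagrees.

    A figure present in only one of the two runs is also reported,
    with None standing in for the missing value.
    """
    differences = []
    for key in sorted(set(old) | set(new)):
        old_value = old.get(key)
        new_value = new.get(key)
        if old_value is None or new_value is None:
            differences.append((key, old_value, new_value))
        elif not math.isclose(old_value, new_value, rel_tol=rel_tol):
            differences.append((key, old_value, new_value))
    return differences


# Hypothetical figures, invented purely for illustration.
old_run = {"unit_a": 0.98, "unit_b": 1.05, "unit_c": 1.10}
new_run = {"unit_a": 0.98, "unit_b": 1.05, "unit_c": 1.01}

differences = compare_outputs(old_run, new_run)
for key, old_value, new_value in differences:
    print(f"{key}: old={old_value}, new={new_value}")
```

A report like this only says where two runs differ; deciding which run is right still takes someone who knows the data.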

Our learning is that it’s a relatively easy shift for more recent graduates because they already know R. The focus for our training had to be on members of staff who have been with us for longer and weren’t familiar with it. And that can take some persuading – team leaders find themselves managing teams who are using software they have never used themselves. We had to support and train at all levels (with tasters for managers, who may not need to delve into the finer details of R themselves but know they would have learned it if they were starting out again).

So, what have we learnt?

  1. Be prepared to try new ways of working
  2. Listen to your staff, who have a different perspective and fresh take on ways of working
  3. Start small – it’s easy to make the case when you can showcase the benefits of the RAP approach
  4. RAP improves the quality of what we do and eliminates errors
  5. Be prepared to invest in training – and recognise your staff will be in different places
  6. Use buddies – our central transformation team certainly helped with that, creating capacity in teams to RAP their processes
  7. Be open and share your code – we publish our code on GitHub, a great community to share ideas and approaches

Listen to your enthusiasts

OSR’s report highlights that we have a small central transformation team to support teams with their RAP work. This is crucially important as the initial work to RAP your processes can take time, so the additional capacity to support our teams to enable this to happen is a must. This initial investment is worth it for the longer-term gains. It’s not all about making efficiencies either. It’s about streamlining processes, reducing error, and giving our analysts job satisfaction. They are now able to add more value because they have more time to delve into the data and help users understand what the data are telling them.

Our focus on transforming the processes for our existing publications has stalled due to many of our staff being redirected to supporting PHS’s key role in Scotland’s response to the COVID-19 pandemic. However, a success of our RAP work is that many of our new processes required to produce our daily COVID-19 statistics are done using RAP principles – they had to be as we’re producing more statistics, more frequently than ever before. Our earlier use of RAP meant we were in a good place to apply the techniques to our new ways of working.

And my final bit of advice? Listen to your enthusiasts – I’m glad I did.

Now, about that Excel spreadsheet…

I wouldn’t blame you if you were scratching your head at the outrage expressed last week that Excel was being used to record the information on COVID-19 test results in England. After all, it’s the most used spreadsheet tool in the world. It’s also a computer program which, along with other proprietary software, has been used in public sector analysis for decades.

The reason for all this concern is that it’s easy to make mistakes with Excel – like referencing the wrong cell in a calculation (we’ve all done it). And once you’ve made the mistake, it’s hard to find it. It’s not clear who has been using your spreadsheet and changed it (or, even worse, whether Excel has taken it upon itself to change it for you). This might not matter if your spreadsheet is for holiday planning or your personal budget (yes, we’re that kind of nerd). It definitely does matter when your spreadsheet is used by multiple people to produce and present official statistics, and what’s more – there is a better way.

Many statisticians and analysts are now starting to think differently and move away from off-the-shelf software with the aim of solving these problems. Within the Government Statistical Service this approach is known as a Reproducible Analytical Pipeline (also fondly referred to as RAP). It’s sometimes mis-characterised as simply automation, but it is so much more than that.

So…What is RAP?

RAP is a set of good practices and principles. RAP requires built-in checks and ensures a guaranteed audit trail of changes using version control software like git (which comes in handy if something goes wrong and you need to roll back a version!). It champions working in the open, through the publication and peer review of code on sharing and version control platforms such as GitHub. This allows collaboration, reuse of code by others and improves trust from users. RAP also enshrines good practice, such as well-commented and documented code, or appropriately stored and structured data. These good practices help prevent all sorts of issues from creeping in – like the flow of data being disrupted as a result of processes that are easily manually manipulated.

The end result is a higher quality, more transparent and more efficient process, allowing more time for statisticians to use their skills to add insight and value to their outputs.
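
To make the idea of built-in checks a little more concrete, here is a minimal sketch of a pipeline step that refuses to produce a total until its input passes some basic validation. It is an assumption-laden illustration rather than anything prescribed by RAP guidance: the field names (date, area, count), the function names and the sample rows are all hypothetical, and Python is used purely as an example language.

```python
REQUIRED_FIELDS = ("date", "area", "count")


def find_problems(rows):
    """Return a list of human-readable problems found in the input rows."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        missing = [field for field in REQUIRED_FIELDS if field not in row]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        if not isinstance(row["count"], int) or row["count"] < 0:
            problems.append(f"row {index}: count must be a non-negative integer")
        key = (row["date"], row["area"])
        if key in seen:
            problems.append(f"row {index}: duplicate entry for {key}")
        seen.add(key)
    return problems


def total_count(rows):
    """Sum the counts, but only after every check has passed."""
    problems = find_problems(rows)
    if problems:
        raise ValueError("; ".join(problems))
    return sum(row["count"] for row in rows)


# Hypothetical rows, invented purely for illustration.
good_rows = [
    {"date": "2020-10-01", "area": "north", "count": 12},
    {"date": "2020-10-01", "area": "south", "count": 10},
]
bad_rows = [
    {"date": "2020-10-01", "area": "north", "count": 12},
    {"date": "2020-10-01", "area": "north", "count": 9},   # duplicate key
    {"date": "2020-10-02", "area": "south", "count": -3},  # negative count
    {"date": "2020-10-02", "area": "east"},                # missing count
]

print(total_count(good_rows))
for problem in find_problems(bad_rows):
    print(problem)
```

Because the checks live in the code, they run every time the pipeline runs, and any change to them shows up in the version control history alongside everything else.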

RAP to the Future

At the Office for Statistics Regulation (OSR), we see the incredible progress that has been made by official statistics producers to RAP their work. But this progress appears in pockets and there is still a way to go to make sure that RAP is not only the default approach, but that all of its elements are applied. We know that barriers to RAP exist, whether it’s access to the right tools and training or the time and support to carry out the upfront work required. This is why at OSR we have launched a review to explore the use of RAP across government statistics in more detail. We want to better understand what enables successful implementation of RAP and what prevents people either implementing RAP fully or applying elements of it. If we understand these barriers then we can do more to help resolve them and ultimately the quality, trustworthiness and value of official statistics will improve.

Now, about that Excel spreadsheet…

COVID-19 has challenged statistics producers in a way that has never been seen before, and they should be proud of the way they have risen to this challenge. Statistics were (and are being) produced from scratch and at record pace to inform both government and the public during these unprecedented times, and this contribution should be celebrated. While the error with the Excel spreadsheet was not directly part of official statistics production, there are still lessons we can learn from it. It highlights some important questions, such as:

  • what tools and support were available to producers when they needed it most?
  • was RAP the approach taken to setting up this new work? If not, why not?
  • and how can the good practices of RAP be effectively implemented when time is short and the pressure is high?

Although our review does not focus only on COVID-19 statistics, these are the sort of questions we want to explore in order to help statistics producers on their RAP journey. If you have experience with this, or any other RAP process, please contact us at Anna.Price@Statistics.gov.uk or Emily.Tew@statistics.gov.uk – we’d love to hear your views.

Because while we can’t fix the past, we should RAP the future.