Which COVID-19 deaths figures should I be looking at?

Every day we see figures for number of COVID-19 deaths in the UK quoted in the media, but what do these mean, and which figures should we pay most attention to?

With the rising death rate, and the complexity and potential confusion surrounding this seemingly straightforward measure of the impact of COVID-19, we are increasingly being asked our view on which data should be regarded as the best measure of COVID-19 deaths.

Of course, whichever way the numbers are presented, each individual death is a sad event. But it is really important to understand the strengths and limitations of the data being considered in order to understand the pandemic and learn from what the UK has experienced.

There are many official sources of data and each has a place in helping to understand the impact of COVID-19. Our blog from August goes into more detail about the different sources, their uses and limitations. Here we outline some of the key issues to consider when thinking about which figures to use.

What is the difference between figures by date of death and figures based on date reported? Which should I use?

A commonly used headline is the number of deaths reported each day on the UK Government’s coronavirus dashboard, based on deaths which occurred within 28 days of a positive COVID-19 test. It is understandable that this figure makes headlines: it is the timeliest data published, capturing all the additional deaths (within 28 days of a positive test) which government has been made aware of within the previous 24-hour reporting period. However, it has limitations, and it is really important that reporting of these figures makes the implications of those limitations clear.

As well as data by date reported, the UK Government’s coronavirus dashboard includes data on deaths within 28 days of a positive COVID-19 test by date of death, on the deaths page of the dashboard. These are usually considered to be reasonably complete from about five days after the reference date. Data by date reported show large fluctuations in numbers, particularly after weekends and bank holidays; data by date of death give a better sense of the development of the pandemic and the changing rate of deaths.

This difference between figures by date reported and by date of death has been particularly notable in the period following Christmas and New Year, given the bank holidays and the higher rates of deaths seen over this period. For example, looking at data published on 21 January for deaths within 28 days of a positive COVID-19 test:

  • Deaths by date of death have a current peak of 1,117 on 12 January (compared with a peak of 1,073 on 8 April).
  • Deaths by date reported have a peak of 1,820 on 20 January (compared with a peak of 1,224 on 21 April).

Data by date of death should always be used if possible.

How can I best understand if COVID-19 was the cause of death?

The data outlined on the coronavirus dashboard, highlighted above, are based on deaths within 28 days of a positive test. There will be cases where an individual had a positive COVID-19 test but it was unrelated to the subsequent death. There will also be cases where a death was due to COVID-19 but occurred more than 28 days after a positive test result. PHE has published a technical note which looks at the impact of the 28-day cut-off compared with alternative measures.
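
To make the dashboard’s headline definition concrete, here is a minimal sketch of the 28-day inclusion rule, assuming the interval is measured from the date of the (first) positive test to the date of death. The function name and dates are our own, purely illustrative:

```python
from datetime import date

def counted_in_headline_measure(positive_test: date, death: date) -> bool:
    """Illustrative 28-day rule: count a death if it occurred within
    28 days of the positive COVID-19 test."""
    days_between = (death - positive_test).days
    return 0 <= days_between <= 28

# Hypothetical examples: a death 20 days after a positive test is counted;
# a death 45 days after a positive test is not.
print(counted_in_headline_measure(date(2021, 1, 1), date(2021, 1, 21)))   # True
print(counted_in_headline_measure(date(2020, 11, 20), date(2021, 1, 4)))  # False
```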

A more reliable measure is based on data drawn directly from the system of death registrations and includes deaths where COVID-19 is mentioned on the death certificate. The Office for National Statistics (ONS) publishes weekly figures, including a UK figure drawing on data from National Records of Scotland (NRS) and the Northern Ireland Statistics and Research Agency (NISRA).

ONS data are based on information from death certificates and include cases where COVID-19 is likely to have contributed to the death (either confirmed or suspected) in the opinion of the certifying doctor. The provisional count is published weekly, 11 days after the end of the period it covers. These data have many strengths, but the provisional figures first published will not capture all deaths, owing to registration delays.

How can I best understand the impact of the pandemic on deaths?

The measures outlined above all aim to count deaths where COVID-19 infection was a factor. A broader measure, which looks at the change in deaths because of the pandemic whether or not due to a COVID-19 infection, is “excess deaths”: the difference between the number of deaths we have observed and the number we would have expected to see. This is generally considered to be the best way to estimate the impact of a pandemic or other major event on the death rate.
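
In arithmetic terms the calculation is simple: excess deaths equal observed deaths minus expected deaths, with the expected figure typically taken from a historical baseline such as a five-year average. A minimal sketch, with hypothetical weekly figures:

```python
# Minimal sketch of an excess deaths calculation. All figures are
# hypothetical; in its weekly statistics ONS uses the average of the
# same week over the previous five years as the expected baseline.
observed_deaths = 14000
same_week_previous_five_years = [10900, 11200, 10800, 11100, 11000]

expected_deaths = sum(same_week_previous_five_years) / len(same_week_previous_five_years)
excess_deaths = observed_deaths - expected_deaths

print(f"expected: {expected_deaths:.0f}, excess: {excess_deaths:.0f}")
# expected: 11000, excess: 3000
```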

ONS published a blog alongside its latest publication of excess deaths, which highlights the complexities in this measure. For example, a single figure comparing how many deaths there have been in one year with a previous year may not be helpful, because the population changes. For this reason, in addition to providing counts of total deaths, ONS produces estimates of excess deaths in a number of different ways. In its weekly statistics it compares numbers and rates to a five-year average, so that it is comparing a similar period in terms of life expectancy, advances in healthcare, and population size and shape. It also publishes Age Standardised Mortality Rates for England and Wales, so that rates can be compared while taking into account changes in the population’s size and structure.
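
For reference, the direct standardisation behind an age-standardised mortality rate weights each age group’s death rate by a fixed standard population, so that comparisons are not distorted by changes in age structure. A sketch of the standard formula:

```latex
% Direct age standardisation (rate per 100,000):
% d_i = deaths in age group i, n_i = population in age group i,
% w_i = standard population weight (e.g. the European Standard Population).
\mathrm{ASMR} = \frac{\sum_i w_i \,(d_i / n_i)}{\sum_i w_i} \times 100\,000
```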

The challenges of counting COVID deaths

During the coronavirus pandemic a key question has been: How many people have died because of COVID-19? This seemingly straightforward question is surprisingly difficult to answer.

The complexity lies partly in the different ways this can be measured. Is it about how many people died because of a COVID-19 infection? Or, how many more deaths there have been because of the pandemic, whether a direct or indirect result of COVID-19 (‘excess deaths’)?

Even when the question is clear, the variety of data published by different organisations can make it hard to know what the right answer is. The official published data cover varying sources, definitions and methodologies. The factors leading to differences in published death figures are set out below, but how much each factor contributes to those differences does not appear to be fully understood and needs to be more clearly explained.

Each of the sources supports a different purpose. Greater clarity on these purposes would support a better understanding of the data and improve confidence in the estimates produced by government.

What data are available?

The Office for National Statistics (ONS) has published a summary of data sources available across the UK. This provides a good summary of the range of data available, and Section 7 sets out a useful table showing how the sources differ. However, the article does not make any judgement on the impact of these differences or on the best source of data to use in specific circumstances.

Estimates of ‘excess deaths’ are the difference between the number of deaths we have seen and the number we would expect to have observed at this time of year. This is generally considered to be the best way to estimate the impact of a pandemic on the death rate. ONS and Public Health England (PHE) have both published estimates. The most recent ONS publication is clearly explained and provides comparisons across 29 countries, with information published at UK, country and local authority levels. The methodology published alongside the PHE report explains how PHE draws on ONS data to produce its estimates.

There are also estimates of the number of people who have died as a result of a COVID-19 infection, a really important factor in understanding COVID-19 and the development of the pandemic. For England, there are three main sources of daily COVID-19 deaths data:

  • Office for National Statistics (ONS) Weekly deaths: The provisional count is published weekly, 11 days after the end of the time period it covers. Figures are drawn directly from the system of death registration and include all deaths where COVID-19 is mentioned on the death certificate. These figures cover all settings and include a breakdown by setting. Counts are published for date of death and date of registration.
  • Public Health England (PHE) Surveillance data: Published daily, these estimates cover deaths in all settings, by date of reporting or date of death, for any individual with a positive COVID-19 test result. There is currently no cut-off for the date of the positive test relative to the date of death.
  • NHS England (NHSE) Hospital deaths: Published daily, these figures cover hospital deaths with a positive COVID-19 test. Since 24 April, figures are also published for instances where COVID-19 is referenced on the death certificate but no positive COVID-19 test result was received. Since 19 June, if a death occurs more than 28 days after a positive test it is not included in the headline series (though it would still appear in the figures for COVID-19 mentions on a death certificate with no positive test result).

In all three sources, the organisations make information available on when deaths were reported or registered as well as the date the death occurred. The data relating to date of death are considered through the rest of this blog, as this is the most informative headline measure and the one most directly comparable between the three sources. The date on which a death is reported or registered can vary for a number of reasons, generally linked to administrative processes, and therefore produces a more volatile series. While this registration information has value, the uses and limitations of these data should be clearer. The date of death should be used as the headline measure for understanding when deaths occurred.

How much do the sources vary?

There are valid reasons for differences in the figures for numbers of COVID-19 deaths published from each of the three sources outlined above. Each source is published to meet a different purpose and therefore has value in its own right. However, the purpose of each source, and what it seeks to measure, is not always clear. For example, the more timely data from PHE offer a leading indicator of the current development of the pandemic, while the ONS counts offer a more reliable indicator on a slightly slower timescale.

While differences will always occur, it is really important that the reasons for these differences are understood and well explained. This assures those using the data that they are using the most robust data for their purpose, which in turn supports better-informed decisions. Triangulation between sources can form an important part of quality assurance and may support methodological improvements over time.

When looking at the data based on date of death for all three sources, the trends are broadly consistent over the period of the coronavirus pandemic. The charts below show the data for date of death from the three sources for England, up to 24 July 2020.

Figure 1: Cumulative deaths by date of death up to 24 July 2020

A graph showing Cumulative deaths by date of death up to 24 July 2020. Please visit the sources listed for the original graphs and more data.

Sources: ONS Weekly Deaths COVID-19 – England Comparisons (NHSE deaths published by 2 August and ONS deaths registered by 1 August) and PHE England Deaths by Date of Death (5 August download).

Figure 2: Deaths by date of death up to 24 July 2020

A graph showing Deaths by date of death up to 24 July 2020. Please visit the sources listed for the original graphs and more data.

Sources: ONS Weekly Deaths COVID-19 – England Comparisons (NHSE deaths published by 2 August and ONS deaths registered by 1 August) and PHE England Deaths by Date of Death (5 August download).

While the overall trends shown by the data follow a similar trajectory, it is notable that the relative positions of the trend lines change. For much of the pandemic the ONS daily estimates of deaths were higher than the PHE daily estimates; since the last week of May, the PHE daily estimates have generally been higher than the ONS estimates for the equivalent dates.

More recent data show greater volatility, as expected given the lower numbers of deaths observed: relatively small differences in numerical terms have a greater impact in percentage terms (a gap of 20 deaths is 2% of a daily total of 1,000, but 20% of a daily total of 100). Figure 3 illustrates the difference between PHE and ONS figures over the most recent month for which both data are available. The gap between the two sources in numerical terms is volatile but broadly consistent over this period. However, because the number of deaths is falling, the percentage difference is increasing (though variable) over time. It is likely the ONS provisional counts will be revised up over time, but this is unlikely to close the observed gap fully.

Figure 3: Deaths by Date of Death 18 June 2020 to 24 July 2020

A graph showing Deaths by Date of Death 18 June 2020 to 24 July 2020. Please visit the sources listed for the original graphs and more data.

Sources: ONS Weekly Deaths COVID-19 – England Comparisons (NHSE deaths published by 2 August and ONS deaths registered by 1 August) and PHE England Deaths by Date of Death (5 August download).

Another way to corroborate the data between sources is to compare the NHSE data with the ONS data on place of occurrence (e.g. hospital, care home). ONS publishes data on place of occurrence by date of death in the local authority tables, including COVID-19 breakdowns. The trends look broadly consistent (see Figure 4), and the overall number of deaths recorded in hospitals is similar for both sources. By 24 July ONS reported a total of 31,022 deaths in hospitals, while NHSE figures showed 29,303 hospital deaths (reported by 2 August) for those with a positive test, a difference of around five per cent. If the NHSE figures for those with COVID-19 mentioned on the death certificate but no positive test are also included, the cumulative totals from the two sources are even closer.

Figure 4: Hospital deaths by date of death (week end date)

A graph showing Hospital deaths by date of death - week end date. Please visit the sources listed for the original graphs and more data.

Sources: NHSE data from ONS Weekly Deaths COVID-19 – England Comparisons (published 2 August) and ONS Death Registrations by Local Authority (4 August).

Why does this variation occur?

There are many possible explanations for the observed differences, and some estimates of the scale of their impact have been made, but it is not yet clear which factor dominates. Some of the issues that contribute to the differences are outlined below. Producers of these statistics could do more to explain the impact of each and support a clearer overall narrative.

Positive tests compared with death registrations

The most significant difference in published figures is likely to relate to whether the data are based on positive COVID-19 tests or on information from death certificates. ONS data are based on information from death certificates and include cases where COVID-19 is likely to have contributed to the death (either confirmed or suspected) in the opinion of the certifying doctor. PHE data cover all deaths where the individual had a positive COVID-19 test at some point in the past. NHSE data cover cases with positive test results (since 19 June the positive test must have been in the 28 days before the death); since 24 April NHSE has also separately published information on deaths where COVID-19 is mentioned on the death certificate but there was no positive test result.
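
As a simplified illustration of how the same death record can fall in or out of each source’s count, consider the sketch below. The field names and helper are our own and purely hypothetical; the real production systems involve cleaning and matching across several administrative datasets, and NHSE figures also cover hospital deaths only.

```python
from datetime import date
from typing import Optional

def classify(death: date,
             positive_test: Optional[date],
             covid_on_certificate: bool) -> dict:
    """Simplified sketch of the three inclusion rules described above."""
    tested = positive_test is not None
    within_28_days = tested and 0 <= (death - positive_test).days <= 28
    return {
        # ONS: COVID-19 mentioned (confirmed or suspected) on the certificate.
        "ONS": covid_on_certificate,
        # PHE: any positive test before death, with no cut-off on the interval.
        "PHE": tested,
        # NHSE headline series (since 19 June): positive test within 28 days
        # of death; NHSE figures also only cover deaths in hospital.
        "NHSE_headline": within_28_days,
    }

# Hypothetical case: positive test 40 days before death, COVID-19 on the
# death certificate. Counted by ONS and PHE, outside the NHSE headline series.
print(classify(date(2020, 7, 20), date(2020, 6, 10), covid_on_certificate=True))
# {'ONS': True, 'PHE': True, 'NHSE_headline': False}
```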

The impact of these differences in approach is unclear. For example, PHE data will include some cases where an individual had a positive test result but the death was not because of COVID-19. There will also be cases where a death was due to COVID-19 but no test had been conducted; these cases would not appear in the PHE data. It is likely the balance of these two factors has changed over the course of the pandemic as testing has become more widespread. In the earlier period, PHE’s approach may have underestimated the number of deaths from COVID-19, primarily because fewer people were tested. More recently, PHE data may be overestimating deaths from COVID-19 because the approach picks up people who died from other causes but had tested positive at some stage (either because the infection was mild and not the cause of death, or because the individual had recovered before the death occurred).

Comparison of ONS and PHE data at the level of individuals should help in understanding the impact of this issue. However, early in the pandemic it is also possible that measurement based on death certificates underestimated COVID-19 related deaths, perhaps because awareness of the virus was more limited at that stage; the impact of this is likely to remain hard to measure.

Positive test cut-offs and death registration time lags

Timing differences will also affect the estimates.

NHSE has introduced a 28-day cut-off between a positive test and the date of death, an approach also taken in some other countries. However, the impact of this cut-off, and whether it is appropriate, is currently unclear. It is likely that introducing a cut-off for the PHE data would reduce its estimates a little, but would not bring them down to the level of the ONS estimates. PHE’s work to look at the validity and implications of cut-offs of different lengths is really important. The impact of having a cut-off will become more marked in later stages of the pandemic: as more time passes, more deaths will occur among individuals who tested positive more than 28 days earlier.
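
The sensitivity to cut-off length can be illustrated by counting the same deaths under different thresholds. A minimal sketch, with hypothetical test-to-death intervals:

```python
# Hypothetical intervals (days from positive test to death) for nine deaths.
intervals = [3, 10, 15, 22, 27, 31, 45, 60, 90]

# Count the same deaths under a 28-day cut-off, a 60-day cut-off and no
# cut-off (the last matching the original PHE approach).
for cutoff in (28, 60, None):
    counted = sum(1 for d in intervals if cutoff is None or d <= cutoff)
    label = f"{cutoff}-day cut-off" if cutoff is not None else "no cut-off"
    print(f"{label}: {counted} of {len(intervals)} deaths counted")
# 28-day cut-off: 5 of 9; 60-day cut-off: 8 of 9; no cut-off: 9 of 9
```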

ONS data are based on COVID-19 being mentioned on the death certificate (suspected or confirmed). This approach has many strengths. However, the provisional figures first published will not capture all deaths, due to registration delays. ONS is clear about this limitation and publishes details of the impact of registration delays on mortality statistics. However, there is not currently an assessment specific to COVID-19, and given the unprecedented circumstances it is hard to predict the scale of the issue from past revisions. For example, the impact of deaths referred to a coroner is currently unknown and could lead to an undercount, as those deaths may not yet be formally registered. In general, most of the impact of revisions is seen in the first few weeks after initial publication.

Methods of compilation

Each of the organisations gets its data from different sources and takes a different approach to producing its estimates. The impact of these differences is not well explained.

NHS England data are based on deaths which occur in hospital and form one input into the PHE data. It would be expected that NHSE data as a proportion of PHE data would be broadly similar to the proportion of hospital deaths seen in ONS data; this is not currently the case. While some of this could be down to definitions (for example NHSE’s use of the 28-day cut-off), it is likely that other factors contribute to the difference. NHSE data are taken from submitted hospital returns and rely on the hospital matching a positive test to a patient. PHE data are drawn from multiple sources which need to be cleaned and matched, a complex process. It is possible that through this process PHE picks up some hospital deaths which have not been included in the NHSE data, but there may be other, unknown, factors contributing. Further work to understand what drives the differences between the two sources would give greater confidence in the data.

What needs to change?

It is positive to see that organisations are trying to better understand the issues associated with these data and why these differences occur. The analysis ONS and PHE are undertaking to look at differences between sources should offer a valuable insight into what is driving the differences and whether there are any changes needed in the production or interpretation of any of these statistics.

It is critical that there is greater transparency about how estimates are produced and what is driving the different numbers being published. Statisticians across relevant bodies must take a lead in understanding these data and communicating weaknesses, limitations and a coherent narrative. This will improve confidence in the data and decisions made based on these data.

Getting the data to support decisions on adult social care

The Coronavirus (COVID-19) pandemic has put the adult social care sector under the spotlight, and the Office for National Statistics has responded to the demand for trustworthy, high quality insight on the impacts of COVID-19 by providing analysis using new data sources. To further improve data sharing and fill gaps in evidence for this sector, the ONS is taking steps to improve social care statistics. In this guest blog, Sophie John explains more.

Recently, the Office for Statistics Regulation (OSR) published papers outlining gaps in the evidence on social care across the four nations of the UK. Then the COVID-19 pandemic hit, with a significant impact on care home residents and recipients of domiciliary care. The pandemic has highlighted, in line with the OSR report, the need for more information on adult social care as the demands on services continue to rise.

The OSR report also highlighted that, historically, social care has not been measured with the same depth of data and analysis as healthcare, owing to a scarcity of funding. This is problematic for researchers, academics and policy makers, who require sufficient evidence on which to base informed decisions.

Today, we have taken our first step towards improving accessibility by releasing a new interactive tool with which users can easily explore the landscape of adult social care data in one place.

The new interactive tool compiles official statistics relating to adult social care across England, Scotland, Wales and Northern Ireland, and each month it will be updated with new publications for users to browse, including fresh insight on the effect of COVID-19 on the care sector.

Next steps…

While we have met one of the OSR’s goals by improving the accessibility of official social care statistics, this is only the first of many steps required to continue improving the evidence around this sector.

The OSR report also highlights the need for improved leadership and collaboration. Our aim is to engage with stakeholders across the four nations, while working with the Government Statistical Service Harmonisation team, to help make statistics more comparable, consistent and coherent. By engaging with organisations working in similar areas, we will endeavour to ensure that work is joined up and well informed by other experts.

Further, we are working to identify the gaps in evidence in adult social care data. Areas of interest include investigating data availability on unpaid carers and self-funders to seek to improve knowledge of individual care journeys and outcomes.

Following our releases on Deaths involving COVID-19 in the care sector, England and Wales, we plan to produce a new annual ONS release reporting deaths of care home residents, to be published later this year. This will help us understand more about the causes of death among care home residents, including their characteristics, to inform policy.

The ONS, working with partners across the sector, can play an important leadership and coordination role in adult social care statistics, and our interactive landscape tool is our first step towards achieving this.

This is a guest blog from Sophie John (Head of Adult Social Care Analysis, ONS)

Rising to the challenge

One thing that has stood out in the Coronavirus pandemic is how quickly the Government Statistical System (GSS) has responded to the need for data to support important decisions made by the government during this time. In just a matter of weeks, statisticians have repurposed existing statistics and made use of new data sources. Before the crisis, this work might have taken months. HMRC’s Coronavirus Job Retention Scheme (CJRS) statistics and its Self-Employment Income Support Scheme (SEISS) statistics are among these. We recently conducted a rapid review of these statistics, and Ed Humpherson, Director General for Regulation, has written to HMRC’s Head of Profession for Statistics, Sean Whellams, supporting the approach taken to produce these statistics.

In March this year, we wrote about how our assessment of employment and jobs statistics in effect captured the statistical world in miniature, a microcosm of the statistical system. The assessment surfaced many of the common issues that statistics producers face, highlighting recurring themes from our other regulatory work. Now we see a further glimpse of our statistical world in miniature, through the lens of our recent review of HMRC’s statistics. HMRC’s response to the need for information about these crucial schemes admirably demonstrates government statisticians rising to the key challenge of the times.

There are two aspects of the statistical system which these statistics exemplify.

First, for us as statistical regulators, whether during a national crisis or otherwise, government statistics should (i) answer society’s questions in a timely way and (ii) provide insights at a level of detail useful to users. Additionally, many questions cannot be answered without sharing and linking data. In the preparation, production and publishing of the CJRS and SEISS statistics, HMRC has displayed all of these elements.

Naturally, society’s questions have been about the take-up of the schemes and their costs to the public purse. These interests reflect two essential aspects of the Government’s job protection schemes: speed and simplicity of support. CJRS was launched on 20 April and SEISS on 13 May. Initially HMRC tweeted daily information on both schemes: for CJRS, the number of unique applicants and the number of furloughed jobs; for SEISS, the total number of claims. HMRC also tweeted the value of the claims received under both schemes. As claims started to tail off, HMRC moved to tweeting the data weekly. Releasing these statistics on a timely basis, at intervals that met the needs of users, is a good example of the orderly release of statistics that is essential to building trust in new statistics.

After just a few weeks of tweeting these important data, HMRC linked both the CJRS and SEISS data with other pre-existing HMRC administrative data to provide further insights: CJRS claims by employer size (and, for SEISS, breakdowns of claims by age and gender), plus breakdowns for both schemes by sector of the economy and by geography. These new statistical breakdowns were published in statistical bulletins released less than two months after the launch of CJRS and under a month after the launch of SEISS, quite remarkable achievements.

Second, we found HMRC working closely with users of these data to find out what they need from the statistics. HMRC is open with users about uncertainty in its estimates, labelling the statistics and analysis with a frank description of what they summarise. It adopts consistent and coherent standard geographic definitions, harmonised with related statistics and data. In explaining the methods used to produce the statistics, HMRC has been proportionate to the complexity of the methods themselves, reflecting the needs of different types of users and uses.

In a further example of a statistical system working well, we as statistical regulators look for producers not only to make statistics and data available fast, but also to present clear, meaningful and authoritative insights that serve the public good. There are many examples of how HMRC has done this, but to select just one: in the SEISS bulletin, HMRC set out the numbers of self-employed people across the UK who are eligible for support. This information is published not only at UK level but also, in accompanying tables, at regional and sub-regional (local authority and parliamentary constituency) levels. As we pointed out in our Monitoring Review of Income and Earnings in 2015, timely data on self-employment income is a key data gap. These statistics help us understand a little more about the income and earnings of the self-employed in different locations around the UK in 2020, and are a step towards addressing this gap.

Normally, when statistics producers publish new or innovative statistics, they have a period in which to develop and improve them. HMRC will only have a restricted period in which to do this for its CJRS and SEISS statistics. By their nature, these schemes are temporary, and we will probably only see a small number of releases. Quite how the statistics may change, for example with the advent of the Coronavirus Job Retention Bonus Scheme, is yet to be established. Society will pose further questions for statistics producers to answer.

The times we live in call for ongoing watchfulness and a continuing need to be agile and responsive, offering further insight in the months and years ahead. What we found in this case study is replicated throughout the statistical system. With continuing watchfulness there is little doubt that there will be further statistics, data and insights to help the country through to the other side of this pandemic.

Think of the children

The Covid-19 pandemic is having a profound impact on all parts of society. While, statistically, those hit hardest medically by the disease are the older generations, children and young people are having to come to terms with significant, immediate and possibly long-term changes to their lives.

More than ever it is important that statistics about children and young people reflect the lived experiences of children. Statisticians play a key role in ensuring that the data collected and published about children and young people accurately reflects their needs and helps to inform policy and services that work to support them.

Our review

Prior to the pandemic we started reviewing the availability of statistics about children and young people with a view to better understanding their value in society and to determine whether:

  • the current statistics are accessible, timely and help society to understand the experiences of children and young people in all aspects of their lives
  • improvements are needed to the ways in which decisions on what to collect and analyse are reached
  • the wider statistical system is responsive to the needs of users of statistics.

We want to see a step change in how the needs of children and young people are met by official statistics, where statistics producers consistently consider children and young people’s needs and voice during the design, collection, analysis and dissemination of statistics. The current pandemic and its aftermath make this all the more important.

Our initial research has looked at the strengths and weaknesses of the current statistics on children and young people. In doing so, we have identified three key lenses which, if applied through a structured framework, may help statistics producers better meet users’ needs. This approach reflects the core principles set out in the UN Convention on the Rights of the Child.

Our proposed framework

We propose that producers of statistics consider children and young people through three lenses.

  • Visibility – Statistics are available on children and young people
  • Vulnerability – The experiences of vulnerable children can be analysed separately
  • Voice – Statistics reflect the views of children and young people and can be used by them

For each lens of the framework we propose some key questions for statistics producers to consider.

Visibility – Statistics are available on children and young people

  • Are children and young people visible in the statistics?
  • Is data collected about them and then made available to inform decisions in the best interests of the child?
  • Are decisions around what data to collect on and from children and young people transparent?

Vulnerability – The experiences of vulnerable children can be analysed separately

  • Are the most vulnerable children visible?
  • Is their experience identifiable to ensure that they are not being discriminated against?
  • Do the statistics and data help identify which groups of children and young people are the most vulnerable to having poorer outcomes?

Voice – Statistics reflect the views of children and young people and can be used by them

  • Are the views of children and young people represented in the statistics?
  • Are survey questions asked of children and young people themselves?
  • Do the statistics give them a voice on what is important to them, by being understandable to them?

Your views are important to us

The next stage of our review is to test this framework approach with a wider set of users and statistics producers to see if this supports these aspirations. We hope also that sharing our initial thinking now may assist producers in their immediate decisions about what statistics and data they should be collecting and making available during and after the Covid-19 pandemic.

Are you a statistician trying to identify what data to collect and publish? Would this framework help you make those decisions? Is there anything else that you feel could be considered? What would be the barriers to ensuring that children and young people are visible in the statistics, that the vulnerable can be analysed separately, and that the statistics give children and young people a voice?

Are you a decision or policy maker using statistics to understand the lives of children and young people and the impact of decisions and policies on them? Does this framework cover the key elements that you feel are important?  Is there anything else that you think statisticians should consider?

Are you a researcher using data and statistics to research children and young people’s lives and outcomes and the interventions that impact on them? Does this framework cover the key elements that you feel are important?  Is there anything else that you think statisticians should consider? Are your needs adequately reflected by the framework?

Are you a child or young person or do you represent them? Are visibility, vulnerability and voice the key elements of statistics that are important to you? What are you most interested in when looking for statistics? What makes it difficult for you to find and use statistics?

Please get in touch to share your thoughts with us at regulation@statistics.gov.uk

Closing data gaps: understanding the impact of Covid-19 on income

In recent weeks, you may have spoken with friends and family who’ve seen their income and living standards impacted in some way by COVID-19. They may have been furloughed and are concerned about whether they will have a job to return to or perhaps they have experienced a reduction in business if they are self-employed. Maybe your own household is receiving less income and you are struggling to juggle household costs with home schooling.

Despite the UK starting to ease the lockdown measures it introduced in response to COVID-19, the impact of this pandemic on the labour market and people’s livelihoods is expected to continue for some time. We are already seeing signs of the scale of the impact on the labour market: from vacancies at a record low in May to new claims to Universal Credit passing 2.5 million between March and June. The Office for National Statistics (ONS) recently brought forward the launch of its online Labour Market Survey to help provide the necessary insight into the impact of COVID-19 on people’s employment and working patterns.

There is a range of data which can help us understand how jobs and employment have been affected, but we need better data on income and earnings to understand fully how people’s livelihoods and living standards are being affected by the pandemic. A recent Opinions and Lifestyle Survey by the ONS found that half of the self-employed reported a loss of household income in April, compared with 22% of employees. Last year, we wrote to the ONS, the Department for Work and Pensions and HMRC to restate the importance of delivering the insights identified in our work on the Coherence and Accessibility of Official Statistics on Income and Earnings. While some progress has been made since our findings were published in 2014, it has been slow, and more work needs to be done to help users understand the dynamics of the labour market and to address key data gaps in relation to income and earnings.

We have recently carried out work to look at examples of data gaps being addressed in the statistical system. Our work found three common themes in successful cases of solving data gaps: sufficient resource (whether new or restructured), high user demand and strong statistical leadership. The combination of new user demand for information on income and earnings that has emerged from COVID-19, restructured resource that has been put in place to respond to this demand, and the potential for statistical leadership to shine, could be the catalyst for solving these data gaps.

Improving the storytelling of income and earnings and addressing the data gaps identified by OSR could help users better understand the lived experience of households and different employment types throughout the pandemic. These are difficult times for many people from all walks of life and people are facing lots of unknowns. It is important that we can understand the true scale of the impact so that when the UK begins its recovery from the pandemic, support can be targeted effectively towards the groups most severely affected. There are two areas in particular in which solving data gaps could improve our understanding of COVID-19.

Household-level data is not keeping pace with data on individuals

Household measures of income and earnings have traditionally been less timely than measures for individuals, and this formed a key area of our findings in the work highlighted above. With respect to COVID-19, there is interest in understanding how the Government’s income support measures have affected income for different household types, such as those with children or lone-parent households. Even in households not receiving any income support, people may have had to adapt their working patterns to share childcare, which may leave one or both earners in a household working reduced hours on potentially reduced pay. HMRC has published data showing that 9.1 million jobs had been furloughed by mid-June, but we won’t see any contextual data about the impact on households until the Family Resources Survey in 2022. We hope the relevant statistical teams explore new ways to deliver this insight in the meantime.

There is a lot we don’t know about the world of the self-employed and business owners

It is notoriously difficult to capture information on the income and earnings of the self-employed or those who own businesses. Many earn less than the taxable allowance, so are not captured in statistics relating to income tax, and many don’t have predictable earnings, so we don’t know what they’ll earn until well after the year end. The surveys which do manage to collect information on the self-employed are less timely than those for employees. When the Chancellor announced the Self-Employment Income Support Scheme, it quickly emerged that more people would need the support than originally anticipated and that the eligibility criteria would need to be adjusted to reflect the various ways the self-employed can pay themselves. Improving the timeliness and completeness of information on the income of the self-employed could help identify groups who currently fall through the gaps in eligibility for the income support schemes in place.

Covid-19: The amazing things that statisticians are doing

Stories of extraordinary human feats abound in this pandemic. They include the efforts of health and care professionals and personal commitments to support others in the community, perhaps best shown by Captain Tom Moore.

Statisticians are not on the front line of dealing with the impacts of Covid-19. Yet it is clear that one of the battlefields on which the fight against the pandemic is being fought is a statistical one. Slowly and painfully, data about the virus and its behaviour are accumulating, and, sometimes working through the night, statisticians are making sense of those data. By creating models of what would happen under different policies, statisticians have provided real-time insight to political decision makers on the pandemic and its social and economic impacts.

More importantly, the progress of the pandemic has been communicated to the public through data and statistics. The value of trustworthy information is emerging as one of the stories of this convulsive experience.

Statisticians in the health sector have built dashboards for the UK, England, Scotland, Wales and Northern Ireland to provide daily updates to the public. Their colleagues who work on population statistics have provided weekly updates on deaths, which represent the most complete measure of the mortality impact of Covid-19 (published for England and Wales, Scotland and Northern Ireland). These weekly statistics have developed at an unprecedented pace to provide more detailed insight, for example on deaths in care homes.

Beyond that, statisticians at the Office for National Statistics (ONS) have produced rapid insights into the population’s behavioural responses and into the impact on the economy (through its faster economic indicators weekly publication). ONS has also published in-depth analysis, such as its striking findings on the relationship between mortality and deprivation.

Similar efforts are being made by statisticians in other Government departments across the UK, highlighting impacts on areas like transport, and education in England and Wales. These outputs require new, often daily, data collection, and would have seemed incredibly radical only a couple of months ago. And researchers outside Government have also worked at amazing speed, using data published by Government statisticians to highlight emerging issues within weeks – for example the Institute for Fiscal Studies research on ethnicity.

Perhaps most impressively, ONS is now in the field with a household survey that tests whether people have already had the virus. This testing holds one of the keys to understanding the pandemic. The ONS, working with partners at the Department of Health and Social Care, the University of Oxford and IQVIA, has used its expertise in household surveys to develop it.

What have we been doing at the Office for Statistics Regulation? We set out our aims here: we committed to support producers of statistics as they provide the best possible information to the public. We have:

  • granted a number of exemptions to the Code of Practice so that producers can reach audiences effectively;
  • conducted a series of reviews to provide endorsements to the approach adopted for new outputs;
  • held discussions on the core Covid-19 data: the daily dashboards and weekly deaths. We have particularly focused on making sure the differences between the two are clear: the daily dashboards provide a leading indicator, while the weekly deaths provide a more complete picture, albeit with a time lag. There is still a need to provide a coherent overview, though, and at OSR we will continue to press for improvements in coherence.

And we have also maintained our commitment to standing up for the need to publish data. We have written to the Department for Work and Pensions (DWP) about publishing information on Universal Credit claims in the pandemic, and to the Department of Health in Northern Ireland (DoHNI) calling for the resumption of its daily dashboard. In both cases the departments responded well: DWP has committed to publish on Universal Credit, and DoHNI has resumed the daily dashboard.

The efforts of statisticians are in some ways quieter, and less visible, than the work of health and care professionals and people in the food and retail sectors. But the work of statisticians to inform the public is crucial. I hope this blog represents a quiet form of celebration.