Investment and improvement
Investment in a longer-term solution for the LCF is urgently needed
The LCF statistics team has not been allocated the additional funding needed to address its concerns about the sustainability of the LCF in its current form. As part of its bid for investment under ONS’s spending review in 2019, the statistics team highlighted the inefficiency of the data processing systems as a limitation to the development of the quality and accuracy of the LCF. The 2019 bid was built around the need to update the systems to enable implementation of COICOP 2018 and included implementation of machine learning methods. The spending review bid for 2020 was broader and included an option to develop a new expenditure survey and to deliver a short-term boost to the LCF. The bids for investment in 2019 and 2020 were unsuccessful.
As part of the team’s engagement with the Eurostat Innovative Tools and Sources Taskforce, the team received funding through a Eurostat Grant project to commission a proof of concept looking at using machine learning to automatically categorise receipt information. The Data Science Campus took this work forward, but it did not gain momentum because of the COVID-19 pandemic, and further funding was required to progress the project beyond the proof of concept stage.
ONS is currently working on transforming its data on the distribution of household finances, as part of its wider transformation programme. The LCF fits within the Household Finance Survey (HFS) model and transformation plans, which have involved harmonising questionnaire content in the LCF and the Survey of Living Conditions to create a household income dataset with a larger combined sample. ONS has invested heavily in developing the HFS. As we highlighted in our review of income-based poverty statistics, the development of HFS provides an opportunity for ONS and the Department for Work and Pensions to explore the feasibility of consolidating their existing surveys to create a single data source on household incomes.
As the focus for transforming expenditure data has been on HFS, the transformation of specific inputs into HFS has been a lower priority. A large-scale transformation of the LCF would be difficult because of its wide range of stakeholders and dependencies. However, the LCF is unique and needs investment in its own right, regardless of the wider transformation work. As well as developing alternative approaches to collecting data, ONS needs to make more urgent improvements to specific parts of the LCF.
As well as the creation of HFS, ONS has been pushing for greater use of administrative data and has considered the use of credit card and scanner data to enhance the survey data. These types of data were used by ONS in the pandemic, including as part of its faster indicator series, when the LCF was not available for a short period. The statistics team is also considering the use of loyalty card data. The statistics team told us that the University of Bristol is carrying out work to explore the willingness of individuals and organisations to let research agencies have access to these data sources.
Our review of international activities revealed a high level of co-operation between National Statistical Institutes in the development of their household surveys, including, for example, co-operation between the US and Canada on recall periods (see annex). It is worth noting that several countries, including Ireland, Canada and the Netherlands, have worked closely with UK staff on the implementation of commercial recognition software, which has yet to be used by ONS. These countries’ engagement with the UK highlights the expertise and international standing of UK staff and, because of under-investment, the missed opportunities to develop the accuracy and quality of LCF data. ONS should look to determine the extent to which solutions that have been adopted internationally could be applied to a UK context.
The lack of progress in the use of alternative and administrative data sources has affected the quality, accuracy, and international comparability of the survey’s data, a perspective that users see as important for gauging the impact of Brexit and the pandemic on UK households. Given the strategic importance and profile of expenditure data with internal and external stakeholders, ONS should address this under-investment in LCF systems and development work as a matter of urgency.
The LCF improvement project should consider wider uses of LCF data beyond RPI
In March 2021, ONS launched an internal LCF and RPI Improvement project. The main aim of the project is to reduce the risk of further errors in the RPI arising from problems with the quality of LCF data. Achievement of this aim will be measured through the project’s strategic goals, including increased confidence in the quality of LCF data output among stakeholders, fewer future errors, and enhanced quality assurance. While we welcome the initiative, we consider there is a wider need to review the LCF beyond its relationship with the RPI.
The project comprises four main workstreams:
- LCF Discovery – Covering a review of the data quality assurance processes for the LCF, Prices and Household Final Consumption Expenditure, and including a review of the impact of the 2020/21 LCF Questionnaire changes on the processing of LCF data.
- Aggregation of LCF data – This workstream will focus on the extent to which downstream data processing risks can be ameliorated by re-positioning the aggregation of LCF data from the Social Surveys division to the Economic Statistics Group and using a strategic coding language (R, Python) to perform the aggregation.
- Review of the consumer prices quality assurance processes – The workstream will also examine the possibilities of enhanced engagement with data suppliers to minimise the possibility of errors going undetected in downstream processing.
- The fourth element of the project, not directly related to the LCF, will focus on the RPI revisions policy, and will consider whether small errors should be corrected as part of a revisions policy.
In carrying out our review, we considered our findings in the context of the improvement project, to identify whether the areas we have highlighted for improvement are covered by the project. We noted earlier in this report that LCF data are used by a range of ONS’s internal and external users. The focus of ONS’s review, however, is the interaction between the LCF and the RPI. In the interests of promoting and enhancing the public value of official statistics, ONS should consider extending the scope of its project work to include input from some of its key external users, such as the Scottish Government, from whom additional intelligence could be gathered on how they use LCF data and the issues they face in doing so.
OSR would also encourage ONS to consider the management of risks throughout the end-to-end production process as part of the LCF project’s medium-term work and ambitions. We note that the focus is on downstream data processing, despite the fact that often the biggest risks for quality come at the beginning of the production process. For example, OSR’s work on strengthening the quality of HMRC’s Official Statistics highlighted the data quality challenges that statistical producers face when being supplied with data from external bodies. In the work with HMRC, OSR advocated the use of its Quality Assurance of Administrative Data Framework as a tool for managing these upstream challenges. Whilst the framework is designed for managing risks arising from administrative datasets, the principles can also be applied to survey data.
As part of the improvement project, ONS should also consider our recommendation to determine a longer-term solution for the LCF which draws on international best practice and wider transformation initiatives. We would encourage ONS to consider whether the work to move the aggregation of LCF data to a strategic coding language provides an opportunity to build in Reproducible Analytical Pipeline principles, which would free up resource in the long term to allow the statistics team more room to carry out development work.
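As an illustration only, the sketch below shows what a small, reproducible aggregation step written in Python might look like, with quality checks that run before any totals are produced. It is not ONS's method: the record fields, the category labels and the weighting approach are all hypothetical and do not reflect the actual LCF data structures or processing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpenditureRecord:
    """One line of household spending (all field names are hypothetical)."""
    household_id: str
    category: str   # illustrative expenditure category label
    amount: float   # spend in pounds for the reference period
    weight: float   # illustrative household weight


def validate(records):
    """Stop the pipeline early on records that would distort the totals."""
    for r in records:
        if not r.category:
            raise ValueError(f"missing category for household {r.household_id}")
        if r.amount < 0:
            raise ValueError(f"negative amount for household {r.household_id}")
        if r.weight <= 0:
            raise ValueError(f"non-positive weight for household {r.household_id}")


def aggregate(records):
    """Return weighted total expenditure per category, ordered by category."""
    validate(records)
    totals = defaultdict(float)
    for r in records:
        totals[r.category] += r.amount * r.weight
    return dict(sorted(totals.items()))


# Made-up sample data, used only to show the function being called.
sample = [
    ExpenditureRecord("H1", "food", 50.0, 2.0),
    ExpenditureRecord("H2", "food", 30.0, 1.0),
    ExpenditureRecord("H1", "transport", 20.0, 2.0),
]
category_totals = aggregate(sample)
```

Keeping validation and aggregation as separate, testable functions under version control is one way of building quality assurance into the code itself, so that the same inputs always produce the same outputs and errors surface before they reach downstream users.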