Chapter 2: Quality
Managing local authority data quality
LAs across England use a variety of IT systems, each developed by a different supplier, to collect the data required for H-CLIC and submit it to DLUHC via DELTA, DLUHC’s secure online data collection platform. LAs have access to an instant case-level error report when they submit their data to DELTA, allowing them to fix case-level issues before resubmitting. After submission, data are aggregated to LA level and a quality assurance report is sent back to each LA based on its submission. The general view from users we spoke to is that data quality appears to have improved over the past few years since the introduction of the H-CLIC data collection.
LAs we spoke to across England as part of this assessment reported good relationships with the homelessness statistics team at DLUHC, and we heard from some LAs that, prior to the pandemic, the DLUHC team had visited them in person. The statistics team was also described as helpful with any queries or issues and as timely in its responses.
Our engagement with LAs as part of this assessment revealed a mixed picture: some LA IT systems interact with DELTA smoothly, while other LAs are still struggling with systems that are not compatible with the H-CLIC collection and, up until the April-June 2021 quarter, were still submitting some of their data using the old P1E forms. We also heard that some LAs implemented the changes to their IT systems in readiness in 2018 while others did so much later, in part due to resources, and this has affected the success of the transition from P1E to H-CLIC. Some LAs told us that these IT issues have also meant that they are unable to answer key questions internally around statutory homelessness. The LAs we spoke to that are still having difficulties with their IT systems, and that do engage with the team at DLUHC, were positive about the contact and involvement that they have with the team.
The old P1E collection should have ceased by April 2020; however, due to the pandemic and the added pressure this put on LAs, the deadline was extended by a year. The latest Statutory Homelessness statistics release in October 2021, which covers the period April-June 2021, is based solely on H-CLIC data returns, with some LAs still unable to return accurate data and, in some cases, unable to return any data at all.
The comparability between H-CLIC and P1E is covered in detail within the Technical Notes published up to the January-March 2021 quarter, and we note that this information has not been included in the latest version of the Technical Note (for the April-June 2021 quarter). Useful guidance was provided on the comparability and differences between data collected through P1E and through H-CLIC, alongside helpful flow charts showing the journey through the homelessness system before (under P1E) and after (now under H-CLIC) the introduction of the Homelessness Reduction Act. Users may not know to refer back to historic Technical Notes to find this information.
The January-March 2021 Technical Note states that “Around 3% of local authorities submitted aggregated temporary accommodation data through P1E only. Authorities with the largest temporary accommodation usage are significantly represented among those reliant on P1E, which is why 18% of the national total in temporary accommodation continues to be provided on the pre-HRA collection system.” For consistency with previously published information, it would be helpful to know what percentage of the national total in temporary accommodation has been imputed for this latest quarter (April-June 2021), now that P1E has been discontinued and the LAs previously reliant on it may have been unable to return the data through H-CLIC. For transparency, it is important to show what impact stopping P1E has had on LAs’ ability to return accurate data, and in turn on the level of imputation for the national totals.
We heard from some users that the reasons why some LAs cannot submit data are not made entirely clear. However, users did say they found it helpful that missing LAs were clearly marked in the published data tables.
We heard from the statisticians that there are mixed levels of engagement across LAs in England. There are some LAs that have not returned any data for previous quarters, or who have failed to return accurate data for several quarters. Users we spoke to were concerned that missing data could be skewing the overall homelessness picture for England, or that missing data in one LA meant that a complete picture was not available for a larger area. For example, one homeless charity we spoke to pointed out that data for one LA was missing from Greater Manchester’s Homelessness Prevention Strategy. Another user we spoke to told us that they would like to create their own interactive LA dashboard but had been unable to do so due to LA data regularly missing.
The new performance dashboard released by DLUHC in October 2021 gives each LA an overall red, amber or green (RAG) indicator based on four quality measures, each also individually rated red, amber or green:

- timeliness of uploading their data
- the percentage of their cases submitted without errors
- whether the submitted data indicate that the LA is completing all of its cases
- whether the LA provided data that were accurate and had been published.
An explanation of the RAG marking for each of the four indicators is provided, with the overall quality RAG rating determined by the lowest RAG rating of the four measures. In line with the Code, we would expect the statisticians to be transparent about the methods used: the rationale behind the percentage cut-offs that determine a red, amber or green rating, and which data sources are used for each, to help ensure appropriate interpretation and use by statistics users.
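The “lowest of the four” rule described above can be sketched in code. This is an illustrative sketch only: the indicator names and ordering below are assumptions for the example, not DLUHC’s published implementation.

```python
# Illustrative sketch: the overall RAG rating is the worst (lowest)
# of the four individual indicator ratings. Indicator names are
# hypothetical and chosen only to mirror the four measures described.

RAG_ORDER = {"red": 0, "amber": 1, "green": 2}

def overall_rag(indicators: dict) -> str:
    """Return the lowest (worst) RAG rating among the measures."""
    return min(indicators.values(), key=lambda rating: RAG_ORDER[rating])

ratings = {
    "timeliness": "green",
    "error_free_cases": "amber",
    "case_completion": "green",
    "published_accuracy": "green",
}
print(overall_rag(ratings))  # amber
```

Under this rule, a single amber indicator is enough to pull the LA’s overall rating down to amber, however well it performs on the other three measures.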
The dashboard helps to identify which LAs are not returning accurate data, highlighting the issue of varying levels of data quality between LAs. The team told us that it will use the dashboard to prioritise which LAs to support, based on the size of their caseload and the relative impact on the statistics of them not returning data. However, the team noted that many of the issues are specific to individual LAs’ systems.
Methods and dealing with missing data
LAs are given a minimum of six weeks to submit accurate data. For those LAs that have not reported accurate data, or any data at all, the missing figures are imputed so that a representative figure is available at national level. Where LAs have provided a missing or incomplete breakdown, their previously submitted data are used to estimate the values, using a multiplier based on the quarter-on-quarter change observed in groups of local authorities. The three groups used for imputation are London Boroughs; Unitary Authorities combined with Metropolitan Districts; and Shire Districts.
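The multiplier-based imputation described above can be sketched as follows. This is a hedged illustration of the general approach, assuming the group multiplier is the ratio of the group’s current-quarter total to its previous-quarter total (calculated from reporting LAs only); the function and variable names are illustrative, not DLUHC’s actual code.

```python
# Sketch of multiplier-based imputation: a missing LA value is estimated
# by scaling the LA's last submitted figure by the quarter-on-quarter
# change observed in its group (London Boroughs; Unitary Authorities with
# Metropolitan Districts; or Shire Districts). Names are hypothetical.

def impute_missing(previous_value: float,
                   group_prev_total: float,
                   group_curr_total: float) -> float:
    """Scale an LA's previous figure by its group's quarter-on-quarter change."""
    multiplier = group_curr_total / group_prev_total
    return previous_value * multiplier

# e.g. an LA that reported 120 cases last quarter, in a group whose total
# (from LAs that did report in both quarters) moved from 10,000 to 10,500:
estimate = impute_missing(120, 10_000, 10_500)
print(round(estimate))  # 126
```

The estimate simply assumes the missing LA moved in line with its group, which is why imputed figures for large-caseload LAs can carry noticeable uncertainty into the national totals.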
The team told us that it had engaged with the GSS Best Practice and Impact Division for help on making improvements to the imputation method, and any changes to the release are discussed in advance with the Housing sub-group of the Central Local Information Partnership (CLIP), allowing members, some of which are LA representatives, to provide feedback.
Some users expressed confusion about small differences between totals and summed columns, as the explanations around suppression and imputation are not as clear as they could be. Similarly, where some categories are not mutually exclusive, one user said that more clarity is needed on when columns can and cannot be expected to sum.
Some users and data suppliers we spoke to raised concerns that there may be LAs that are not recording their data on a consistent basis (for example demographic characteristics) and that some LAs may not be asking certain sensitive questions such as gender identity and sexuality, or there could be differences in the way support needs are recorded – for example only coding the main support needs rather than all that apply. Equally, applicants may not be comfortable supplying sensitive information about themselves. This could be leading to the overuse of the categories ‘other’ or ‘not known’ and affecting the robustness of the data sets.
Some LAs also shared concerns with us that barriers to submitting robust data included staff training, with some local housing officers struggling with the new H-CLIC questions compared to what was collected through P1E. Another possible barrier was the requirement that every data field be completed before submission: some of the information input into fields was questionable, for example having to provide a National Insurance number when the applicant did not know theirs. Fields that are not essential to homelessness applications are likely to be of lower quality, and there is a risk this may also be the case for any additional data collected. Extending the data currently collected to enable data linkage may also affect the amount of quality assurance that LA staff have to complete.
Public value from these statistics will only be maximised with returns from all LAs. In its efforts to engage LAs that are not returning accurate data and to enable all LAs to return data through H-CLIC, we encourage the team to use the guidance set out in our Quality Assurance of Administrative Data (QAAD) framework to guide its understanding of the limitations at all stages of the data process, and to consider the level of assurance required, ensuring that its processes, and those of its data suppliers, are appropriate. It is encouraging to hear from the statisticians that they are planning to use the QAAD toolkit to review data quality.
Requirement 4: To enable all LAs to return data through H-CLIC, and to help drive improvements in the quality of the data returned, DLUHC should:
- work with LAs that are unable to provide accurate H-CLIC data, including those not currently engaging with the team, to gain a better understanding of the specific barriers that they face, and overcome outstanding issues
- facilitate the sharing of best practice between LAs in terms of successful approaches to submitting data for those with similar IT systems, or those considering alternative systems, so that lessons can be shared more widely
- review its assurances around the quality of data collected from LAs, including variability in quality across different variables, informed by engagement with LAs about their data quality management approaches, and the practice areas within the QAAD toolkit
- publish a plan setting out its proposals and timelines for addressing the three points above.
Communicating methods, quality, extent of revisions and uncertainty
Since our previous assessment in 2015, we have seen improvements in the supporting information made available to accompany the statistical release, for example the Technical Note, which is published each quarter. The Technical Note covers useful information on areas such as the data collection, data quality, limitations of the data, how missing data are dealt with, the revisions process, and related statistics, including comparability with the other UK countries’ homelessness statistics. The Technical Note includes a link to DLUHC’s wider quality guidelines, which are in line with the European Statistical System’s quality dimensions (Relevance, Accuracy and Reliability, Timeliness and Punctuality, Accessibility and Clarity, and Coherence and Comparability). However, these dimensions are not reflected in the Technical Note itself. No Technical Note is produced to support the annual publication, and so users of that release could miss important supporting information.
The latest published Technical Note (for the April-June 2021 quarter) includes some improvements, such as new information about the weighting method used. However, as mentioned in para 2.7 of this assessment report, some of the value has been lost through the removal of information and guidance on the temporary accommodation data, such as the chart showing the percentage of the national total in temporary accommodation by submission method. The Technical Note is also not clear on the level of data validation carried out at LA level, a concern raised by some users. Information about potential variability in data quality between LAs and across different data fields, and any impacts for interpretation and use, could be clearer.
Requirement 5: To enhance user understanding on the quality of the statistics, DLUHC should expand the published information on data quality, in line with broader quality measures covered in the DLUHC quality strategy, and its learning from applying the QAAD toolkit, to include:
- further information, in line with what was previously published on comparability between P1E and H-CLIC data, and the impact that removing P1E has had on data quality and the levels of imputation used, including on temporary accommodation figures
- clarity around any limitations or quality issues identified through further engagement with LAs and from applying the QAAD toolkit, and how these have, or will be, addressed or mitigated
- clear communication around the extent of uncertainty for different H-CLIC variables, and how these relate to the red, amber, and green dashboard quality indicators, to help to ensure appropriate interpretation and use.
The Technical Note provides information on the revisions policy covering scheduled and non-scheduled revisions and how these are dealt with. The document states that there are no scheduled revisions. However, the team told us that data are revised every quarter, and back across the previous year, at year end. Users we spoke to said it was not clear how far back revisions are made and whether there is any cut-off for when data are extracted from DELTA. Users expressed an interest in seeing the size and extent of revisions between the quarterly and annual releases to help them gauge the level of uncertainty in the data, for example through a published revisions table.
Requirement 6: To support users in the appropriate interpretation of the statistics, the team should provide:
- clarity on when revisions are counted as scheduled or non-scheduled in line with what happens in practice
- clear information on how far back revisions are made, and the nature and extent of revisions, for example by providing a revisions table.