The assessment of the Office for National Statistics’ Admin Based Population Estimates: Independent expertise

Published:
15 July 2024
Last updated:
15 July 2024

Uncertainty

34. The uncertainty of the data inputs enters the DPM directly through the model parameters (e.g., the dispersion parameters of population component rates, Eq. 4.4 in Elliott & Blackwell 2023, are informed by the ONS estimates of standard errors). These input uncertainties are therefore crucial for the uncertainty measures of the ABPEs derived from the DPM. There are two avenues for quantifying uncertainty within the DPM: (i) via parameters assumed to be known (variance, dispersion), where the uncertainty is derived externally for each source, such as for the 2011 Census base (Point 8); or (ii) via estimable parameters that capture the variability of the data. The latter approach requires formulating prior distributions (“priors”) for the uncertainty parameters. Such priors may be uninformative (i.e., driven purely by data) or informative, e.g. based on the externally derived uncertainty and/or informed by demographic expertise. The Bayesian demographic accounts framework requires informative priors (Bryant & Zhang 2018, Taglioni 2019). A potential risk with the currently used option (i) is that the uncertainty constructed externally for the data inputs is propagated through the DPM, so the quality of the final estimates depends on that external uncertainty assessment. The need for a sustainable framework for quality measures of the DPM inputs and administrative data has been acknowledged by the ONS (ONS 2023a, ONS 2023b: table A1). I fully support such a pledge. I further recommend that the framework be extended to the ABPEs derived from the DPM. Further, as part of standard model checks (Gelman et al. 2013; Bryant & Zhang 2019), sensitivity analyses should be carried out to test the sensitivity of the ABPEs to the assumed uncertainty parameters for each of the data inputs.

35. The ONS has published an example of such a sensitivity analysis for a synthetic local authority (ONS 14/07/2022), demonstrating that the model-based ABPEs can be closer to the SPD or to the MYE, depending on the relative value of the uncertainty (precision) parameter. It is important to include such sensitivity tests in the workflow for the final estimates for all local authorities, as the quality of inputs may vary between and within sources over time, by age, or across local authorities. As a case in point, it is acknowledged that some LAs have “time lags in the accuracy of administrative data” due to high levels of migration, a high percentage of rental housing, or being urban areas (ONS 28/02/2023d). Further, the sensitivity analysis should specify what the assumptions on precision mean in quantitative terms, e.g., as coefficients of variation around the means, rather than only stating that one data source is more precise than the other.
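As a purely illustrative sketch of what such a sensitivity test might look like, the Python snippet below combines two estimates of the same population count using inverse-variance (precision) weights, with each source's precision stated as a coefficient of variation. This is a simple normal approximation and not the DPM itself; the function names (`combine`, `sensitivity_table`) and all numbers are hypothetical and chosen only to show how the combined estimate moves towards whichever source is assumed to be more precise.

```python
# Illustrative sketch only: a precision-weighted combination of two sources,
# NOT the DPM. All names and numbers are hypothetical.


def combine(spd, mye, cv_spd, cv_mye):
    """Combine two estimates of the same population count.

    Each source is treated as unbiased with standard deviation cv * value.
    Returns the inverse-variance weighted mean, its standard deviation,
    and the weight given to the SPD-like source.
    """
    var_spd = (cv_spd * spd) ** 2
    var_mye = (cv_mye * mye) ** 2
    precision_spd = 1 / var_spd
    precision_mye = 1 / var_mye
    w_spd = precision_spd / (precision_spd + precision_mye)
    mean = w_spd * spd + (1 - w_spd) * mye
    sd = (1 / (precision_spd + precision_mye)) ** 0.5
    return mean, sd, w_spd


def sensitivity_table(spd, mye, cv_pairs):
    """Recompute the combined estimate for each assumed pair of CVs."""
    rows = []
    for cv_spd, cv_mye in cv_pairs:
        mean, sd, w_spd = combine(spd, mye, cv_spd, cv_mye)
        rows.append((cv_spd, cv_mye, mean, sd, w_spd))
    return rows


if __name__ == "__main__":
    # Hypothetical local authority: the two sources disagree by 4,000 people.
    scenarios = [(0.01, 0.03), (0.02, 0.02), (0.03, 0.01)]
    for cv_spd, cv_mye, mean, sd, w_spd in sensitivity_table(102_000, 98_000, scenarios):
        print(
            f"CV SPD={cv_spd:.0%}, CV MYE={cv_mye:.0%}: "
            f"estimate={mean:,.0f}, sd={sd:,.0f}, weight on SPD={w_spd:.2f}"
        )
```

Stating the scenarios as explicit coefficients of variation makes the assumption behind each row auditable, which a bare statement that one source is "more precise" does not.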

36. Because of these differences between local authorities, future versions of the DPM may also permit the precision of data sources to differ between LAs, depending on their characteristics. This might be done by creating a typology of LAs that share common characteristics and introducing hierarchical components in the DPM that capture these characteristics. In this context, it is relevant to engage with stakeholders at the local level (local authorities) to elicit any insights they may offer on characteristics of the population that may not be captured well by the administrative sources. The ONS has carried out such a consultation with 14 local authorities, in which the results from the DPM were produced and compared with the 2021 Census (ONS 23/11/2022). This exercise prompted a revision of the migration sub-model of the DPM, as specific age groups were not estimated as expected. If the DPM-based ABPEs were to become official population statistics used for policymaking at a local level, it might be of value to consult those key stakeholders on the estimates and on any major changes and updates in the methodology, explaining how these updates may affect the estimates and what the risks and benefits of the model updates are. Feedback from stakeholders may also lead to future revisions of the methodology.

37. As currently presented, the DPM does not account for all sources of uncertainty. For example, the uncertainty around the ABPEs produced in 2020 (ONS 27/07/2020) was based on the variability in ABPEs for “similar” local authorities scaled to the 2011 Census, but without acknowledging the uncertainty of the census estimates (Point 8). Another example is the uncertainty related to the probabilistic linkage used in creating the DI and then the SPD, a key input to the DPM (Point 10), which does not seem to be reflected in the uncertainty measures in the DPM. At the same time, criteria for the quality of the estimates in terms of the width of the confidence/credible interval relative to the estimated population size are being implemented (ONS 2023c). This creates a potential risk of underestimating the uncertainty of the ABPEs and, thus, of giving users a false sense of the precision of the estimates. I recommend that reported model-based ABPEs and their precision be accompanied by a statement on the potential sources of uncertainty that are unaccounted for and, where possible, an assessment of their importance in a given situation.
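The interaction between an omitted source of uncertainty and a relative-width quality criterion can be illustrated with a small Python sketch. It assumes independent, approximately normal error components whose variances add; the threshold of 4% and the standard deviations are hypothetical and are not taken from the ONS criteria. The helper names `relative_width` and `meets_criterion` are likewise invented for illustration.

```python
import math

# Illustrative sketch only; all thresholds and standard deviations are hypothetical.
Z95 = 1.96  # normal quantile for an approximate 95% interval


def relative_width(estimate, sds):
    """Width of an approximate 95% interval divided by the estimate.

    Assumes the error components in `sds` are independent and normally
    distributed, so their variances add.
    """
    total_sd = math.sqrt(sum(sd ** 2 for sd in sds))
    return 2 * Z95 * total_sd / estimate


def meets_criterion(estimate, sds, max_relative_width):
    """True if the relative interval width is within the stated limit."""
    return relative_width(estimate, sds) <= max_relative_width


if __name__ == "__main__":
    estimate = 100_000
    modelled_sd = 1_000   # uncertainty reflected in the model (hypothetical)
    omitted_sd = 750      # e.g. linkage error left out of the model (hypothetical)
    limit = 0.04          # hypothetical quality threshold

    for label, sds in [("modelled only", [modelled_sd]),
                       ("modelled + omitted", [modelled_sd, omitted_sd])]:
        width = relative_width(estimate, sds)
        print(f"{label}: relative width={width:.4f}, "
              f"meets criterion={meets_criterion(estimate, sds, limit)}")
```

In this invented example the estimate satisfies the criterion when only the modelled uncertainty is counted and fails it once the omitted component is added, which is the kind of false reassurance a statement on unaccounted-for uncertainty would guard against.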

38. The measures of uncertainty, of both the DPM inputs and its outputs, can be challenging for the various stakeholders to interpret. The ONS has demonstrated a good understanding of the need to communicate uncertainty in line with the recommendations of the Office for Statistics Regulation (2022). One approach to a better representation of uncertainty (and, more widely, data quality) is an assessment in terms of the risks and potential costs of under- or over-estimating population counts/rates in a given source or by the DPM. This would also help identify which population characteristics may be of greater concern to stakeholders (cf. Bijak et al. 2019) and could inform where more research is needed to better understand the sources of uncertainty in the data inputs.
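One way to make the cost-based view concrete is to attach hypothetical per-person costs to under- and over-estimation and evaluate the expected loss of a reported figure against draws from the distribution of the estimate. The Python sketch below assumes a simple asymmetric linear loss and simulated normal draws; neither the loss function nor the costs come from the ONS or from Bijak et al. (2019), and the function names are invented for illustration.

```python
import random
import statistics

# Illustrative sketch only: the costs, the loss function, and the simulated
# draws are all hypothetical assumptions.


def expected_loss(point, draws, cost_under, cost_over):
    """Average loss of reporting `point` when the truth follows `draws`.

    Under-estimates cost `cost_under` per person, over-estimates `cost_over`.
    """
    total = 0.0
    for truth in draws:
        if point < truth:
            total += cost_under * (truth - point)
        else:
            total += cost_over * (point - truth)
    return total / len(draws)


def best_point(draws, cost_under, cost_over):
    """Return the draw with the smallest expected loss.

    For this piecewise-linear loss a minimiser always lies on one of the
    draws, so searching over the draws is sufficient.
    """
    return min(draws, key=lambda p: expected_loss(p, draws, cost_under, cost_over))


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [rng.gauss(100_000, 1_500) for _ in range(1_000)]
    median = statistics.median(draws)
    # Hypothetical stakeholder for whom under-counting is three times as costly.
    cautious = best_point(draws, cost_under=3.0, cost_over=1.0)
    print(f"median={median:,.0f}, loss-minimising figure={cautious:,.0f}")
```

When under-estimation is the more costly error, the loss-minimising figure sits above the median of the draws, which shows how the same uncertainty can matter differently to different stakeholders.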
