Annex A: OSR Approach to Compliance Checks on Uncertainty
OSR uses the two axes of “what is said” and “where it’s said” to help think about whether information about uncertainty is adequate, taking into account what kind of decisions or further analysis the statistics might be used for, and by whom (including their level of expertise). In many cases there’ll be different types of people making different types of decisions, which we bear in mind. In some cases we may need to make assumptions about the types of decisions made, the types of people or organisations making them, and their level of understanding of the statistics. We find it helpful to think specifically about the potential for misuse in any given context (for example, drawing a conclusion or making a decision that might not be borne out if we had perfect data).
What is said
- quantitative description of uncertainty. This would often be expressed as confidence intervals, margins of error, sampling errors, modelling errors or statistical significance levels. It could also be presented visually, for example by using error bars, fan charts or some other shading to represent the likelihood and magnitude of uncertainty.
- description of the likelihood and potential magnitude of revisions. This could either anticipate future revisions (numerically or descriptively) or provide an analysis of past revisions.
- qualitative description of possible sources of uncertainty. This category would include basic statements reminding the reader that the data are from a sample survey, and could include descriptions of definitional issues, coverage issues, response biases, any issues relating to time lags etc. It may also include details about the likelihood, direction and magnitude of any possible biases.
- use of words like “estimate”, “approximately”, “about” or “around” that express some uncertainty in estimates. Words like “probably”, “possibly” or “may” might also be used, particularly when comparing possible changes over time or differences between categories. Use of rounded numbers also helps avoid spurious accuracy (although the motivation for rounding might be something else, such as confidentiality protection).
- no mention of uncertainty. This would be where the statistics and data are presented as if they were absolute facts, for example “The unemployment rate was 4.5%” or “GDP grew by 0.2%”.
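To illustrate the difference between the first and last categories above, the following is a minimal sketch of how a bare figure such as “The unemployment rate was 4.5%” could be accompanied by a quantitative description of uncertainty. The sample size of 40,000 is hypothetical, and the calculation assumes a simple random sample with a normal approximation; real surveys usually have more complex designs, so this is not how any particular producer calculates its intervals.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion.

    Assumes a simple random sample and a normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def describe(p_hat: float, n: int) -> str:
    """Return a sentence that presents the estimate with its interval."""
    moe = margin_of_error(p_hat, n)
    lower, upper = p_hat - moe, p_hat + moe
    return (
        f"The rate is estimated at around {p_hat:.1%} "
        f"(95% confidence interval: {lower:.1%} to {upper:.1%})."
    )


# Hypothetical inputs: an estimated rate of 4.5% from 40,000 respondents.
print(describe(0.045, 40_000))
```

Under these assumptions the sentence reports an interval of roughly 4.3% to 4.7%, which gives the reader a sense of how much weight a small change in the headline figure can bear.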
Where it’s said
- visible and prominent (you’d have to try hard to miss it) – uncertainty is either presented up-front at the start of the document (or each section, for example) or is presented alongside the numbers or charts
- there, but could be more obvious. This is the kind of scenario where there’s a footnote that might be missed on a casual read, or where there is information in the notes at the end of a document
- invisible or so hidden that you’d have to try hard to find it. This covers scenarios where there is no information about uncertainty at all, or where finding it relies on a detailed read of a document or on following links that aren’t very visible or well named.
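For anyone recording the results of such checks, the two axes could be captured in a simple structure along the following lines. This is only an illustrative sketch: the class names, the field names and the `needs_attention` rule are hypothetical and are not part of OSR’s approach, which relies on judgement about users and their decisions rather than a mechanical rule. It also simplifies by recording a single “what is said” category per publication, whereas in practice several may apply at once.

```python
from dataclasses import dataclass
from enum import Enum


class WhatIsSaid(Enum):
    """Categories from the 'What is said' axis (names are hypothetical)."""
    QUANTITATIVE = "quantitative description of uncertainty"
    REVISIONS = "likelihood and potential magnitude of revisions"
    QUALITATIVE = "qualitative description of possible sources of uncertainty"
    HEDGING_WORDS = "words that express some uncertainty in estimates"
    NO_MENTION = "no mention of uncertainty"


class WhereSaid(Enum):
    """Categories from the 'Where it's said' axis (names are hypothetical)."""
    PROMINENT = "visible and prominent"
    COULD_BE_MORE_OBVIOUS = "there, but could be more obvious"
    INVISIBLE = "invisible or hard to find"


@dataclass
class UncertaintyCheck:
    publication: str
    what: WhatIsSaid
    where: WhereSaid

    def needs_attention(self) -> bool:
        # Hypothetical rule of thumb: flag only the weakest end of either axis.
        # Whether the information is adequate still depends on the context.
        return self.what is WhatIsSaid.NO_MENTION or self.where is WhereSaid.INVISIBLE


check = UncertaintyCheck(
    publication="Example statistical release",  # hypothetical publication
    what=WhatIsSaid.QUANTITATIVE,
    where=WhereSaid.COULD_BE_MORE_OBVIOUS,
)
print(check.needs_attention())
```

A record like this only notes where a publication sits on each axis; the assessment of adequacy for the likely users and decisions remains a separate, contextual step.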