OSR has launched a Review of the Code of Practice for Statistics.
The first Code of Practice for Statistics was published in January 2009. In February 2018, OSR published version 2.0 of the Code of Practice for Statistics after a significant review and refresh of the standards we expect for the production of data and statistics. Since then we have seen a lot of change in the data landscape, increased desire for statistics from users, and changes in the ways statistics producers are working.
We think this is an ideal time for us to review the Code to make sure it remains relevant and to identify opportunities for us to make improvements.
From September through to December, OSR will be seeking to gather feedback from stakeholders on the Code of Practice. We will be running a range of online sessions for all interested parties – statistics producers and statistics users alike – to explore some of the key topics which might be relevant when thinking about the Code.
Share your views through our call for evidence
OSR is seeking evidence on the Code of Practice for Statistics.
The second edition of the Code of Practice for Statistics was released in February 2018. It established a framework for the standards of statistics production grounded on three core principles or ‘pillars’:
- Trustworthiness – confidence in the people and organisations that produce statistics and data
- Quality – data and methods that produce assured statistics
- Value – statistics that support society’s needs for information
Since that time the Code has been firmly embedded into the work of official statisticians and adopted by a community of practitioners beyond official statistics.
Our review, The State of the Statistics System, has highlighted how well over recent years producers have responded to urgent needs for data and statistics and have continued to innovate in challenging circumstances – such as during the COVID-19 pandemic and since Russia’s invasion of Ukraine in February 2022. However, declining response rates, sample biases, and data privacy concerns can have a significant impact on the quality of statistics. In a wider landscape of technological advances, statistics need to remain relevant, accurate and reliable – the increasing use of new and alternative data sources and advances in technology are opportunities for the statistical system to embrace.
The role of the Code is to provide a clear steer for those producing statistics on the standards to be applied to ensure that statistics command public confidence. We would like to hear from stakeholders across a wide range of settings on their thoughts about the suitability of the Code and on how it can be adapted to meet the challenges and opportunities on the horizon. The information provided will inform the OSR’s decision making on whether changes are required to the Code. The call for evidence will also inform how we support organisations that produce statistics who wish to apply the standards of the Code in a voluntary way.
Respond to our call for evidence now!
Please submit your response to this call for evidence by completing this MS Form (best for individual responses).
________________________________________________________________________________________
Alternatively you can download a Word version of the call for evidence (best for group and combined responses).
Email your Word responses to regulation@statistics.gov.uk or return by post to:
Office for Statistics Regulation, UK Statistics Authority, Statistics House, Cardiff Road, Newport, South Wales, NP10 8XG
________________________________________________________________________________________
This call for evidence runs from 18 September 2023 to 11 December 2023.
If you have any comments or feedback about the way this call for evidence has been conducted, please email: regulation@statistics.gov.uk.
Upcoming Events:
‘OSR’s Review of the Code of Practice for Statistics’ – in-person event at the Royal Statistical Society: 30 November 2023
The RSS and OSR are holding a joint event to discuss the OSR’s review of the Code of Practice for Statistics. There will be four ten-minute presentations – including by Ed Humpherson (head of OSR) and Paul Allin (RSS’s honorary officer for national statistics) – followed by an opportunity for discussion. The OSR is keen to hear RSS members’ views on the code, so please come and have your say.
The event will start at 4pm at RSS London, 12 Errol Street, EC1Y 8LX. Refreshments and networking will begin at 3.30pm. Book a place now.
Previous Events
Below you will find summaries of events related to futureproofing the Code of Practice for Statistics.
On 13 September 2023, the Office for Statistics Regulation (OSR) held a launch event for our review of the Code of Practice, to ensure it continues to effectively serve the production and regulation of government statistics. One of our newly recruited Regulators, Luke Boyce, summarises the event and what was discussed.
When I first joined OSR, I was excited to hear about the upcoming Code review project, to ensure it remains relevant. Coming directly from a statistics team elsewhere in government I was already familiar with the Code of Practice and its importance in relation to statistics production in government.
However, getting to apply the Code in my regulatory work these last four months has given me a newfound appreciation for its importance, especially the value pillar. Without value, statistics in government can’t serve the public good, but without a Code that reflects the current statistical system, this mission is difficult to achieve.
Clearly I wasn’t the only one excited to hear about what’s next for the Code of Practice for Statistics, with almost 300 people, both from across government and the general public, in attendance. The last full refresh of the Code of Practice was 5 years ago in 2018 and the consensus among guest speakers, OSR colleagues and those that participated in the Q&A, was one of strong enthusiasm for adapting the Code to underpin a statistical system that is rapidly changing.
Topics covered included Artificial Intelligence (AI) and its effect on society, the rise in administrative data use, live data and dashboards, and data linkage across government. Some of these topics will also be covered in later events during the Code review, so keep an eye out for them if you’re interested.
It was great to hear from a variety of speakers, both from inside and outside government and how the Code of Practice impacts their work.
Tracey Brown, Director of Sense About Science, talked about their mission to increase the public’s knowledge of evidence, directly in line with the Code of Practice, especially the trustworthiness and value pillars. She talked about the public’s increased interest in statistics in the post-pandemic world, why this puts increased emphasis on official statistics serving the public good, and why it’s important that we update the Code in line with the world we now live in.
Catherine Hutchinson, Head of the Evaluation Task Force at the Cabinet Office, talked about how trust in government, and the confidence of decision makers in government to use the evidence provided, can be strongly reinforced through intelligent transparency and by tackling the misuse of statistics, using the pillars of the Code. She explained that evaluating policy and operational decisions before full national or larger-scale implementation is important because it ensures public money is spent effectively. Doing this requires quality statistics and evaluation reports that the public can freely access, which an effective Code of Practice can enable.
Stephen Aldridge, Director for Analysis and Data at the Department for Levelling Up, Housing and Communities, described how the Code of Practice informs everyday work for every analytical team in government, and highlighted the need for a Code that takes into account the appropriate use of new technologies and techniques, such as AI and cross-government data linkage, to enable analysts to carry out new innovative work.
He also highlighted how the Code can support all analytical work, including published management information. He argued that flexibility in the application of the Code is important, to ensure it is easier to apply to different types of statistics including those outside government. He added that data dashboards are a great new emerging tool in government statistics that allow the public to access live data rather than having to wait for infrequent releases – however, these dashboards can sometimes miss vital insight and commentary. A Code refresh could emphasise the importance of demonstrating trustworthiness outside of a traditional bulletin and allow official statistics to exist in a live format.
At the end of the event, there were many questions, and enthusiasm for the Code review. Questions included how the review will address the growing interest in real time data, allow the development of statistics that serve the public good and are not just tied to policy priorities, and how the Code of Practice applies to published government figures that aren’t produced by statisticians.
The launch event was just the start of the Code Review. On Monday 18 September 2023 OSR launched an online Call for Evidence for you to share your feedback with us about the Code. There will also be several more panel events, focussing on areas including data quality, data ethics and AI, and user demands. I’m really excited to attend these events and I hope to see you there.
Panel 1 recording: Maintaining data quality
Maintaining data quality – an event summary
Data quality is critical for all statistics – get it wrong and there is a massive risk that users will be sent off in the wrong direction and perhaps seriously misled. It was the subject of OSR’s first Futureproofing the Code panel session. We are keen to hear from experts about their perspectives on some challenging issues that are influencing statistical practice today.
We had a great line up for the panel with: Iain Bell (National Director for Public Health Knowledge and Research at Public Health Wales), Sarah Henry (Director of Methodology & Quality at the Office for National Statistics), Roger Halliday (Chief Executive of Research Data Scotland), and rounded off by the pre-eminent Professor David Hand (Imperial College and Chair of the National Statistician’s Expert User Advisory Committee).
These are experts with wide-ranging experience who provided a rich background to answering our exam question: ‘In the light of concerns about survey response rates, use of personal data, and wider perceptions of the loss of trust in institutions, what can be done to manage risks to data quality?’
Iain gave us an important reminder to be transparent about quality, to build others’ confidence in the data and stats. He kicked off by noting that there are no perfect data and there never have been. The statistician’s job is to find out the pros and cons of data sources. Iain emphasised the importance of having better clarity about the data that institutions actually hold and need. He said that statisticians need to take greater responsibility for data and transparency in the ways they work – they need to apply the fundamental principles of being open, admitting when things aren’t as high quality as they’d like.
For Sarah, data quality is close to her heart and she has a passion for surveys. She strongly defended their value and emphasised their continued importance. While there are a wide variety of types of sources (including many new ones such as from barcodes), the more traditional sources such as surveys are still essential. Understanding the sampling frame is very important and gives us confidence in the validity of the data, even when we have smaller samples. Analysts need to be able to quantify and quality assure the data – and they need more independent sources to achieve this and verify the data. Survey data can also help analysts better understand administrative data, as during the Covid pandemic. Sarah emphasised the benefits of survey data in establishing a stronger connection with the respondent. Doing so can help improve response if we better explain the purpose for the data collection. Getting a rich granular picture for the data can better inform decision making.
Roger focused on improving data quality and statistics through use and sharing. He emphasised that having an independent source is not enough to accurately verify quality and there can be high risks in decisions being made on poor quality data. Roger highlighted the benefits of analysts getting out of the office and actually visiting those providing data – he has found putting a face to the data is valuable. Using and combining sources, such as bringing together data from public bodies, can help address poor quality. Developing a plan of what can be done now, as well as in the medium and long term, is also important.
David reminded us that how quality is viewed is centrally tied to purpose – what is good data for one purpose may be poor for another. Statistics are not a static output but changing, and it could be worth considering moving from being process oriented to being product oriented. David felt that there is a need to place more emphasis on local data and to connect to users. There are new opportunities with business productivity rising, and the use of new data types and sources that are encouraging new technologies. Adoption of these technologies, such as AI and language models through ChatGPT, also influences what users can do and need – these change over time. A key question for users to address is what we are trying to do and what we want the data for.
Panel 2 recording: Data Ethics and AI
Data ethics and AI – an event summary
We heard about the challenges of deep fakes and scientific misinformation, as well as the seven deadly sins in the big and open data house. And, thankfully, we learnt some steps that can be taken to counter them including the benefits of the UK Statistics Authority’s data ethical principles that fit neatly with the Code of Practice for Statistics.
Our speakers were Areeq Chowdhury from the Royal Society, Sabina Leonelli from the University of Exeter, and Helen Boaden, chair of the National Statistician’s Data Ethics Advisory Committee.
Areeq talked through some of the current challenges with scientific misinformation and around AI, highlighting the societal harms, the need for honest and open discussion and support for fact checkers. He flagged the importance of challenging assumptions and holding platforms to account. The need to engage the public is not limited to times of emergency but should occur continuously. He highlighted that there can be an over-correction, with a tendency for organisations to be cautious and over-apply data protection regulations. Other technical solutions are important too – using standardisation to support wider data use and establishing trusted research environments.
Areeq illustrated the challenges of generative AI and the creation of deep fakes. He highlighted some ways to mitigate the impact through establishing the digital content provenance through verification. The Royal Society is involved in a red team challenge of large language models to test the guard rails. Areeq also emphasised the importance of looking across disciplines to consider the AI safety risks and highlighted the difficulty for multilingual communities in receiving and understanding information. Watch the recording of the session to see Areeq’s own deep fake!
Sabina highlighted the rise of inductive reasoning and the logic of discovery as “data accumulation”. It suggests the more data the better to generate evidence and knowledge, with comprehensive data collection being a form of control. There is a wide appeal and perhaps mythology surrounding big data. Data are made, not given; they are partial and not comprehensive, and qualities do not always reduce to quantities.
Sabina described the seven deadly sins for big and open data houses: conservatism (the problem of old data), a house of cards (with unreliable data), convenience sampling (with partial data, selective and reinforcing inequalities in the digital divide), self-interest (the problem of dishonesty and lack of regulation applying to the dissemination of data), environmental damage (unsustainability and pollution from storing masses of data), and global inequity (with the problem of unfair data). She emphasised the importance of debunking big data mythologies.
Helen suggested that there is a lot that can be learnt from applying data ethics principles both for researchers and for OSR in strengthening the Code. She introduced the principles from the Centre for Applied Data Ethics in the UK Statistics Authority that can be used by researchers considering the use of AI. She emphasised the benefit of using the principles to minimise the potential harms, as well as enabling researchers to efficiently analyse data. The principles also help in managing confidentiality and data security, to ensure the appropriate agreements are in place, promoting public engagement in the use of the research and identifying the benefits. They help underscore the importance of being transparent about the access and sharing of data. The principles can be applied and used to consider a huge range of ethical concerns in different ways and are frequently applied to novel research, including various elements of AI.
Helen emphasised that understanding the context around the data and research is important for effective ethical practice. It also relies on collaboration, which can be international as well as national – AI goes beyond boundaries and there is a shared responsibility to use technology both ethically and appropriately.
Our third and final online session saw a panel of speakers, including Sir Ian Diamond, discuss how official statistics can remain relevant with greater demands for real-time data and the use of automation in data production.
Event Recording and summary coming soon.
We hope that our events will encourage you to complete our online Call for Evidence which will remain open until 11 December 2023.
Get in touch
If you would like to contact us regarding the review, please email us at regulation@statistics.gov.uk.