Making data accessible
Data provision comprises two elements:
- decisions about what data access to allow, to whom and for what purpose
- mechanisms for managing the supply of, or access to, the data
There can often be a mismatch between users’ and data providers’ perceptions of the data access process and how well it meets needs, resulting in frustration on both sides. Many data providers recognise that existing models are not always compatible with how users work and are developing new ways of working to accommodate users’ working practices. The principles set out in the Code of Practice for Statistics around data provision, and our associated expectations of what these mean for data providers, provide a framework that could help to better align users’ and data providers’ experiences. Data provision that fully supports the standards set out in the Code of Practice for Statistics requires the following:
- Communication and engagement with users – our expectations that statistics users will be informed about, and involved in, decisions about official statistics production extend to the provisioning of data. Data providers can hear most easily from their known, existing users – there is a danger that providers are unaware of the needs of potential new users and of the barriers limiting their access. Users of open data may require more outreach to find. Many complaints that users bring to OSR about data provisioning are a consequence of poor communication about changes to data access arrangements and how data services run. Good communication and engagement involve multiple formats, for example: steering groups or user groups; regular and ad-hoc direct contacts with established networks of users; user surveys; social media; articles and blog posts on relevant websites (including the locations where open data are accessed directly); dissemination via interested networks; organised events with users; attending and presenting at external events that users attend. Whichever modes are used, having an ongoing dialogue is critical.
- User-informed processes – data providers should seek, wherever possible, to ensure that the data provision process reflects how users work and will use the data. Understanding what users need and working to identify solutions that meet those needs are important parts of this process. For example, data users increasingly use automated tools to scrape data, but these won’t work if data are provided on sites with CAPTCHA tools, or if URLs change without notification or onward redirection. There will be times when users’ needs cannot be accommodated due to resources, data security considerations or IT infrastructure changes beyond the control of data providers. In these situations, it is important to be transparent about why needs cannot be met. Changes to application processes and data supply mechanisms should be developed in partnership with users to ensure that the changes meet users’ needs and do not have a detrimental impact on users’ work.
- Transparent processes – applying to access data requires proper scrutiny of various elements (for example, project purpose, public good served by using the data, ethics, researcher credentials, IT security) with systems in place to handle each aspect. Transparency about what is needed of applicants and their institutions for each element is essential. Applicants not supplying adequate information is a common reason for applications being rejected and needing to be resubmitted; data providers can help here by ensuring training and support are available to help users navigate the process. Reasons for rejecting applications must be made available to help users understand what more is needed of them, and to uphold the integrity of the decision-making process. Published information about likely timeframes from application submission to data being ready to access is also essential so applicants can plan resources accordingly (especially where staff will be recruited to use the data). This information should be based on relevant service performance monitoring data where possible. Once an application is submitted, users also need access to real-time information about its progress. Clear information about costs for initial applications and any further ongoing costs associated with the data provision must be available.
- Appeal and redress – there should be clearly signposted channels for applicants to raise complaints about data provision, including the facility to appeal decisions taken about applications.
- Professional development for staff involved in data provision – the professional skills required of the staff who support data provisioning should be recognised and developed in the same way that statisticians who produce statistical outputs are supported to develop their analytical skills. This development should also include building skills in engaging users in an ongoing dialogue.
- Coordination across departments in gaining approval – government departments should be working in a coordinated way to make data available to the research community efficiently. Researchers who have to obtain multiple approvals for linked data are exposed to inconsistent decision making and governance standards across different approvals panels, which can be inefficient and frustrating for them. A legal gateway exists for a coordinated single approvals approach across government, via the DEA, which has robust governance standards approved by Parliament. This approach provides a safe, efficient and consistent gateway for making data available to the research community.
Examples provided in: Bacon, S. and Goldacre, B. 2020. Barriers to working with NHS England’s Open Data. Journal of Medical Internet Research. 22:1. doi: 10.2196/15603