Collecting and reporting data about sex and gender identity in official statistics: A guide for official statistics producers

Published: 29 February 2024
Last updated: 17 December 2024

Deciding whether to collect

OSR’s expectations

The Code states that data sources should be based on definitions and concepts that are suitable approximations of what the statistics aim to measure. In addition, the collection, access, use and sharing of statistics and data should be ethical and for the public good. 

It is the responsibility of official statistics producers to determine whether there is a legitimate need to collect data about sex, gender identity, both or neither. Producers should also familiarise themselves with any legal obligations relating to their data collections, such as the Public Sector Equality Duty for England, Scotland and Wales, and the Section 75 duties for Northern Ireland. If producers are unsure what information they should collect, they should seek advice from their Head of Profession for Statistics, Chief Statistician or Lead Official in the first instance. 

Producers should consider the principle of data minimisation set out in the General Data Protection Regulation (GDPR), namely that “Personal data shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed”. This means that producers should collect information on an individual’s sex or gender identity only if there is a need to do so. In addition, producers should consider collecting data on both sex and gender identity only if there is a clear user or legal need that cannot be met by collecting either data item individually. This is especially relevant where collecting a combination of sex and gender identity data means that producers are directly or indirectly collecting information on an individual’s transgender status, which may be considered special category data in some circumstances. See ‘Legal requirements’ for more information. 

Statistics should be consistent and coherent with related statistics and data where possible. Producers should ensure that they are familiar with how producers of related statistics are collecting data about sex or gender identity. Where it is not possible to be coherent with related statistics and data, producers should give reasons for the deviation and explain any implications for use. 

Coherence is especially important to consider for datasets that are, or could be, linked together. The use of inconsistent terms or data across such datasets has the potential to cause problems when linking them together. It is key that producers are clear on the terminology and definitions being used in their data collections to support accurate data linkage and subsequent analysis of the data. 

When deciding whether to collect data about sex or gender identity, producers should consider the likely quality of the data that they will receive. For surveys, there are many factors that could impact the quality of the data collected, for example, question acceptability, low response rates, response biases, mode effects and small sample sizes. For administrative data, the quality of the data can be affected by factors such as whether the information is provided by the individual or answered on their behalf.  

Producers should be aware that, when trying to capture information on small sub-groups of the population, false positives will have a greater impact. By false positive, we mean a case where someone reports that they are part of the target group when they are not. The effect of these incorrect responses is amplified in small groups, because even a low rate of misreporting among the much larger remainder of the population can account for a substantial share of the reported group. This may then affect the quality of the resulting statistics. If the data are unlikely to be of sufficient quality to meet the needs of users, producers should consider whether it is appropriate to collect these data. 
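The amplification described above can be illustrated with a short calculation. The following is a minimal sketch, not an OSR method: the function name, the prevalence values and the false positive rate are all hypothetical figures chosen for illustration, and the sketch assumes that every true member of the group reports correctly.

```python
def false_share_of_reported_group(true_prevalence, false_positive_rate):
    """Share of people reported as being in a group who are not in it.

    Illustrative assumptions (not taken from the guidance):
    - true_prevalence: proportion of the population truly in the group
    - false_positive_rate: proportion of people NOT in the group who
      are nevertheless recorded as being in it
    - everyone truly in the group is recorded correctly
    """
    true_positives = true_prevalence
    false_positives = (1 - true_prevalence) * false_positive_rate
    return false_positives / (true_positives + false_positives)


# Hypothetical figures: the same misreporting rate applied to a small
# sub-group and to a large one.
misreport_rate = 0.001  # 0.1% of non-members recorded as members
small_group = false_share_of_reported_group(0.005, misreport_rate)
large_group = false_share_of_reported_group(0.5, misreport_rate)

print(f"Small group (0.5% of population): {small_group:.1%} of reported members are false positives")
print(f"Large group (50% of population): {large_group:.1%} of reported members are false positives")
```

Under these assumed figures, the same misreporting rate accounts for roughly one in six reported members of the small group but only about one in a thousand reported members of the large group. Real collections will also be affected by false negatives and other sources of error, which this sketch leaves out.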

There will be times when producers are not able to meet the needs of everyone who has an interest in the collection of data about sex or gender identity. One example is when the needs of data users differ from the needs of the individuals providing their data, such as survey respondents. A survey may, for instance, collect data about sex while some respondents report that they would prefer to provide their gender identity. We acknowledge that these situations can be difficult for producers to navigate and that there may not always be a clear answer that will meet the needs of everyone involved. This may be especially true for statutory data collections where there is little ability to change the collection. In these instances, producers should clearly explain why certain information is or is not being collected and be transparent and open about their decision-making processes and the evidence and priorities used to inform their choices. 

Questions for statistics producers to consider

Do you need to or want to report on sex?

While it is likely that these data are already being collected, if you are looking to report on sex you should review whether the way the data are captured meets your reporting requirements. 

Ask yourself: 

  • Do you have a clear reason to collect information on sex? 
  • Are you clear on how you are defining sex? 
  • Does your definition of sex meet the needs of the different users of your statistics? 
  • Do you understand what data you can, or should, collect about an individual’s sex to comply with relevant legislation? 

 

Do you need to or want to report on gender identity?

If you are looking to report on gender identity, it is likely that you will need to introduce a new collection or source these data. This may require amending your data collection or asking a new question. However, you should also consider if the information is already captured through other sources that you can use, for example, through data sharing agreements where appropriate. 

Ask yourself: 

  • Do you have a clear reason to collect information about gender identity? 
  • Are you clear on what you mean by gender identity? 
  • Does your definition of gender identity meet the needs of the different users of your statistics? 
  • Have you considered relevant guidance and legislation relating to the collection of data about gender identity? 
  • Have you considered the likely quality of the data and determined that it will be sufficient to meet the needs of your users? 

If you are considering collecting data about both sex and gender identity, you should ask yourself all of the questions set out above. In addition, you should consider whether you will be collecting special category data and, if so, evaluate what additional arrangements you will need to meet legislative requirements. 
