Understanding the complexities of crime statistics

In this blog, our Head Statistics Regulator for Crime and Security discusses the difficulties in understanding and interpreting crime statistics, and what OSR is doing to support producers in improving the quality of crime statistics for England and Wales.

Crime statistics are complex

Statistics on crime are widely used by politicians, governments, researchers, the media, and the public to try to understand the extent and nature of crime. Often, the questions that people want to know the answers to seem relatively straightforward: Is crime going up or down? What types of crime are most common? How reliable are crime statistics? Is it possible to measure all crimes? But answering these seemingly simple questions can be surprisingly difficult.

Understanding and interpreting crime statistics for England and Wales is complex. This is mainly because there are two data sources on crime: statistics from the Crime Survey for England and Wales (CSEW), a household survey of individuals’ experience of crime; and police recorded crime statistics, which capture the number of crimes reported to and recorded by the police. These statistics are published quarterly by the Office for National Statistics (ONS).

Both data sources have their strengths and limitations. The CSEW is the best source for understanding long-term trends in crime covered by the survey. This is because the survey methods have changed little in the last 40 years and the survey is not affected by changes to police crime recording practices or people’s willingness to report crime to the police. In addition, the survey captures crimes that aren’t reported to the police.

On the other hand, the survey doesn’t capture all crimes. For example, as it’s a household survey, it doesn’t capture crimes against businesses and organisations such as shoplifting. There are also challenges with the survey’s response rate, among other factors affecting the quality of the statistics, which led to the temporary suspension of the statistics’ accreditation.

The police recorded crime statistics are a better indicator of police activity than of trends in crime, because many crimes are not reported to the police. However, the statistics do provide insight into some higher-harm but less-common crimes such as homicide or knife crime, which the CSEW does not cover or does not capture well.

The police recorded crime statistics also cover a broader range of offences than the CSEW because the police also record crimes against businesses and organisations and crimes against society and the state, such as drug offences and public order offences. And the police recorded crime statistics are more granular than the CSEW statistics – the number of offences is broken down by police force area.

Due to these strengths and limitations, it’s important to look at both sources together to get the most complete understanding of crime in England and Wales. ONS’s Crime trends in England and Wales article provides a good guide on how to interpret both sources. It explains which source is best for which purpose. For example, it recommends using CSEW statistics to look at trends in fraud but recommends using police recorded crime statistics to look at trends in knife crime.

Our work on crime statistics for England and Wales

Crime statistics are a priority area for our regulatory work. It’s been a particularly busy period for regulatory work on crime statistics, and the coming months will continue to be busy. Our main focus has been the quality of the statistics, and in particular the question ‘How reliable are the statistics?’.

Today, we published a detailed report on the quality of the police recorded crime statistics for England and Wales. Our review took stock of how data quality has improved since 2014, when we removed the accreditation of the statistics due to quality concerns. We found that police forces have made significant improvements to crime recording in the last ten years. This has given us greater confidence in the quality of the data. But we found some gaps in the Home Office’s oversight of police force data quality and in ONS’s communication of quality that we have asked to be addressed.

One subset of the police recorded crime statistics that we didn’t look at in our review is fraud and computer misuse statistics. That’s because the process for recording these crime types is different from that used for other crime types. We’re aware of the increased public debate about the scale of fraud and its impact on victims. To give this topic the attention it deserves, we’re doing a separate review of the quality and value of fraud and computer misuse statistics. We’ll publish the review later this year.

Like other UK household surveys, the CSEW has suffered from a lower response rate since the pandemic, which has affected the quality of the statistics. We will soon be reviewing the quality of the CSEW statistics with a view to reaccrediting them.

We recognise that crime will be an important issue in the upcoming UK General Election. To support the appropriate use of crime statistics, we will be publishing a ‘What to watch out for’ explainer at the end of May that provides some tips and advice and sets out some of the common mistakes in public statements about crime that we have seen. It explains that it’s always better to look at the CSEW and police recorded crime statistics together to get an overall picture of crime in England and Wales.

Through this range of work, we are gaining a good understanding of the current state of crime statistics for England and Wales, helping us to support public confidence in the quality and value of the statistics and to continue to promote their appropriate use.


Related Links:

The quality of police recorded crime statistics for England and Wales

 

The new knife crime methodology making police analysts’ jobs easier

In our latest blog, OSR Regulator and former Police Information Analyst Ben Kendall Ward discusses how the Home Office’s National Data Quality Improvement Service (NDQIS) is improving the quality of police recorded crime data.

Prior to joining OSR, I worked for the police as an Information Analyst for seven years. One of my duties was to collate knife crime data and send it to the Home Office on a quarterly basis. Police officers would input crimes onto a system, flagging whether a knife or sharp instrument was involved in some form, but the quality was poor and officers often forgot to add these markers to the crime records.

As a result, I would often manually read through all the relevant crimes for each financial quarter to determine if a knife or sharp object had been used and, if there was a threat, how likely the threat was.  Reading through hundreds of records and marking them before collating them to send to the Home Office was a laborious process.  

I started at OSR seven months ago, and one of my first projects was working on a review of the Office for National Statistics (ONS) and Home Office’s knife-enabled crime statistics for England and Wales.  

Realising that the quality of police recorded data on knife crime and other so-called ‘flagged’ offences was poor, the Home Office set up a National Data Quality Improvement Service (NDQIS), which looked at using computer-aided classification to tackle this issue, starting with knife crime as a guinea pig. Unfortunately, this tool wasn’t fully developed until shortly before I left the police. I would have loved to have had it available when I started, since it would have made my job so much easier.

The NDQIS tool works by first checking whether the offence is one which the Home Office considers when looking at knife-enabled crime (for example, if the crime was burglary, the tool wouldn’t check any further, since it’s not an offence the Home Office considers for knife crime). The tool then scans the details of the crime, including the free-text fields that describe what occurred, as recorded by the call handler or police officer, looking for key terms like “stabbed the victim” or “threatened with a knife”.

Once the NDQIS tool has done this, it places each crime into one of three categories (a simplified sketch of this triage logic follows the list):

High confidence: there is a high degree of certainty that a knife or sharp object was used in the crime, so the record doesn’t need further review.

Low confidence: the tool is uncertain whether a knife or sharp object was used, so the record requires manual review.

Rejected: the tool has determined that a knife or sharp object was not used in the crime, or that the offence is a possession-only offence, which is excluded.
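To make the two-stage process more concrete, here is a highly simplified sketch in Python. The offence lists, key terms and decision rules are hypothetical placeholders invented for illustration; they are not the Home Office’s actual NDQIS implementation, which is considerably more sophisticated.

```python
# Illustrative sketch only: a toy version of the rule-based triage described
# above. The offence lists, key terms and rules are hypothetical examples,
# not the Home Office's actual NDQIS implementation.

KNIFE_ENABLED_OFFENCES = {"robbery", "assault with injury", "threats to kill"}  # hypothetical
POSSESSION_ONLY_OFFENCES = {"possession of a bladed article"}                   # hypothetical
STRONG_TERMS = ("stabbed the victim", "threatened with a knife")                # hypothetical
WEAK_TERMS = ("knife", "blade", "sharp object")                                 # hypothetical


def triage_record(offence: str, free_text: str) -> str | None:
    """Place a crime record into one of the three NDQIS-style categories."""
    offence = offence.lower()
    text = free_text.lower()

    # Step 1: only offences considered for knife-enabled crime are assessed
    # at all (e.g. burglary would be skipped entirely).
    if offence not in KNIFE_ENABLED_OFFENCES | POSSESSION_ONLY_OFFENCES:
        return None  # out of scope, not assessed

    # Possession-only offences are excluded from the knife-enabled count.
    if offence in POSSESSION_ONLY_OFFENCES:
        return "Rejected"

    # Step 2: scan the free-text description for key terms.
    if any(term in text for term in STRONG_TERMS):
        return "High confidence"  # no further review needed
    if any(term in text for term in WEAK_TERMS):
        return "Low confidence"   # needs manual review by an analyst
    return "Rejected"             # no evidence a knife/sharp object was used


print(triage_record("robbery", "Suspect threatened with a knife and stole a phone"))
# -> High confidence
```

In practice the value of this kind of triage is that only the “low confidence” records need a human eye, which is what removes most of the manual trawling described above.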

The Impact of NDQIS

This is a really useful development that reduces the administrative burden on police officers, removes the need for Information Analysts to manually review and mark up the data, and also ensures the statistics are as accurate and as valuable as possible.

We think that other organisations can really learn a lot from this work, so we’ve asked ONS and the Home Office to publish a development plan setting out publicly how and when the NDQIS tool will be rolled out more widely.

Knife crime is a serious offence that often affects the young and disadvantaged, and it can have tragic consequences. It’s a high-priority policy area for the UK Government, featuring prominently in the Government’s Beating crime plan. To fully understand and tackle the problem posed by knife crime, the data collected by police forces must be of high quality and accurately reflect trends in this type of crime, so that it can inform public policy.

The people behind the Office for Statistics Regulation in 2020

This year I’ve written nine blogs, ranging from an exploration of data gaps to a celebration of the armchair epidemiologists. I was thinking of making it to double figures by setting out my reflections across a tumultuous year and describing my pride in what the Office for Statistics Regulation team has delivered. But, as so often in OSR, the team is way ahead of me. They’ve pulled together their own year-end reflections into a short summary. Their pride in their work, and their commitment to the public good of statistics, really say far more than anything I could write; it’s just a much better summary.

So here it is (merry Christmas)

Ed Humpherson

Donna Livesey – Business Manager

2020 has been a hard year for everyone, with many very personally affected by the pandemic. Moving from a bustling office environment to living and working home alone had the potential to make for a pretty lonely existence, but I’ve been very lucky.

This year has only confirmed what a special group of people I work with in OSR. Everyone has been working very hard, but we have taken time to support each other, to continue to work collaboratively to find creative solutions to new challenges, and to generously share our lives, be it our families or our menagerie of pets, albeit virtually.

I am so proud to work with a team that have such a passion for ensuring the public get the statistics and data they need to make sense of the world around them, while showing empathy for the pressures producers of statistics are under at this time.

We all know that the public will continue to look to us beyond the pandemic, as the independent regulator, to ensure statistics honestly and transparently answer the important questions about the longer-term impacts on all aspects of our lives, and our children’s lives. I know we are all ready for that challenge, as we are all ready for that day when we can all get together in person.

 

Caroline Jones – Statistics Regulator, Health and Social Care Lead

2020 saw the nation go into lockdown, gripped by the COVID-19 pandemic and avidly perusing the daily number of deaths, number of tests, volume of hospitalisations and, later, number of vaccinations. This level of anxiety pushed more people into contacting OSR to ask for better statistics, and it has been a privilege to work at the vanguard of the improvement to the statistics.

To manage the workload, the Health domain met daily with Mary (Deputy Director for Regulation) and Katy, who manages our casework, so we could coordinate the volume of health-related casework we were receiving. We felt it important to deal sympathetically with statistics producers, who have been under immense pressure this year, while encouraging them to change their outputs so that they were producing the best statistics possible. It’s been rewarding to be part of that improvement and change, but we still have a lot of work to do in 2021 to continue to advocate for better social and community care statistics.

 

Leah Skinner – Digital Communications Officer

As a communications professional who loves words, I very often stop and wonder how I ended up working in an environment with so many numbers. But if 2020 has taught me anything, it’s that communicating those numbers in a way that people can understand is crucial to ensuring the public have trust in statistics.

This has made me reflect on my own work, and I am more determined than ever to make our work, complex as it can be, as accessible and as understandable to our audiences as possible. For me, the highlight of this year has been watching our audience grow as we have improved our Twitter outputs and launched our own website. I really enjoy seeing people who have never reached out to us before contacting us to work with us, whether it be to do with Voluntary Application of the Code, or to highlight casework.

As truly awful as 2020 has been, it is clear now that the public are far more aware of how statistics affect our everyday lives, and this empowers us all to ask more questions about the quality and trustworthiness of data and to hold organisations to account when the data isn’t good enough.

 

Mark Pont – Assessment Programme Lead

For me, through the challenges of 2020, it’s been great to see the OSR team show itself as a supportive regulator. Of course we’ve made some strong interventions where these have been needed to champion the public good of statistics and data. But much of our influence comes through the support and challenge we offer to statistics producers.

We published some of our findings in the form of rapid regulatory review letters. However, much of our support and challenge was behind the scenes, which is just as valuable.

During the early days of the pandemic we had countless chats with teams across the statistical system as they wrestled with how to generate the important insights that many of us needed. All this in the absence of the usual long-standing data sources, and while protecting often restricted and vulnerable workforces who were adapting to new ways of working. It was fantastic to walk through those exciting developments with statistics producers, seeing first-hand the rapid exploitation of new data sources.

2021 will still be challenging for many of us. Hopefully many aspects of life will start to return to something closer to what we were used to. But I think the statistical system, including us as regulators, will start 2021 from a much higher base than 2020 and I look forward to seeing many more exciting developments in the world of official statistics.

 

Emily Carless – Statistics Regulator, Children, Education and Skills Lead

2020 has been a challenging year for producers and users of children, education and skills statistics, and it has had a life-changing impact on the people the statistics are about. We started the year polishing the report of our review of post-16 education and skills statistics, and we are finishing it polishing the report of our review of the approach to developing the statistical models designed for awarding grades. These statistical models had a profound impact on young people’s lives and on public confidence in statistics and statistical models.

As in other domains, statistics have needed to be developed quickly to meet the need for data on the impact of the pandemic on children and the education system, and to inform decisions such as those around re-opening schools. The demand for statistics in this area continues to grow to ensure that the impact of the pandemic on this generation can be fully understood.

“Wouldn’t it be cool if…

…we could look at this against x! And y. And maybe a, b and c too…”

This felt like quite a common conversation with my team, back when I was analysing data in the Department for Digital, Culture, Media and Sport (DCMS) circa 2015.

The number of interesting questions and analyses we could do with our data, if we could only put it together with other data, felt potentially limitless. And what an amazing benefit these analyses could have to society – we’d basically be able to understand and improve everything!

But it wasn’t meant to be. We did try to match our survey data with data held by one other department and… it was painful! It took months to get to the point of being able to physically share and receive data and, once we had some data, getting it ready to analyse proved tricky too. In fact, it proved so difficult that, I’m ashamed to admit, I moved roles before I managed it.

OSR also continues to emphasise the power of linked data to produce better statistics. On paper, linking data sets might sound simple but, in practice, it is often difficult. This is why I’m so excited about the recent work we’ve seen from the Ministry of Justice (MoJ). MoJ is taking great steps to link up the administrative data sets it generates in its operational work, and to make them available for analysis by people outside of the department. This means that MoJ, and other interested parties, can more easily do analysis across different parts of the justice system, and beyond, to understand the journeys individuals take.

There are two projects I’d like to highlight:

         1. Data First

In collaboration with ADR UK (Administrative Data Research UK), MoJ is undertaking an ambitious data linkage project called ‘Data First’. OSR’s 2018 review of The Public Value of Justice Statistics highlighted the need for statistics that move from counting people as they interact with specific parts of the justice system to telling stories about the journeys people take. Data First is doing just that! It will anonymously link data from across the family, civil and criminal courts in England and Wales, enabling research on how the justice system is used and enhancing the evidence base to understand ‘what works’ to help tackle social and justice policy issues.

In June, we were delighted to hear that Data First reached its first major milestone. The first research-ready dataset – a de-identified, case-level dataset on magistrates’ court use – was made available to accredited researchers through the Office for National Statistics (ONS) Secure Research Service (SRS). This data provides insight into the magistrates’ court user population, including the nature and extent of repeat users. It enables researchers, for the first time, to establish whether a defendant has entered the courts on more than one occasion, and it will drive better policy decisions to reduce frequent use of the courts. In August, a second output followed, this time a de-identified, research-ready dataset on Crown Court use. This dataset is also available through the SRS.

         2. Data shares with the Department for Education (DfE)

To improve understanding of the potential links between individuals’ educational outcomes and characteristics and their involvement, or risk of involvement, with crime and the criminal justice system, MoJ and DfE have created a de-identified, individual-level dataset, which links data from the Police National Computer (MoJ) and the National Pupil Database (DfE)[1]. The DfE data spans educational attainment, absence from school, exclusions and characteristics like special educational needs and free school meals eligibility. The MoJ data includes information on criminal histories and reoffending, court proceedings, prison and assessments of offenders. Linking this data will allow analysis that has previously not been possible, including: longitudinal analysis of trends in individuals’ characteristics and outcomes; analysis to inform the design of policies and processes that better support those at risk; and evaluations of the effectiveness of interventions. Accredited researchers can apply to access the data via the ONS SRS or MoJ’s Justice MicroData Lab.

This work follows The Children in Family Justice Data Share (CFJDS)[2], which started in 2012 and has resulted in a database of child-level data linked from across the MoJ, DfE and the Children and Family Court Advisory and Support Service (Cafcass). The CFJDS provides, for the first time, longitudinal data on the short and medium-term outcomes for children who experience the family justice system. The data are being used to build understanding of how different experiences and decisions made within the family court can impact on children’s educational outcomes, and subsequently, their life chances. In turn, they will provide more robust evidence on which to make policy decisions for children and their families.

What’s really exciting about both these projects is the way that the teams involved are tackling the challenges of data linkage. Instead of creating a big new IT system to try to join up the data, these projects are starting from a position of, “let’s take what’s in the current databases and see what we can get through anonymised matching.” The exact tools used vary between teams and departments but include established tools such as SAS Data Management Studio and SQL Server Management Studio (SSMS), which were used by MoJ and DfE respectively for linking crime and justice and NPD data. For data linkage done as part of Data First, MoJ have developed a new tool called Splink, which was written in the programming language Python. Splink is an open source library for probabilistic record linkage at scale: it’s free, and MoJ hope others in government (and beyond) will find it useful for their own data linkage and deduplication tasks. Rule-based matching algorithms, including ‘fuzzy-matching’ algorithms – rules used to link data based on non-perfect matches between data variables – have been used to link individuals within and between data sets.
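To give a flavour of what rule-based ‘fuzzy matching’ looks like, here is a minimal, hypothetical sketch in Python using only the standard library. It is not Splink or the departments’ actual pipelines; the records, fields, matching rule and threshold are invented purely for illustration.

```python
# Minimal illustration of rule-based 'fuzzy matching' between two datasets,
# using only Python's standard library. This is NOT Splink or the departments'
# real pipelines; the records, rules and threshold are made up for illustration.
from difflib import SequenceMatcher

moj_records = [
    {"id": "A1", "name": "Jonathan Smith", "dob": "1990-04-12"},
    {"id": "A2", "name": "Priya Patel",    "dob": "1985-11-03"},
]
dfe_records = [
    {"id": "B7", "name": "Jon Smith",   "dob": "1990-04-12"},
    {"id": "B9", "name": "Priya Patel", "dob": "1985-03-11"},  # transposed date: no link
]


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 for two name strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(records_a, records_b, threshold=0.75):
    """Link records when dates of birth match exactly and names are close enough."""
    links = []
    for a in records_a:
        for b in records_b:
            if a["dob"] == b["dob"] and name_similarity(a["name"], b["name"]) >= threshold:
                links.append((a["id"], b["id"]))
    return links


print(link(moj_records, dfe_records))  # -> [('A1', 'B7')]
```

Linkage at the scale MoJ works at needs much more than this toy rule: probabilistic scoring across many fields, ways of limiting the number of pairwise comparisons, and careful de-identification. That is the kind of probabilistic record linkage at scale that Splink is designed for.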

These projects show what can be achieved when government departments, agencies and external organisations work together, and will help us start to achieve what my team and I hoped we could back in 2015. They will enable us to better understand individuals and society and, in turn, to make better decisions and policies, which will improve the justice system and outcomes for all individuals. I’m looking forward to seeing what comes next.

 

[1] To ensure the confidentiality and protection of data about children, access to DfE data extracts from the NPD is managed through tightly controlled processes.

[2] https://www.gov.uk/government/statistics/family-court-statistics-quarterly-october-to-december-2017, published 29 March 2018