We are concerned about a Department for Education evidence document, “The case for a fully trust-led system”, that was published alongside the White Paper, “Opportunity for all: strong schools with great teachers for your child”.
We believe this document contains several serious statistical flaws and that, taken as a whole, it is misleading to the public.
In the School Performance section of the document, Tables 5 and 6 report the Ofsted ratings of schools by governance. However, they show the current governance of the schools, not the governance of the schools at the time the inspections were carried out. 4,432 maintained schools have converted to academy status since they were last inspected, so these tables misrepresent the ratings of schools by governance.
In the section High Quality and Inclusive Education, Tables 7, 8, 9 and 10 report the proportion of pupils reaching the expected standard in reading, writing and maths, and the Progress 8 scores for pupils, by school governance, and then rank them. No methodology is provided for these tables, and there is no discussion of the limitations of this approach. The Department have used these tables to make a series of claims:
“Despite this, the best MATs transform outcomes for pupils, particularly the most disadvantaged, and deliver improvement in schools and areas where poor performance had become entrenched. If all pupils did as well in reading, writing and maths at key stage 2 in 2019 as pupils in the MAT performing at the 75th percentile of MATs, national performance would have been 8 percentage points higher at 73%. At the 90th percentile this would have been 79%.
“For disadvantaged pupils, the increases would have been even greater – 10 percentage points at the 75th percentile and 19 percentage points at the 90th. The strongest trusts are relentlessly focused on using their expertise and resources to cater to the needs of all pupils, especially disadvantaged children or children with SEND.
“In secondary schools, we see a similar pattern. The top 10% of MATs outperform the highest performing LAs by 0.2 Progress 8 score. For disadvantaged pupils, the pattern is repeated, with a lower absolute but larger relative performance advantage.”
The tables do not report the average number of pupils in LAs, MATs and SATs. The LAs are much larger than the MATs and SATs, so the results for MATs and SATs show far greater variation: the highest-ranked small groups will appear to outperform the highest-ranked LAs simply because smaller groups produce more extreme averages. This comparison is therefore misleading, yet it is used to make claims about the effectiveness of MATs as compared with LAs.
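The effect of group size on such rankings can be illustrated with a short simulation. This is our own sketch, not the Department's method: every pupil outcome is drawn from the same distribution, so no group is genuinely better than any other, and the group sizes and counts below are invented purely for illustration.

```python
# Illustrative sketch: pupils in every group share one common outcome
# distribution, so any apparent "top performer" effect is noise.
import random
import statistics

random.seed(1)

def percentile_of_group_means(n_groups, pupils_per_group, pct):
    """Form groups of identical underlying quality and return the
    given percentile of their mean outcomes."""
    means = [
        statistics.mean(random.gauss(0, 1) for _ in range(pupils_per_group))
        for _ in range(n_groups)
    ]
    means.sort()
    return means[int(pct / 100 * len(means))]

# Many small trusts versus a few large local authorities
# (sizes are invented for illustration only).
small = percentile_of_group_means(n_groups=1000, pupils_per_group=500, pct=90)
large = percentile_of_group_means(n_groups=150, pupils_per_group=20000, pct=90)

print(f"90th percentile of mean outcomes, small groups: {small:.3f}")
print(f"90th percentile of mean outcomes, large groups: {large:.3f}")
```

Even though every group is identical in quality, the 90th-percentile small group outscores the 90th-percentile large group, because smaller samples produce more variable means. Comparing the best MATs against the best LAs without adjusting for this will mechanically favour the smaller units.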
Further, the tables do not report the proportion of pupils eligible for the pupil premium. This proportion varies between groups: the "best" MATs have a lower proportion of pupils eligible for the pupil premium. We would be surprised if the statisticians responsible for producing these tables had not pointed this out.
Also, the tables do not report the proportion of pupils attending grammar schools. The "best" MATs and SATs have a higher proportion of pupils attending grammar schools than the LAs. This will have a significant impact upon the pupils' Progress 8 scores.
There is variation in other factors which have a smaller impact on the comparability of Progress 8 results, such as the proportion of girls, the proportion of pupils with English as a second language, and the proportion of pupils with special needs. The premise of the report is that the characteristics of the "best" MATs are replicable; had the much higher proportion of girls in these MATs been reported, readers could better judge how likely it is that this policy will work.
Finally, the paper argues that "the best MATs transform outcomes for pupils, particularly the most disadvantaged". What is not said is that the "best" MATs for all pupils are not the same group as the "best" MATs for disadvantaged pupils, yet readers are led to believe that they are the same group.
We attempted to recreate these tables from the data sources listed in the footnote. We were not able to recreate them exactly. This is probably because we excluded schools with any suppressed data, but it could be because the DfE used the current governance of schools rather than their governance in 2019, when the tests were carried out. We have attached our analysis to this email.
We believe this data release has failed to meet the standards for trustworthiness and quality set out in the Code of Practice for Statistics:
T1.4 Statistics, data and explanatory material should be presented impartially and objectively.
T3.1 The release of both regular and ad hoc official statistics should be pre-announced through a 12-month release calendar, giving a specific release date at least four weeks in advance where practicable.
T3.7 The name and contact information of the lead statistician or analyst responsible for production should be included in the published statistics.
T3.8 Policy, press or ministerial statements referring to regular or ad hoc official statistics should be issued separately from, and contain a prominent link to, the source statistics. The statements should meet basic professional standards of statistical presentation, including accuracy, clarity and impartiality. The lead statistician or analyst should advise on the appropriate use of the statistics within these statements.
T4.1 Organisations should be transparent about their approach to public engagement with users, potential users, and other stakeholders with an interest in the public good served by the statistics.
Q1.4 Source data should be coherent across different levels of aggregation, consistent over time, and comparable between geographical areas, whenever possible.
Q1.5 The nature of data sources used, and how and why they were selected, should be explained. Potential bias, uncertainty and possible distortive effects in the source data should be identified and the extent of any impact on the statistics should be clearly reported.
Q2.1 Methods and processes should be based on national or international good practice, scientific principles, or established professional consensus.
Q2.4 Relevant limitations arising from the methods and their application, including bias and uncertainty, should be identified and explained to users. An indication of their likely scale and the steps taken to reduce their impact on the statistics should be included in the explanation.
Joint General Secretary