Sats boycott: how were national results affected?

Monday, July 26th

Most journalists – I like to think, at least – try to probe beneath the surface in seeking some insight into what is going on. Sometimes this process can be frustrating, especially when evidence comes in that does not fit a preconception, a hypothesis… or a strong news line.

Late last week, I had some experience of this when doing some number-crunching on this year’s national test results.

I wanted to find out if the thousands of schools boycotting this year’s tests – some 26 per cent of the total – would have any effect on the national data generated at the end of the process.

My thinking went along the following lines. I knew that strenuous efforts were made by the testing authorities – principally, the Qualifications and Curriculum Development Agency (QCDA) – to hold the standard of the tests constant every year.

That is: the QCDA oversees a complex process, including “pre-testing”, whereby it is supposed to ensure that it is just as easy – or difficult – for a child of a certain ability and understanding of their subject to achieve a certain level one year, as it would be the next.

The argument runs that when entire national cohorts of 11-year-olds take the tests every year, and the difficulty* of the tests is held constant, the numbers passing them should give a good idea of the overall national standard of understanding of that particular cohort, in the tested parts of each subject at least.

However, what if, in the case of a boycott, the average ability** levels of pupils taking the tests changed from one year to the next? What if pupils from the boycotting schools came overwhelmingly from those who would have been expected to achieve the Government’s target level very easily? In that case, assuming no major change in the overall national ability profile compared to the previous year, the sample of pupils taking the tests would have been skewed, because many of the high-achieving pupils would have been taken out of it by the boycott.

If the tests were just as hard as in previous years, results would then fall because of the number of likely high achievers taken out of the test-taking “sample”. But this would say more about the effect of the boycott than it would reflect any fall in national standards overall.

Conversely, if the boycotting schools tended to have a pupil population of lower-than-average ability in the tested subjects, then results might rise as the proportion of high-ability youngsters taking the tests went up.
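A toy calculation can make the two scenarios concrete. The sketch below uses invented figures throughout: a cohort of 100 pupils split evenly between higher-scoring and lower-scoring schools, with 26 pupils removed by a boycott to echo the 26 per cent of schools that took part. The group sizes, the pass rates and the `pass_rate` helper are assumptions for illustration only, not real data.

```python
def pass_rate(groups):
    """Percentage of test-takers reaching the expected level.

    groups: list of (number_of_pupils, percent_reaching_level) pairs.
    """
    total = sum(n for n, _ in groups)
    passed = sum(n * rate / 100 for n, rate in groups)
    return 100 * passed / total

# Invented cohort: 50 pupils in higher-scoring schools (90% reach the level),
# 50 pupils in lower-scoring schools (70% reach the level).
full_cohort = pass_rate([(50, 90), (50, 70)])

# Scenario 1: the boycott removes 26 pupils, all from higher-scoring schools.
scenario_1 = pass_rate([(24, 90), (50, 70)])

# Scenario 2: the boycott removes 26 pupils, all from lower-scoring schools.
scenario_2 = pass_rate([(50, 90), (24, 70)])

# Evenly spread boycott: 13 pupils removed from each group.
even_boycott = pass_rate([(37, 90), (37, 70)])

print(f"Full cohort:  {full_cohort:.1f}%")
print(f"Scenario 1:   {scenario_1:.1f}%")
print(f"Scenario 2:   {scenario_2:.1f}%")
print(f"Even boycott: {even_boycott:.1f}%")
```

In this made-up example the headline figure falls in the first scenario and rises in the second, although no pupil's performance has changed. Only a boycott spread evenly across the two groups leaves the figure where it was.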

A couple of weeks ago, after getting hold of the list of schools which boycotted the tests, I began to wonder if the second scenario might hold true. I wrote an analysis piece which ranked the regions of England on take-up for the boycott.

Although finding precise patterns was tricky, it did seem that regions with higher proportions of pupils eligible for free school meals tended to have higher support for the boycott, while those with lower free school meal eligibility tended to have higher numbers of pupils taking Sats as normal.

A great deal of research has established that pupils eligible for free school meals are less likely to do well in tests. I thought this might mean that the boycott had disproportionately taken out of the test-taking sample children who might have been expected, on average, not to do well.

If this were the case, I thought, even if the results of those taking the tests rose, this might say less about national standards rising, and more about the effects of the boycott.

But this was only a hypothesis. Fortunately, it was testable, if only in a rough way. I got hold of the national test results for 2009 of all primary-age schools in England. I then checked to see if the schools which boycotted the tests this year had higher-than-average results in 2009, or lower-than-average.
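For readers who want to try a similar check, one way of setting it up is sketched here. Everything in it is hypothetical: the school identifiers, the pupil counts, the record layout and the choice to weight by pupil numbers are assumptions for illustration, not the actual data or necessarily the method used for this piece.

```python
# Hypothetical 2009 records: (school_id, pupils_tested, pupils_at_level_4).
results_2009 = [
    ("A", 30, 24),
    ("B", 60, 51),
    ("C", 45, 36),
    ("D", 25, 20),
]

# Hypothetical set of schools that boycotted the 2010 tests.
boycotted_2010 = {"B", "C"}

def percent_level_4(records):
    """Pupil-weighted percentage reaching level four across the given schools."""
    pupils = sum(tested for _, tested, _ in records)
    reached = sum(at_level for _, _, at_level in records)
    return 100 * reached / pupils

national = percent_level_4(results_2009)
boycott = percent_level_4(
    [record for record in results_2009 if record[0] in boycotted_2010]
)

print(f"All schools:        {national:.1f}%")
print(f"Boycotting schools: {boycott:.1f}%")
print(f"Gap (boycott minus national): {boycott - national:+.1f} points")
```

A gap close to zero would suggest the boycotting schools looked much like the rest on this measure in the earlier year. A large gap in either direction would point to a skewed test-taking sample.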

And the answer? Well, to my surprise their pupils’ performance, at least in terms of the numbers achieving the Government’s “expected” level four, was almost spot on the national average last year by my calculations.

Nationally, 80 per cent of pupils achieved level four in English, 79 per cent in maths and 88 per cent in science in 2009. Among the schools which boycotted the tests this year, the respective percentages were 79.6, 79.2 and 88.5. The average points score, which the Government calculates by simply adding up the percentage of level fours across all three subjects in each school, was 247 nationally last year and 247.4 in the 2010 boycott schools, I found.
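As a rough check on that arithmetic, the percentages quoted above can be added up directly. The sketch uses only the rounded figures given in the text; the variable and function names are illustrative. Note that the rounded boycott-school figures sum to 247.3 rather than the 247.4 quoted, a difference that would be consistent with the quoted total having been worked out from unrounded percentages, though that is an assumption.

```python
# Percentages of pupils reaching level four in 2009, as quoted in the text.
national_2009 = {"English": 80, "maths": 79, "science": 88}
boycott_2009 = {"English": 79.6, "maths": 79.2, "science": 88.5}

def aggregate(percentages):
    """Sum of the level-four percentages across the three subjects."""
    return round(sum(percentages.values()), 1)

# Subject-by-subject gap: boycott schools minus the national figure.
gaps = {
    subject: round(boycott_2009[subject] - national_2009[subject], 1)
    for subject in national_2009
}

print("National aggregate:", aggregate(national_2009))
print("Boycott-school aggregate (from rounded figures):", aggregate(boycott_2009))
print("Gaps by subject:", gaps)
```

On these figures, no subject differs from the national percentage by more than about half a percentage point.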

So, there I was expecting to write about how the sample had been skewed massively by the effect of the boycott, and how we should be able to read even less into the national scores than ever this year. Yet the numbers appear not to support such a conclusion.

That said, this may just be luck on the part of the Government. I have checked with the QCDA, whose answer suggests to me that it did not investigate the characteristics of pupils in schools boycotting the tests this year.

It said that its job was to ensure the tests were kept at a constant difficulty level each year and that this process was not affected by the industrial action.

In other words, if for any reason only a proportion of the cohort took the tests in any year and this affected what could be read into the results as a guide to overall national standards, this was not a matter for the QCDA because it was concerned with the difficulty of the tests themselves, rather than the nature of the cohort taking them.

That answer strengthens my belief in the merits of national sample tests/assessments as measures of overall standards for England as a whole. Under such a system, the testing authorities would by design select only a proportion of pupils to take tests each year, but would do so in a way that ensured the sample was nationally representative. In other words, they would have to control the sample of pupils taking the tests each year to ensure the results could be read as standing for England as a whole.

By contrast, if the Government sticks with the current testing model, any future boycott action could have an effect on what could be read into the results as guides to national standards.

I think this is more evidence that the problem with the current system is that these tests have multiple purposes: they are not designed solely as a check on national standards, of course, but also to check on schools, teachers and pupils, as well as providing evidence for many other aspects of education. So the QCDA seeks to ensure that the tests are comparable to previous years for pupils and schools, but does not see it as its job to investigate whether a change in the sample of pupils taking the tests has affected what the results say about national standards.

The Department for Education told me: “Results from the 2010 Key Stage 2 National Curriculum Tests will be published on 3 August. This publication will include any relevant commentary about the impact of industrial action on the results, including issues of representativeness.”

That suggests to me that the Government will offer some disclaimer next week, urging caution about how much can be read into this year’s results as indicators of national standards, because of the boycott. My number-crunching suggests one should be cautious, but perhaps not as cautious as I thought before looking a bit more closely.

*By “difficulty”, I mean not just how hard the questions are, but how easy or hard it is for a pupil of a given ability to achieve a certain level on a paper each year.

** I am using the word “ability” loosely here, to mean the overall likely capability of an 11-year-old to do well in the national tests. I realise that there are debates over whether ability is fixed from a young age, or, of course, whether test-taking ability reflects overall understanding of a subject.
