August 29, 2020
Brian Lightman is a School Leadership Consultant and former General Secretary of the Association of School and College Leaders. In this article, republished from Brian’s blog at https://brianlightman.wordpress.com/2020/08/21/how-did-we-get-here-and-what-next/, he explains how meaningful assessment has been undermined since 1988 and makes his own suggestions to address the needs of the 2020/21 student cohort.
Recently the top priority has rightly been to address the needs of the 2019/20 cohort following this summer’s examinations crisis.
At the same time we cannot afford to lose sight of the needs of the 2020/21 cohort of students. As with other effects of the pandemic (about which I have written elsewhere)1, it will be essential to step back and assess what needs to be done for them.
Given the growing likelihood of further disruption to the forthcoming school year, it appears that ministers continue to see exams as the only valid form of assessment. The only proposals that have come forward appear to amount to little more than tweaking at the edges of the current system. Failure to address this will set our schools and colleges up for another disaster next year.
In order to understand the perfect storm surrounding this year’s examination results, and to explore ways forward, it is necessary to consider how the current situation arose. Contrary to popular views, the pandemic was a trigger rather than the cause: the cancellation of exams brought the shortcomings of a dysfunctional system into sharp focus.
Let us explore this in more detail:
For some decades the assessment of learning in English schools has been undermined by its use for external accountability. This can be traced back as far as the 1988 Education Reform Act and has intensified at various stages since then.
Here are some of the steps of that process:
- With the introduction of open enrolment, local management and stronger governance, schools were incentivised to recruit as many pupils as they could and were publicly held to account for their outcomes. Few would argue that increased accountability was unnecessary. The problems arose largely from the unintended consequences and perverse incentives of the policy decisions that followed.
- Performance tables presented further levers to incentivise schools to focus on particular grade thresholds such as the C–D borderline at GCSE. These incentives led to the creation of qualifications which made those grades easier to achieve. High-profile celebration and praise by successive governments singled out schools which achieved high results in such qualifications. There were examples of schools with superficially stellar GCSE results whose students were completely unprepared for further study. Schools that resisted such qualifications found themselves penalised in performance tables and sometimes in inspections, which correlated closely with those indicators.
- This led to the discourse of ‘gaming’ and ‘grade inflation’. There were undoubtedly schools that chose particular qualifications to enhance their position in performance tables rather than for sound educational reasons. This should not have happened, but the lever was not one they designed. That discourse deflected blame away from government at a time when a culture of denigration pervaded the language of policymakers. Talk of ‘failing schools’ and ‘enemies of promise’, the consignment of many experts to the ‘blob’, the dissolution of bodies like the Qualifications and Curriculum Authority, and the asserted need to restore knowledge to a curriculum allegedly designed solely around skills all reinforced the message that rigour was being reintroduced. Ironically, the late decision by government this year to award the higher of the CAG or algorithm grade has led to some seriously inflated grades which schools would not have recommended and which students could not have achieved if the examinations had taken place.
- There is an elephant in the room here. A considerable number of members of the teaching profession have been complicit in the discourse I have described. Some have participated in a polarised characterisation of teachers as ‘traditionalists’ and ‘progressives’; others have allowed the unethical practice of a minority to become a generalised accusation of ‘gaming the system’; and that minority have let down the rest of their profession. Much of this has been a function of the levers pulled by the accountability system, including the vulnerability of senior leaders’ jobs. Both those working in our education service and those in government must rise to the challenge of moving on from this toxic culture.
- From 2010 the coalition government promised the reintroduction of ‘academic rigour’. Though there was no question that rigour had been lost from some courses which needed to be reformed, an assumption was made that academic rigour can only be assessed by examinations, that coursework and modular courses were the cause of the problem and needed to be eradicated rather than reformed, and that teachers could not be trusted to assess their students accurately. The fact that just about every university course relies on such approaches, and that many of the highest-achieving countries in the world rely on them, was quietly ignored.
- The reformed qualifications therefore relied entirely on assessment through final exams. Whereas previously assessment had taken place throughout the course, now everything depended on the summer examinations. If this had not been the case we would have had a much more reliable basis for judgements at the end of the course in 2020. A Level students who ended up with ungraded results and nothing to show for two years of work have been left particularly disadvantaged.
- In order to establish standards for the new examinations which were comparable with the achievement of previous cohorts, the ‘comparable outcomes’ system was put in place2. Legislation had been passed in 2009 under Labour3 to ensure that standards were maintained between years, and the comparable outcomes methodology was implemented from 2010 for new GCSEs, with a ‘standards advisory group’ formed by Ofqual to oversee this from 2012 onwards.
- The basic premise is that a grade achieved in one year should have the same currency as one achieved the year before4. The approximate profile of grades would only change if there was evidence of system-wide improvement, preventing ‘grade inflation’ from taking place. It is interesting to reflect on that term: whereas in England a rise in higher grades is referred to as grade inflation, other countries refer to it as improvement. To date no effective method of assessing such improvement has been found, and no change has been made to the proportions of grades since the reformed exams were introduced, even though a National Reference Test was introduced to this end. Each year we are told that standards have been maintained because, across the system, the overall percentages at key grades have been manipulated to remain stable.
- The flaws in this system are twofold. First, the method is a statistical one. It is not about the knowledge and understanding students have gained; the examinations are not criterion-referenced. This means that a grade 9 in Mathematics does not tell a prospective employer or post-16 tutor what the student knows and understands, but simply that the student was in the top X% of the cohort; an equally brilliant student in a stronger cohort may not have gained the same grade. We know that the reliability of exam marking is weak5, and appeals over recent years have meant that even when marks were upgraded the statistical process did not change grades. Second, that statistical process is largely predicated on ‘prior attainment’. For GCSE this means key stage 2 tests – i.e. tests sat on one day, in two subjects, when the pupil was 10. Anyone who has worked in a secondary school knows how much students can change over five years of schooling. Most significantly, an inbuilt characteristic of this system is that a given percentage of students – referred to in a seminal report published by ASCL as the ‘Forgotten Third’6 – will always fail to reach the standard pass of grade 4. The experience of the algorithm this year has brought this somewhat esoteric process into sharp focus and raised a level of public awareness which had not existed hitherto.
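The norm-referencing described above can be sketched in a few lines of code. This is a purely illustrative toy, not Ofqual’s actual comparable outcomes model: the grade labels, proportions and marks are all invented, and the real process also draws on prior-attainment data. What it demonstrates is the core point: grades follow the rank order and a fixed distribution, not the marks themselves.

```python
# Toy illustration of norm-referenced grading (NOT Ofqual's actual model).
# Grades are allocated by position in the rank order against a fixed
# distribution carried over from the previous year; the marks matter
# only for ordering, not for what they say about attainment.

def comparable_outcomes(marks, grade_shares_pct):
    """marks: raw exam marks; grade_shares_pct: (grade, % of cohort)
    pairs from the top grade down, summing to 100."""
    ranked = sorted(marks, reverse=True)        # rank order, best first
    n = len(ranked)
    grades, cum = [], 0
    for grade, pct in grade_shares_pct:
        cum += pct
        target = cum * n // 100                 # students at or above this grade
        grades += [grade] * (target - len(grades))
    return list(zip(ranked, grades))

shares = [("A", 20), ("B", 30), ("C", 30), ("D", 20)]   # invented profile
weak_cohort = comparable_outcomes([41, 38, 35, 30, 28, 25, 22, 20, 15, 10], shares)
strong_cohort = comparable_outcomes([95, 93, 90, 88, 85, 83, 80, 78, 76, 75], shares)
# Both cohorts receive an identical grade profile (A, A, B, B, B, C, C,
# C, D, D) despite their very different marks.
```

A criterion-referenced system would instead compare each performance against fixed descriptors of knowledge and skill, so a genuinely stronger cohort would earn more top grades.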
What appears to be holding ministers back from creating an alternative long-term plan for assessment is that such a plan would have some radical elements, including a major shift of culture, trust and the locus of accountability. It would require brave decisions by ministers and would return major professional responsibilities to the teaching profession. Instead of assuming the opposite, they could and should rightly expect the profession to rise to the challenge, with the support of high-quality professional development and courageous leadership by a united education community working in true collaboration with policymakers. That is the true moral imperative facing the teaching profession and government. The Chartered College and Ofsted could play a major part in this.
Though a long-term plan is needed to do justice to this major task, the experience of this year has demonstrated why we cannot wait and do nothing. The arrangements made this year are not sustainable in the future. At the time of writing we do not know whether it will be practical to run examinations in 2020/21, or the extent to which schools will be open all year for all students. Nor is it realistic to look backwards and expect to revert to the previous arrangements or to reintroduce another algorithm.
As we must make a start, what might a stepping stone look like for this year’s students? Here are some initial suggestions. I would propose these whether or not examinations take place next summer.
- Schools embed regular low-stakes assessment into all teaching, building up a picture of what students know and understand. The creation and implementation of appropriate assessments needs to be a major priority for professional learning in all schools.
- At regular points in the year schools set formal tests, allowing them to place their students into a rank order and to collate areas of relative strength and weakness in order to plan the next stages of teaching.
- The above could be supported by some standardised assessment tasks that enable schools to benchmark the knowledge and understanding of their students against an agreed standard. It is most important that the design of these tasks is not about jumping through the hoops of answering a particular question but a genuine assessment of students’ mastery of the content against clearly specified criteria. This would begin to restore a situation in which a clearer picture of what our students know, understand and can do emerges.
- At an appropriate stage of the year all of the above data could be collated into a summative, evidence-based professional judgement about the overall attainment of the students.
- This should be moderated within school by a rigorous process involving all members of a subject department and senior leaders, all of whom should have a strong shared understanding of the department’s strengths and weaknesses and knowledge of its classroom practice.
- There could be a degree of external verification conducted by moderators, who could be Chartered Assessors with expert knowledge of assessment. Their job would be to validate the school’s processes and help to achieve a degree of consistency across the system. Where a school’s grades appeared out of kilter with previous performance, this would be scrutinised. This is where the accountability could sit. There is a cost to this, and I would strongly advocate that government moves rapidly to put it in place.
- At the end of this process a Centre Assessed Grade would be generated. Whether or not examinations take place, grades produced would have a far greater degree of reliability when underpinned by a well-planned process of this kind.
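To make the aggregation step concrete, here is a minimal sketch of how the evidence streams above might be combined into a provisional Centre Assessed Grade. The component names, weights and grade boundaries are entirely hypothetical – in practice they would be set through the consultation and standardisation process described – but the shape is simple: a weighted score built from the year’s evidence, mapped onto agreed boundaries.

```python
# Hypothetical sketch: combining a year's assessment evidence into a
# provisional Centre Assessed Grade. Component names, weights and
# boundaries are invented for illustration, not an agreed scheme.

def centre_assessed_grade(evidence, weights, boundaries):
    """evidence: component -> % score; weights: component -> weight;
    boundaries: (grade, minimum weighted score) pairs, highest first."""
    weighted = sum(evidence[c] * w for c, w in weights.items())
    total = weighted / sum(weights.values())    # normalise to a % score
    for grade, cutoff in boundaries:
        if total >= cutoff:
            return grade, total
    return "U", total                           # below the lowest boundary

weights = {"low_stakes": 0.2, "formal_tests": 0.4, "standardised_tasks": 0.4}
boundaries = [(9, 90), (8, 80), (7, 70), (6, 60), (5, 50), (4, 40)]
student = {"low_stakes": 72, "formal_tests": 65, "standardised_tasks": 70}
grade, score = centre_assessed_grade(student, weights, boundaries)
```

A moderation panel would then review grades produced in this way against the underlying evidence before any external verification.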
It is worth noting that none of the above is rocket science; most of it reflects existing best practice. For it to be effective there is no time to be lost: this approach has to start in schools from the beginning of the autumn term, alongside planning and detailed consultation between the DfE, teachers and school leaders on the standardisation procedures.
It would make no sense to compare the profile of the 2020/21 grades with those of previous years, or to place artificial constraints on the proportion of students achieving each grade. Much has been written about the risks arising from that process. Even before the pandemic the cracks were evident in a system built on retaining comparable percentages of grades from year to year. We really need to start with a blank sheet, so that schools are not constrained by a straitjacket or a postcode lottery, and every teacher and school leader has to rise to this challenge.
I don’t expect every reader to agree with my analysis or to regard my proposals as a perfect solution. What I hope, however, is to prompt some discussion, in the fervent hope that we do a better service to our young people during the next academic year.
It is worth noting that this comparability related to the overall statistics rather than to the relative difficulty of individual subjects. This has led, for example, to a lengthy debate about the ‘severe grading’ of modern foreign languages.
A vast amount of research has been undertaken on this. A good summary of the issues, dating back to 1996, is available at https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/604938/0596_Wilmut_et_al_A_review_of_research_into_the_reliability_of_examinations.pdf