“A lot of catastrophising:” Standardised tests don’t show decline

A University of New England researcher has collated 25 years' worth of Australian students' test results and found there has been no significant decline in three out of the four major tests students take.

Since the year 2000, when Australia began participating in PISA tests (which are held every three years), students' reading and maths literacy results have steadily declined.

This is the same story for science literacy, with the exception of 2022's results, which displayed a slight uptick in science knowledge.

However, the latest Programme for International Student Assessment (PISA) results placed Australian students fourth best in the world for creative thinking skills behind only Singapore, South Korea and Canada out of 81 countries.

The Organisation for Economic Co-operation and Development (OECD), which administers the test, said in a statement that Australian students performed "better than expected" after its previous poor results in mathematics and reading.

Expert in educational assessment Dr Sally Larsen found that the narrative of students' declining ability to attain critical academic skills is not supported by large-scale assessment data from the National Assessment Program – Literacy and Numeracy (NAPLAN), Progress in International Reading Literacy Study (PIRLS) and Trends in International Mathematics and Science Study (TIMSS) tests.

“Ask almost anyone how Australian students are going in tests of basic skills, and the perception is that results are getting progressively worse,” Dr Larsen said.

“In reality, student achievement is only declining in the PISA assessments.”

PISA tests how students apply knowledge in many areas of learning, and does not specifically test the content in the Australian curriculum, unlike the other three tests.

Australian PISA results for maths, reading and science since 2000. Supplied: Sally Larsen

The 2022 data, which was delayed one year due to the Covid-19 pandemic, shows Australian students scored 487 in maths, 498 in reading and 507 in science.

Australian students actually performed above the OECD average in maths, reading and science in 2022 (the OECD averages were 472, 476 and 485 respectively). The proficient benchmark for Australian students in PISA tests is Level 3.

Dr Larsen's research also found that, overall, Australian students have made progress in literacy and numeracy in NAPLAN since the start of the tests in 2008.

Time series showing NAPLAN mean test scores for Years 3, 5, 7 and 9 at all calendar years assessed. Note: NAPLAN tests were not undertaken in 2020. Supplied: Sally Larsen

“Across the three other standardised tests ... what we can see in the data is improvement in results for primary school-aged children, and no real changes in average results for secondary students over 25 years.

Senior Lecturer in UNE's School of Education Dr Sally Larsen. Picture: Supplied

“If we’re serious about making improvements to the outcomes measured by standardised tests, we have to have an accurate and unbiased understanding of all assessment data, and not jump to conclusions based on only one test.

“Policymakers need to be careful about making recommendations for teaching practices to change that are based solely on selective reporting of standardised assessment results. A more complete understanding of progress in student achievement is possible if the results of all major assessments are considered together.”

Why do we hear about declining results in Australian students?

Reports that only about half of students are proficient in reading and maths, and 40 per cent in science, are somewhat inaccurate, Dr Larsen told Education Review.

“Standardised assessments can also be far removed from what students are learning in the classroom, and are often measuring something very different to in-school assessment. This seems particularly true of the PISA tests,” she said.

"The fact is, it's quite difficult to interpret educational assessment data, particularly when it is collected in the population, or in large, representative samples.

"It is difficult to know when a change in averages can be considered large, or meaningful. There will be change in the average each time students sit these exams simply due to population variability – a natural phenomenon whereby the average goes up or down marginally. This sort of change should not be over-interpreted."
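The point about population variability can be illustrated with a small simulation. The sketch below is not drawn from Dr Larsen's paper: the population mean of 500, standard deviation of 100, cohort size of 2000 and the function name `simulate_cohort_means` are all hypothetical, and real assessments such as PISA use more complex sampling designs. It draws several cohorts from a population whose true average never changes and shows that the observed averages still move slightly from one cycle to the next.

```python
import random
import statistics


def simulate_cohort_means(true_mean, true_sd, cohort_size, n_cohorts, seed=0):
    """Draw n_cohorts independent samples from the SAME population
    and return the observed mean score of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_cohorts):
        scores = [rng.gauss(true_mean, true_sd) for _ in range(cohort_size)]
        means.append(statistics.fmean(scores))
    return means


# Hypothetical numbers chosen for illustration only.
means = simulate_cohort_means(true_mean=500, true_sd=100,
                              cohort_size=2000, n_cohorts=8)
spread = max(means) - min(means)

for cycle, mean_score in enumerate(means, start=1):
    print(f"cycle {cycle}: mean = {mean_score:.1f}")
print(f"range across cycles: {spread:.1f} points")
```

Under these assumptions the cycle-to-cycle differences amount to a few score points even though nothing about the underlying population has changed, which is the kind of movement that should not be over-interpreted.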

She said test developers changing how students are marked and ranked, as they often do with no ill intent, also changes how we perceive students' results.

"It is entirely possible for the population achievement to be on average with the OECD and to have some students in the lower proficiency bands and other students in the upper proficiency bands," Dr Larsen explained.

"In Year 9 NAPLAN Reading in 2022, about 10 per cent of students did not meet the minimum standard in that year.

"Interestingly, the system of using bands to represent levels of student achievement was changed in 2023, meaning that students are now grouped into four categories rather than six. Because there are fewer categories, there are now more students in the bottom two proficiency levels.

"Does all of this mean that Year 9 students’ reading proficiency has precipitously declined between 2022 and 2023? Not at all! It just means that the test developers changed the categorisation of groups of students."
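The effect of re-banding can be shown with a toy example. The scores and cut points below are invented for illustration and are not NAPLAN's actual scale or band boundaries; `band_counts` and `share_in_bottom_two` are hypothetical helper names. The same set of scores is grouped first into six bands and then into four, and the share of students falling in the "bottom two" bands is compared.

```python
from bisect import bisect_right
from collections import Counter


def band_counts(scores, cut_points):
    """Count scores per band. Band 0 is the lowest; a score equal to a
    cut point is placed in the higher band."""
    return Counter(bisect_right(cut_points, s) for s in scores)


def share_in_bottom_two(scores, cut_points):
    """Fraction of scores that fall in the two lowest bands."""
    counts = band_counts(scores, cut_points)
    return (counts[0] + counts[1]) / len(scores)


# Hypothetical data: 100 evenly spaced scores from 400 to 697.
scores = list(range(400, 700, 3))

# Hypothetical cut points (NOT the real NAPLAN boundaries).
six_band_cuts = [450, 500, 550, 600, 650]  # 5 cuts give 6 bands
four_band_cuts = [475, 550, 625]           # 3 cuts give 4 bands

six = share_in_bottom_two(scores, six_band_cuts)
four = share_in_bottom_two(scores, four_band_cuts)

print(f"bottom two of six bands:  {six:.0%}")   # 34%
print(f"bottom two of four bands: {four:.0%}")  # 50%
```

No student's score changes between the two groupings. The share in the bottom two bands rises only because each of the four bands covers a wider slice of the score range than each of the six did.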

Dr Larsen said disaster narratives make good media stories and that governments can use 'declining' standardised test results to mandate that explicit teaching should be used in the classroom at all times.

"This has allowed a lot of catastrophising about percentages of students in the bottom two proficiency categories, and not a lot of thought about what these categorisations mean and whether they are a useful way of interpreting students’ reading achievement and ability," she said.

"Expectation effects and cognitive biases must play a role in these types of public narratives. If we expect worse and worse student performance over time we’re more likely to notice results that align with our expectations or preconceived opinions, and less likely to notice results that don’t fit what we expect."

Many teachers and education researchers have criticised policy makers and curriculum developers for walking away from whole language in reading, and for pushing explicit teaching and instruction in most areas of schooling.

"I think teachers are right to be hesitant or sceptical about these types of mandates. If the theory I propose in the paper turns out to be correct, a greater focus on explicit teaching and basic skills will result in continued decline of PISA results, rather than improvements. [This is] because PISA assesses kids' abilities to think creatively and apply knowledge, and there will be fewer opportunities in classrooms to practise these types of skills if all teachers are mandated to explicitly teach content the whole time," she said.

"On the other hand, NAPLAN will likely continue the same, or improve, but the focus won’t be on this assessment because everyone will see the poor results in PISA and carry on catastrophising about that assessment in isolation."

Dr Larsen said curriculum makers should be wary about the impact this move would have, not only in test performance, but also in general learning.

"For example, in secondary English, the aim is not to explicitly teach students ‘the answer’ – and indeed English subject teachers work hard to develop their students' ability to think for themselves, and not simply repeat back ‘the answer’ that the teacher has given them," she said.

"If subject English moves more towards explicit literacy instruction, rather than the study of texts and ideas, then it’s entirely possible that students will continue to achieve poorly in PISA.

"If the focus of instruction in schools is on basic skills, then that is what you get out at the other end."



  1. Edmund Esterbauer

    Visual learning is currently devalued using explicit teaching approaches. This is due to statistical discrepancies of the effect size resulting from limitations to Cohen’s d which standardises and distorts sample sizes by giving samples equal weights despite big differences in sample sizes. The effect size is understated.

    Humans are visual learners and there are many neurological studies which emphasise the role of the brain’s visual cortex and its evolutionary adaptations to interpret signals from the environment. People are ‘hard-wired’ for visual learning.

  2. The PISA testing is inherently flawed. Which children are chosen to be tested? What time of day are they tested? How much testing burn-out are they experiencing at that actual time? Do they care about the performing of the test or is it simply something they have to do? What are the children’s cultural backgrounds? So many aspects of the psychological are not taken into consideration. There needs to be a re-examination of the idea of testing itself and if there can actually be a test that compare
