Opinion

Do graduates and ratings really make no difference?

Following a study claiming that graduate staff and Ofsted Outstandings have little impact on children’s outcomes, a group of early years academics take issue with its methods

Comparing evidence

By Kathy Sylva, Pam Sammons, Iram Siraj, Brenda Taggart, Sandra Mathers, Edward Melhuish – University of Oxford Department of Education, and University College London Institute of Education

In February, the London School of Economics (LSE) published an investigation into the effects of pre-school on children’s development (Quality in Early Years Settings and Children’s School Achievement). It contradicts many studies that show pre-school quality and early experiences have significant positive effects on outcomes, and has been seized on by the media and those seeking to keep down staffing costs.

Like many, we were surprised by its counter-intuitive conclusions. We have identified serious limitations in conceptualisation and research design/methods, which mean the paper’s conclusions and apparent policy implications should be treated with extreme caution. This said, we welcome opportunities to highlight the ways that variation in methods may lead to differences in ‘conclusions’.

RESEARCH METHODS

The paper combines data from the English National Pupil Database and the Early Years’ Census, and analyses the information on about 1.8 million children born between September 2003 and August 2006. This large number is the paper’s unique selling point, but it comes at the price of weak and, in our judgement, flawed measurement – especially in comparison with other research designed to study the fine-grained effects of pre-school, in which ‘process’ quality is evaluated via detailed observational instruments linked with children’s development, precisely measured through age-appropriate psychometric tests.

The LSE uses data to link the development of children (aged five and seven) with the quality of their pre-schools, measured only by the child’s access to a degree-level teacher and by Ofsted inspection ratings.

The paper’s main outcome measure is the child’s total score on the teacher-assessed EYFS Profile (EYFSP) at the end of Reception. In the period relevant to this study, the EYFSP comprised 1-9 ratings against different assessment scales: communication; language and literacy; numeracy; personal, social/emotional development; knowledge/understanding of the world; physical and creative development. The LSE paper simply uses the broad-brush total score rather than focusing, like others, on the fine detail of distinct outcomes.

The paper uses information in the PVI census to estimate the qualified staff/child ratio, assuming that all children in the maintained sector have a qualified teacher.

The LSE uses Ofsted early years inspection data (2005-2011), matching the child and their setting to judgments closest to the time they attended – even though the inspection may have taken place six years before or one year after the child left. This is muddied by the inspection changes in 2008. In 2005-2008, PVI settings were inspected on different criteria from the maintained sector, but in 2008-2011, all providers had the same inspection with a single overall judgment.

THE PAPER’S MAJOR LIMITATIONS

• It has weak outcome measures. Despite attempts to ensure consistency, problems are inevitable when considering assessments made by thousands of teachers. Other research mitigates this by using a small number of researchers applying stringent assessments (e.g. British Ability Scales). Because the EYFSP is conducted at the end of Reception, it measures the combined effect of pre-school and Reception. It therefore cannot isolate the effect of the child’s time in pre-school alone. Other studies tackle this by collecting outcome measures at the end of pre-school or immediately on entry to Reception. This avoids confounding pre-school and school influences.

• With no pre-school entry baseline, the pre-school ‘value added’ cannot be calculated. This is crucial because children, by age three, already have different developmental levels. The paper only considers a child’s profile at the end of Reception, not their developmental characteristics when entering the system. This is a major weakness: other studies avoid this by taking several pre-school baseline measures, e.g. language.

• The Ofsted measures of quality have the potential to be up to six years adrift. Given this time lapse and interim changes, Ofsted ratings may inaccurately reflect quality. Other studies have used more differentiated measures (e.g. the Early Childhood Environment Rating Scales (ECERS)) taken when the children in the statistical analysis were present in the settings. Ofsted judgments have a low correlation with observational measures (e.g. ECERS R-E), with English research showing a consistently poor relationship between Ofsted ratings and children’s development.

• For decades, research has shown that family background has the largest influence on children’s development. The paper uses only two coarse measures for this: free school meals and neighbourhood disadvantage via postcodes. Other research has collected parents’ qualifications, salary, family size, etc, and the home learning environment (HLE). These studies take into account the effect of background and the HLE before estimating the effects of pre-school quality, and so have a stronger control for parental support, which is vital in influencing outcomes later on. Their designs are therefore better at demonstrating pre-school effects ‘net’ of background.

• The paper uses inappropriate statistical approaches to identify school and pre-school effects: they ignore the clustered nature of the data at each level.

SUMMARY

Given these serious limitations, it is unsurprising that the paper finds quality has only a small effect. The authors acknowledge ‘substantial differences’ in pre-schools in their apparent effect on development, with a large gap between the outcomes of settings in the top and bottom quartiles. They conclude that their predictors (teacher presence, Ofsted judgments) may not be the best ones to explain why some settings boost children’s development while others do not.

We conclude that the limitations of the data and design mean that the authors are unable to provide reliable results on pre-school effects and therefore underestimate the effects of pre-school quality and staffing. We believe there are grave risks in using this evidence for policy-making. A profound understanding of the effects of quality in early childhood requires precise/differentiated measures at several time points, coupled with strong control for social background and observations of the pedagogical strategies experienced by young children in their settings.