An introduction to educational research methods

Date: 06.03.2017


Non-experimental

In this approach, you don’t do your own experiment; you can’t control any variables. You

just look at what’s already happened and try to understand it. This means that you take

the variable showing an effect (the dependent variable) and try to find the variable(s) (the

independent variable(s)) which could have a causative relationship with it.

This approach is very common in education, and is also useful when ethics preclude

an experimental approach. For example, it would not be ethical to run an experiment

deliberately subjecting pupils to something detrimental to their learning, but studying

the outcomes of such a situation (which already exists) would be perfectly ethical. Two

examples may help make things clear.

At a particular school, the governors suddenly realized that the proportion of pupils gaining five GCSEs at A–C (‘5 A–C’) had been dropping over the past 15 years. At their meeting, they came up with a number of variables which could be responsible. They found that as the number of teachers at the school went down each year, so did ‘5 A–C’ (we can say that ‘5 A–C’ correlates significantly with the number of teachers at the school – you’ll learn more about correlations in Chapter 12). One of them suggested an explanation: the presence of fewer teachers leads to less variety of ideas for teaching and learning, and hence the drop in attainment.
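To make the idea of a correlation concrete, here is a minimal sketch of how a correlation coefficient could be calculated. The yearly figures are invented purely for illustration (they are not the school's data), and the helper name `pearson_r` is my own; testing whether the correlation is significant is a separate step not shown here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical figures for six successive years (assumption, not real data).
teachers = [60, 58, 57, 55, 52, 50]
five_a_to_c = [62.0, 61.0, 59.5, 58.0, 55.0, 54.0]

r = pearson_r(teachers, five_a_to_c)
print(f"r = {r:.2f}")  # a value close to +1 means the two fall together
```

Note that the calculation is symmetrical: swapping the two lists gives the same value of r, so the coefficient on its own says nothing about which variable (if either) is the cause.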

Taking A Quantitative Approach


But was the change in teacher numbers the cause of the change in ‘5 A–C’? Look at Figure 11.1. Actually, it may be the other way around (reverse causation). For example, the ‘5 A–C’ may have dropped one year, and encouraged parents who are interested in academic success to send their children elsewhere. With fewer pupils, the numbers on roll go down and the same number of teachers is no longer required.

Let’s look at a different example, which tries to explain observed differences between

two groups. The head teacher at school A knows that different tutor groups have different

levels of truancy. He uses a statistical test to confirm that tutor groups led by women have

less truancy than those led by men. Is he justified in concluding that female form tutors are

better?

Not necessarily. At this school, the extent of truancy happens to vary with age (older children truant more), the Year 9 tutor team is composed mainly of women and the Year 11 tutor team is composed mainly of men. Hence, we have a third variable – year group – which affects both truancy levels and the sex of the tutor (Figure 11.2), creating a plausibly causal, but in fact artifactual, relationship between a tutor’s sex and truancy levels.

So it is important to realize that finding a relationship does not mean that it is a causal relationship, and that causation could happen in either direction. In fact, sometimes a relationship may be pure coincidence – there is a proven relationship between stork population size and human birth rate!

To some extent, you can control the effect of potentially confounding variables by how

you choose your sample, and by what you measure. For example, the head teacher above

could do the following:

• Randomly choose, and compare, equal numbers of male- and female-led tutor groups

from each year group. This does reduce the sample size, but removes the effect of age.

• Compare multiple pairs of tutor groups, matched as closely as possible by the children’s socio-economic status and other potentially relevant variables, with the only difference being the tutors’ sex.

• Decide which variables are likely to affect truancy levels, and measure them at the

same time as collecting the rest of the data. Some clever statistics (beyond the scope

of this book) can remove the effect of the extra variables, and ‘uncover’ any effect of

the tutor’s sex.
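The first strategy above (comparing male- and female-led groups within each year group) can be sketched as follows. The tutor groups and truancy rates are invented for illustration, and the function name `mean_truancy` is hypothetical. For simplicity the sketch compares all groups within a year rather than a random selection of them.

```python
from statistics import mean

# Hypothetical tutor groups: (year group, tutor's sex, truancy rate in %).
# Year 9 is mostly led by women and Year 11 mostly by men, as in the example.
groups = [
    (9, "F", 2.0), (9, "F", 3.0), (9, "F", 2.5), (9, "M", 2.5),
    (11, "M", 6.0), (11, "M", 7.0), (11, "M", 6.5), (11, "F", 6.5),
]

def mean_truancy(rows, sex, year=None):
    """Mean truancy for tutors of one sex, optionally within one year group."""
    rates = [r for (y, s, r) in rows if s == sex and (year is None or y == year)]
    return mean(rates)

# Pooling all groups: male-led groups appear to have more truancy.
pooled_gap = mean_truancy(groups, "M") - mean_truancy(groups, "F")

# Comparing like with like, year group by year group.
gap_by_year = {
    year: mean_truancy(groups, "M", year) - mean_truancy(groups, "F", year)
    for year in (9, 11)
}

print(pooled_gap)   # 2.0 percentage points
print(gap_by_year)  # {9: 0.0, 11: 0.0}
```

In this made-up data set the whole of the pooled difference comes from the year group: once you compare within Year 9 and within Year 11, the apparent effect of the tutor's sex disappears.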


Figure 11.1 Making sense of a correlation between two variables

(Diagram: either Variable A (number of staff) causes a change in Variable B (‘5 A–C’), or Variable B causes a change in Variable A.)


Even with these strategies, it is still difficult to know you’ve accounted for all potentially confounding variables. As such, your interpretation of the data is vulnerable to your own biases, and it is often possible to come up with contradictory explanations for the same set of data. You should discuss such explanations when you write up your research, pointing out the limitations of your derived ideas. Using other sources of data to shed more light on such alternatives would also be sensible. In an ideal world, explanations and hypotheses would be tested by experiment at a later date, but that’s often not possible. The benefits and potential pitfalls of the non-experimental approach are summarized in Box 11.5 and Box 11.6.

SCHOOL-BASED RESEARCH


Figure 11.2 Making sense of a correlation between three variables

(Diagram: there is a relationship between Variable A (tutor’s sex) and Variable B (truancy level); a change in Variable C (year team) causes a change in Variable A and a change in Variable B.)


BOX 11.5 Benefits of a non-experimental approach

1 It is the ‘next best thing’ where an experimental approach is impossible.

2 It is good for initial exploration of a data set and for generating hypotheses,

which can be experimentally tested later.

3 It avoids using an artificial intervention, examining only ‘natural’ variation in the data.


LOOKING AT OTHER PEOPLE’S DATA

Looking at data with a healthily sceptical eye is always a good idea! In the rest of the chapter, I’ll help you to examine the story of pupil and school performance data.

League tables

When league tables were first introduced in 1992, exam results were used to decide if a school was effective – on the assumption that better results mean better schooling. Ofsted based their assessment of school performance on raw exam results, and teachers were (and are) expected to use pupils’ attainment data to support their own promotion applications. So what’s the problem?

BOX 11.6

Pitfalls of a non-experimental approach

1. It is difficult to control variables or get truly random samples.

2. It is difficult to identify all the variables (including the one(s) which actually has

a causative effect) that are potentially relevant to an observed phenomenon.

3. Cause and effect are not always obvious; it is easy to assume one variable

causes another if it fits in with your prior ideas.

Activity 9.1

Many commentators have criticized ‘exam results’ as not being a valid measure

of school effectiveness, and suggested that they should not be used to compare

schools. Think about the following questions.

• Why do ‘exam results’ only give a limited picture of school ‘effectiveness’?

• What other variables could affect exam results, which are unrelated to the

school itself?

• Suggest reasons why it is not valid to reward secondary science teachers on

the basis of their classes’ exam results.


Value-added

In 2004, the Department for Education and Skills published a value-added measure; it

was billed as a more valid measure of the ‘value’ that a school ‘adds’, taking into account

prior attainment. The aim was to help parents make valid comparisons between schools.

For primary schools, data were reported in relation to a figure of 100.

Let’s look at how they worked out value-added between Key Stage 2 and Key Stage 3,

using SATs results as a measure of attainment in each case, but comparing pupils with

similar levels of prior attainment. Here’s a simplified explanation of it:

• All the pupils in the country who performed very similarly (e.g. within half a level) at

Key Stage 2 are put in order of numerical result for their Key Stage 3 results.

• For each set of pupils, the median result (the middle one) is found, and every pupil is given a value-added score relative to that result. If someone does better than the median pupil, they get a positive score. If someone does worse, they get a negative score.

• Hence, for schools with a low-attaining intake, it should be easier to see the ‘value’ they

are ‘adding’, even if their Key Stage 3 results are much lower than, for example, another

school which selects on prior attainment.

Activity 9.1 (Continued)

Hopefully, you’ve realized that there’s an enormous number of variables which

could affect exam results. Research suggests that 76 per cent of variation in

GCSE scores is determined by pupils’ prior attainment (which itself is related to

their own prior education and socio-economic characteristics). This means that

the influence of a secondary school on GCSE scores is pretty limited, and the

best way for schools to improve

Activity 9.2

Explain why this ought to be a more valid measure of school effectiveness, and

what you think it still lacks.


To assess the whole school’s performance, you average the results of all the pupils in the

school and add the result to 100. Hence, a score of 100 reflects ‘average’ value-added, 98

reflects ‘below average’ value-added and 102 reflects ‘above average’ value-added.
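The simplified value-added calculation described above can be sketched in a few lines. This is only an illustration under my own assumptions: the pupils and results are invented, prior attainment is reduced to two labelled bands rather than 'within half a level', and the function name `value_added_by_school` is hypothetical. It is not the Department's actual procedure.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical pupils: (school, Key Stage 2 band, Key Stage 3 result).
pupils = [
    ("A", "low", 31), ("A", "low", 33), ("B", "low", 27), ("B", "low", 29),
    ("A", "high", 45), ("A", "high", 47), ("B", "high", 41), ("B", "high", 43),
]

def value_added_by_school(rows):
    # Step 1: gather Key Stage 3 results for pupils with similar prior
    # attainment, and find the median result for each band.
    by_band = defaultdict(list)
    for _school, band, ks3 in rows:
        by_band[band].append(ks3)
    band_median = {band: median(results) for band, results in by_band.items()}

    # Step 2: each pupil's score is their result relative to the band median
    # (positive if above the median pupil, negative if below).
    by_school = defaultdict(list)
    for school, band, ks3 in rows:
        by_school[school].append(ks3 - band_median[band])

    # Step 3: average the pupil scores within each school and add 100.
    return {school: 100 + mean(scores) for school, scores in by_school.items()}

scores = value_added_by_school(pupils)
print(scores)  # {'A': 102.0, 'B': 98.0}
```

With these invented figures, school A's pupils sit on average two points above the median for pupils with similar prior attainment, and school B's two points below, whichever band they started in.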

Activity 9.3

A local journalist is looking at the value-added scores for the schools in her

county. She realizes that half of the schools have value-added scores below 100

and publishes the story with the following headline: ‘Scandal! Letting down our

children – half of Smedshire’s schools are below average’. What is wrong with

this headline?

But how do you interpret these scores? If the average (mean) value-added

score for a school is a long way below the mean value (100) for all the schools,

then you may be pretty confident that there’s a problem. If it’s really way above

the mean, then you know something is going right. But how far apart do schools’

scores need to be to know that they are really significantly different?

Activity 9.4

Look at Figure 11.3. For each school, it shows the mean and confidence intervals.

Which two of the following is it possible to say with 95 per cent confidence?

• School A is significantly better than school B.

• School D is significantly worse than school B.

• School C is significantly better than the national average.

• There is a significant difference between school A and school C.

Statisticians work this out by calculating something called confidence limits above and

below the mean for each school. This is often not quoted in the media, but without it, you

can’t interpret league tables properly.

• You can only say that two schools really have significantly different value-added scores if

the confidence intervals (between the upper and lower limit) do not overlap.

• You can only say that a school is above the national average if the school mean is

above the national mean, and if the school confidence interval does not include the

national mean.
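The two rules above translate directly into simple checks. The sketch below uses invented school figures and hypothetical function names; it applies the rules exactly as stated here and does not show how the confidence limits themselves are calculated.

```python
def intervals_overlap(a, b):
    """True if two (lower, upper) confidence intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

def significantly_different(a, b):
    """Rule 1: two schools differ only if their intervals do not overlap."""
    return not intervals_overlap(a, b)

def significantly_above(school_mean, interval, national_mean=100.0):
    """Rule 2: the mean is above the national mean and the interval excludes it."""
    lower, upper = interval
    return school_mean > national_mean and not (lower <= national_mean <= upper)

# Hypothetical schools: mean value-added score and (lower, upper) limits.
school_x = (101.5, (100.8, 102.2))
school_y = (100.4, (99.6, 101.2))
school_z = (99.0, (98.5, 99.5))

print(significantly_different(school_x[1], school_y[1]))  # False: intervals overlap
print(significantly_different(school_x[1], school_z[1]))  # True: no overlap
print(significantly_above(*school_x))                     # True
print(significantly_above(*school_y))                     # False: interval includes 100
```

School Y shows why the limits matter: its mean is above 100, but because its interval includes the national mean you cannot say it is above average.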


However, it gets even more complicated. The way in which confidence limits are calculated means that:

• if you calculate the mean and confidence interval for a small school with few pupils,

the limits are likely to be relatively far apart

• if you calculate the mean and confidence interval for a school with a lot of pupils, the

limits are likely to be relatively close together.

These differences make it difficult to validly compare schools of different sizes, and even

more difficult when comparing particular subjects or classes with even fewer pupils.
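To see why school size matters, here is a rough sketch assuming one common way of estimating 95 per cent confidence limits for a mean: the mean plus or minus 1.96 standard errors, where the standard error is the standard deviation divided by the square root of the number of pupils. The chapter does not say which formula is used for published scores, so treat this formula, the function name `ci_95` and all the figures as assumptions.

```python
import math

def ci_95(mean, sd, n):
    """Approximate 95% confidence limits for a mean (normal approximation)."""
    half_width = 1.96 * sd / math.sqrt(n)
    return (mean - half_width, mean + half_width)

# Two hypothetical schools with the same mean score and the same spread of
# pupil scores, differing only in the number of pupils.
small_school = ci_95(mean=100.5, sd=5.0, n=50)
large_school = ci_95(mean=100.5, sd=5.0, n=2000)

print(small_school)  # limits roughly 1.4 either side of the mean
print(large_school)  # limits roughly 0.2 either side of the mean
```

Both schools have the same mean, but only the large school's interval excludes the national mean of 100, so only the large school could be described as significantly above average.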

Figure 11.3 Understanding confidence limits 1 (chart of each school’s mean and confidence limits, plotted against the national average)

Figure 11.4 Understanding confidence limits 2 (chart of each school’s mean and confidence limits, plotted against the national average)


Contextual value-added

Using value-added scores still assumes that all changes in pupil performance are determined by what happens in the school. But what about the other variables that could affect performance? Is it possible to tease apart their effects from the effects of the school?

Initial attempts to do this adjusted for the number of pupils on free school meals as an index of socio-economic status. Then in 2006 the government developed a more complicated statistical model. They tried to identify relevant variables, and remove the effect of each, one by one, using clever statistics called multi-level modelling, to provide a way of predicting ‘expected’ attainment for each pupil. The deviation from that ‘expected’ attainment then provides a contextual-value-added score (CVA). The variables included in calculating the model can change year-on-year, but commonly have included most of the following:

• pupil prior attainment

• gender

• Special Educational Needs

• first language

• measures of pupil mobility

• age

• an indicator of whether the pupil is ‘in care’

• ethnicity

• Free School Meals

• Income Deprivation Affecting Children Index (IDACI)

• the average and range of prior attainment within the school (KS2–3, KS2–4 and KS3–4

only).
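The core idea, predicting an 'expected' attainment for each pupil and scoring the deviation from it, can be illustrated with a much simpler stand-in. The sketch below fits an ordinary straight line using a single variable (prior attainment) to invented data; it is not multi-level modelling and not the government's CVA model, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical pupils: (school, prior attainment, later attainment).
pupils = [
    ("A", 20, 41), ("A", 30, 51), ("A", 40, 61),
    ("B", 20, 39), ("B", 30, 49), ("B", 40, 59),
]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def deviation_scores(rows):
    """Mean deviation from 'expected' attainment for each school."""
    intercept, slope = fit_line([p for _, p, _ in rows], [a for _, _, a in rows])
    by_school = {}
    for school, prior, actual in rows:
        expected = intercept + slope * prior  # prediction from prior attainment
        by_school.setdefault(school, []).append(actual - expected)
    return {school: mean(devs) for school, devs in by_school.items()}

scores = deviation_scores(pupils)
print(scores)  # {'A': 1.0, 'B': -1.0}
```

Here every pupil in school A does one point better than pupils with the same prior attainment are expected to, and every pupil in school B one point worse. A fuller model would add further variables from the list above in the same spirit, each one changing what counts as 'expected' for a given pupil.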

This sounds like the ‘holy grail’ for assessing school effectiveness, but there are some problems with this model in particular. The way in which the CVA model handles missing data could generate bias, and some variables may also interact to affect CVA in nonlinear ways (e.g. income deprivation may affect the attainment of younger pupils more than that of older pupils). That said, CVA is just one of a number of value-added measures which have emerged commercially since 1991, including Fischer Family Trust scores, and models derived by the Centre for Educational Management (CEM) at the University of Durham (examples include MIDYIS, Yellis and ALIS). There are differences in the population data used to derive these models, and again in some of the variables included, but each is based on a CVA-type approach.

Activity 9.5

Look at Figure 11.4. It shows mean and confidence intervals for three schools. School A has 2000 pupils, school B has 500 pupils, and school C has 2000 pupils. Which of the following can you say with 95 per cent confidence?

• School A is significantly below the national average.

• School B is significantly above the national average.

• School C is significantly above the national average.

• There is no significant difference between school A and school B.

• There is no significant difference between school B and school C.

• There is no significant difference between school A and school C.

The data provided by these kinds of models have been used as part of accountability

procedures (e.g. via Ofsted), to direct school improvement planning (e.g. for schools,

teachers or pupils), and by parents when choosing a school for their child. You’ll probably

recognize many of the names given above. Indeed, Ofsted compelled schools to

show evidence that they use CVA data in school improvement, providing it to schools

via an online platform called RAISEonline, and CVA data are published for access by

parents. However, using the same data, and hence the same model, for each of these

purposes can be problematic, and the variables to be included in such models may need

to differ, depending on the use each model is put to.

For example, if CVA is to be a predictor of future attainment (as it is for students in

many of the CEM models), then every factor should be included in the model to

make such a prediction as accurate as possible. On the other hand, if CVA is to be

used as a focus for school improvement, then it is more sensible to include only

those variables which the school actually has control over. Likewise, if CVA is to be

used for accountability, then all stakeholders should be able to understand the meaning

of the indices produced, and hence complex multi-level modelling like this may

be inappropriate.

Given the above, the apparent precision and ‘authority’ of value-added data like CVA or Fischer Family Trust scores can mislead parents, teachers and children into trusting such data unquestioningly. There are two good examples of when this becomes problematic with respect to how parents interpret CVA data.

1. We’ve already seen how parents can assume that a greater value-added score implies a better school, without looking at the confidence intervals. However, given that CVA models use data from pupils who have been in the school for, say, the last seven years, they may not actually provide insights for parents into future school effectiveness (after all, if your child is just starting in a school you want to know how it will perform in seven years’ time). Instead, they reflect the performance of the school over previous years, and do not take into account any uncertainty associated with predicting school performance into the future. When statisticians introduce such prediction uncertainty into CVA-type models, schools’ predicted performance is so similar that it becomes almost impossible to differentiate between schools on the basis of their CVA score.

2. Although a CVA score for a school may provide a measure of school effectiveness, controlled for the range of factors included in the model, when creating a model to help parents choose a school, it is important not to control for school-level variables. This is because a parent wants to know whether a particular school (with all its contextual features) will provide a better education for their child (with his/her own characteristics) than another school. If the school-level factors are associated with achievement, then this is part of the effect the parent is interested in. Hence, using raw CVA data would not be appropriate.

Parents and others instinctively base their judgements on figures they understand, and continuing to publish raw data about school performance is misleading. However, it is important to ensure parents and teachers realize that even value-added measures like CVA still provide a narrow metric of school or teacher effectiveness, and that such data should really be used only as a starting point for conversation, rather than as concrete evidence of school or teacher effectiveness.
