An introduction to educational research methods


DATA ANALYSIS – GENERATING DESCRIPTIVE STATISTICS




1. Checking the accuracy of data entry

The very first thing you should check is that you have entered your data correctly, as no matter how careful you are, mistakes can be made. This ‘cleaning’ of the data also starts the process of examining the data. One of the most common errors is entering the wrong code for a particular response, for instance ‘11’ instead of ‘1’ (i.e. the value entered doesn’t match a code you’ve set up), and the quickest way of checking for this is to look at the frequency distribution (or summary of response) for each variable.

 

In SPSS, descriptive statistics, such as frequency distribution information, are accessed from the ‘analyse’ drop-down menu. There are various descriptive statistics options, but frequency distribution tables are generated using ‘frequencies’. When you select this option, SPSS produces a dialogue box asking which variables in your database you want frequency information about. You need to select the variable(s) of interest and ask for the relevant information. SPSS then generates the frequency distribution table(s) requested in an SPSS output file that appears on the screen. For each variable requested, you will be able to see the different response categories listed in the first column of the table and the number of responses for each category in the second column. A quick glance down the first column will reveal whether there are any illegal responses. Tables 12.2 and 12.3 show frequency distributions for two of my variables, ‘gender’ and ‘task motivation (science)’ scale scores.




Table 12.2  Frequency distribution for gender

                   Frequency   Per cent   Valid per cent   Cumulative per cent
Valid     Boys           710       41.2             43.0                  43.0
          Girls          942       54.7             57.0                 100.0
          Total         1652       95.9            100.0
Missing   9.00            71        4.1
Total                   1723      100.0

If you don’t have any illegal responses, as here, you can move on to other descriptive analyses. If you want to save the frequency distribution data (which you might if you planned to copy and paste the table into a report), you can save the output file in the same way you would save any other file. If you have some rogue responses, you will need to go back to your data to find them. This can be done using ‘find’ in ‘data view’ (much as you would find and replace in Word or other packages). By looking at the identifier for the case identified, you can go back to the original data (for instance, the original questionnaire) to amend the response.
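The same check can be sketched outside SPSS. The Python example below is a minimal illustration rather than SPSS's own procedure: the case identifiers and entered values are hypothetical, and the coding scheme (1 = boy, 2 = girl, 9 = missing) is an assumption chosen to mirror the gender variable. It builds a frequency table, flags any code that is not part of the scheme, and lists the cases to look up in the original questionnaires.

```python
from collections import Counter

# Assumed coding scheme mirroring the gender variable:
# 1 = boy, 2 = girl, 9 = missing.
VALID_CODES = {1, 2, 9}

# Hypothetical entered data: (case identifier, entered code).
entries = [(101, 1), (102, 2), (103, 11), (104, 2), (105, 9), (106, 1)]


def frequency_table(rows):
    """Count how often each entered code occurs."""
    return Counter(code for _, code in rows)


def illegal_cases(rows, valid_codes):
    """Return identifiers of cases whose code is not in the coding scheme."""
    return [case_id for case_id, code in rows if code not in valid_codes]


table = frequency_table(entries)
for code in sorted(table):
    flag = "" if code in VALID_CODES else "  <- illegal code"
    print(f"{code:>4} {table[code]:>4}{flag}")

print("Cases to check against the questionnaire:",
      illegal_cases(entries, VALID_CODES))
```

Here the mistyped ‘11’ shows up as its own row in the frequency table, and the case identifier tells you which questionnaire to go back to.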

2. Producing variable summaries


While the frequency distribution provides information about all the responses for a particular variable, quite often we want to give a more succinct summary, especially for variables measured at an interval level. For instance, knowing that the task motivation (science) scale scores in my study varied from 8 to 30 does not really help paint a picture of the overall response to this scale. If I told you a particular student scored 18, you wouldn’t know without looking carefully at Table 12.3 whether this seemed typical or not. We really need to know two things to make judgements about individual scores.

Firstly: what is a typical response? For this, we would normally calculate the mean score (obtained by adding up every individual score and dividing the sum by the number of scores included). Secondly: how much variation is there in the responses? If there is a small amount of variation, then most people are recording similar responses. A larger variation implies that scores that differ markedly from the average (lower and higher) are not that unusual. The commonly used measure for this is the standard deviation, which is one measure of the average deviation from the mean score.
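In symbols, the two summaries can be written as follows. The notation is an assumption introduced here for illustration: $x_1, \dots, x_n$ are the individual scores, $\bar{x}$ is the mean and $s$ is the standard deviation. The $n - 1$ divisor is the sample form of the standard deviation, which is the form SPSS reports in its descriptive statistics.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```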


Table 12.3  Frequency distribution for task motivation (science)

                   Frequency   Per cent   Valid per cent   Cumulative per cent
Valid     8.00             1         .1               .1                    .1
          9.00             1         .1               .1                    .1
          10.00            3         .2               .2                    .4
          11.00            1         .1               .1                    .4
          13.00            4         .2               .3                    .7
          14.00            6         .3               .4                   1.1
          15.00            9         .5               .6                   1.8
          16.00            8         .5               .6                   2.3
          17.00           14         .8              1.0                   3.3
          18.00           31        1.8              2.2                   5.5
          19.00           32        1.9              2.2                   7.7
          20.00           52        3.0              3.6                  11.3
          21.00           66        3.8              4.6                  16.0
          22.00           88        5.1              6.2                  22.1
          23.00          111        6.4              7.8                  29.9
          24.00          131        7.6              9.2                  39.1
          25.00          144        8.4             10.1                  49.2
          26.00          167        9.7             11.7                  60.9
          27.00          159        9.2             11.1                  72.0
          28.00          167        9.7             11.7                  83.7
          29.00          126        7.3              8.8                  92.5
          30.00          107        6.2              7.5                 100.0
          Total         1428       82.9            100.0
Missing   System         295       17.1
Total                   1723      100.0

Note: Missing system responses appear because students had not supplied answers for one or more of the questionnaire items that make up the task motivation (science) scale.


 

Summary statistics including the mean and standard deviation (usually referred to in textbooks as descriptive statistics) are obtained using the ‘descriptives’ option of descriptive statistics. Summary statistics for ‘gender’, ‘task motivation (science)’ and ‘level of cognitive development’ are shown in Table 12.4.

 

Returning to the question of whether 18 is a low score on the task motivation (science) scale, knowing that the mean score for this scale is 25.0 (rounded off) and that the standard deviation is 3.6 suggests that a score of 18 is unusually low, as most people are scoring between 21.4 (3.6 below the mean of 25) and 28.6 (3.6 above it).
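This reasoning can be checked by hand from the frequencies in Table 12.3. The Python sketch below is an illustration, not the SPSS ‘descriptives’ routine: it assumes the sample standard deviation (n − 1 divisor), and the function and variable names are invented for the example. It computes the mean and standard deviation, the band one standard deviation either side of the mean, the share of students inside that band, and how far a score of 18 lies from the mean.

```python
import math

# Frequencies for each task motivation (science) score, copied from Table 12.3.
FREQUENCIES = {
    8: 1, 9: 1, 10: 3, 11: 1, 13: 4, 14: 6, 15: 9, 16: 8, 17: 14,
    18: 31, 19: 32, 20: 52, 21: 66, 22: 88, 23: 111, 24: 131, 25: 144,
    26: 167, 27: 159, 28: 167, 29: 126, 30: 107,
}


def mean_and_sd(freqs):
    """Mean and sample standard deviation (n - 1 divisor) from a frequency table."""
    n = sum(freqs.values())
    mean = sum(score * count for score, count in freqs.items()) / n
    squared = sum(count * (score - mean) ** 2 for score, count in freqs.items())
    return n, mean, math.sqrt(squared / (n - 1))


n, mean, sd = mean_and_sd(FREQUENCIES)
low, high = mean - sd, mean + sd

# Share of students whose score lies within one standard deviation of the mean.
within = sum(c for s, c in FREQUENCIES.items() if low <= s <= high) / n

# Distance of a score of 18 from the mean, in standard deviations.
z_18 = (18 - mean) / sd

print(f"n = {n}, mean = {mean:.4f}, sd = {sd:.5f}")
print(f"one-SD band: {low:.1f} to {high:.1f}, share inside = {within:.1%}")
print(f"z-score for 18 = {z_18:.2f}")
```

Run as written, the sketch gives a mean of about 25.0 and a standard deviation of about 3.6, in line with Table 12.4. Roughly two-thirds of students fall inside the band, and a score of 18 sits nearly two standard deviations below the mean.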

 

Table 12.4 also illustrates an important point to bear in mind when using a computer program. As gender is a nominal-level variable, it makes no sense to talk about the average or mean gender, as clearly my participants were either boys or girls. Yet SPSS will calculate this value if asked. You always need to ask yourself: is what I’m asking the computer to calculate sensible? In simple terms, if you put rubbish in, you will get rubbish out!
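A short Python sketch makes the point concrete. It uses the gender counts from Table 12.2. The coding 1 = boy, 2 = girl is an assumption inferred from the minimum, maximum and mean reported for gender in Table 12.4, and the variable names are invented for the example. The arithmetic mean of the codes can be computed, but the mode and the proportions are the summaries that actually mean something for a nominal variable.

```python
from collections import Counter

# Gender counts from Table 12.2. The codes 1 = boy, 2 = girl are an
# assumption consistent with the minimum, maximum and mean in Table 12.4.
counts = Counter({1: 710, 2: 942})
labels = {1: "Boys", 2: "Girls"}

n = sum(counts.values())

# The arithmetic works, but the result is not a meaningful "average gender".
mean_code = sum(code * k for code, k in counts.items()) / n

# Sensible summaries for a nominal variable: the mode and the proportions.
mode_code, mode_count = counts.most_common(1)[0]
proportions = {labels[c]: k / n for c, k in counts.items()}

print(f"mean of the codes = {mean_code:.4f}")
print(f"most common category = {labels[mode_code]} ({mode_count} of {n})")
for label, share in proportions.items():
    print(f"{label}: {share:.1%}")
```

Under this coding, the ‘mean gender’ of about 1.57 is simply 1 plus the proportion of girls (about 57 per cent), so it carries no information beyond the frequency table.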


Table 12.4  Descriptive statistics for gender, task motivation (science) and level of cognitive development

                                     N   Minimum   Maximum      Mean   Std deviation
Gender                            1652      1.00      2.00    1.5702          .49519
Task motivation (science)         1428      8.00     30.00   24.9965         3.60574
Level of cognitive development    1723      2.00      9.00    6.1977         2.18647

3. Producing graphs

It is often easier to get a sense of how frequency distributions look by plotting graphs rather than looking at tables or summary statistics. For instance, you have seen the frequency distribution table and know that the mean score on the task motivation (science) scale was 25.0, while scores ranged from 8 to 30 and there was a relatively small amount of variation as the standard deviation was 3.6. If you think about these figures carefully, they suggest that scores were bunched up at the higher end of the scale. This can be seen much more clearly by plotting a histogram, which is a graph of the frequency distribution.

In SPSS, graphs can be produced through the ‘graph’ drop-down menu, which gives you the option of producing a histogram. The histogram for the task motivation (science) scale is shown in Figure 12.1 and, as expected, students’ scores are bunched up towards the top of the scale. This scale is about learning things so it is hardly surprising that most students say they are motivated by this, given that they know their answers are going to be scrutinized by someone else.
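A rough text version of such a histogram can be sketched from the frequencies in Table 12.3. This Python example is only an illustration of the idea, not the SPSS chart. The function name and the choice of one ‘#’ per five students are assumptions made for the example.

```python
# Frequencies for each task motivation (science) score, copied from Table 12.3.
FREQUENCIES = {
    8: 1, 9: 1, 10: 3, 11: 1, 13: 4, 14: 6, 15: 9, 16: 8, 17: 14,
    18: 31, 19: 32, 20: 52, 21: 66, 22: 88, 23: 111, 24: 131, 25: 144,
    26: 167, 27: 159, 28: 167, 29: 126, 30: 107,
}


def text_histogram(freqs, per_symbol=5):
    """Return one line per score; each '#' stands for roughly `per_symbol` students."""
    lines = []
    for score in range(min(freqs), max(freqs) + 1):
        count = freqs.get(score, 0)  # scores nobody recorded get an empty bar
        bar = "#" * round(count / per_symbol)
        lines.append(f"{score:>2} | {bar} {count}")
    return lines


lines = text_histogram(FREQUENCIES)
for line in lines:
    print(line)
```

The bars stay short up to the high teens and lengthen sharply from the low twenties onwards, which is the bunching towards the top of the scale described above.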



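[Figure: histogram; x-axis: Task motivation (science); y-axis: Frequency; annotated Mean = 24.9965, Std Dev = 3.60574, N = 1,428]

Figure 12.1  Histogram showing the distribution of responses to the task motivation (science) scale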

A second type of graph that is useful when conducting a descriptive analysis of data, which is also an option from the graph menu, is a scatter diagram. This shows the relationship between two interval-level variables. For instance, I was very interested in knowing whether there was a relationship between students’ cognitive development scores and their task motivation scores. I expected students who said they were motivated to try hard to learn new things to develop cognitively. A scatter diagram would help me decide whether I was right about this relationship, and this is shown in Figure 12.2.


[Figure: scatter diagram; x-axis: Task motivation (science); y-axis: Level of cognitive development]

Figure 12.2  Scatter diagram showing the relationship between task motivation (science) and level of cognitive development


Each circle represents one or more students’ scores on the variables in question. For instance, the circle at the bottom right of the graph represents students who scored 30 (the maximum value) on the task motivation (science) scale (so are highly motivated to learn new things), but at the same time got the lowest score of 2 on the level of cognitive development test. So, for these students, being very keen to learn new things does not appear to be associated with helping them to develop cognitively. In general, if there were a positive association, which statisticians call a positive correlation, between task motivation and level of cognitive development, the graph would show a series of circles falling around an imaginary line sloping from the bottom left to the top right of the graph (i.e. sloping upwards). Here, we have a general swirl of dots filling most of the graph, suggesting no relationship (or no correlation) between task motivation and level of cognitive development. This is not what I expected. Note that it is also possible to have a negative correlation between two variables. For instance, you would expect a negative correlation between alienation and cognitive development, which would appear as a band of circles following an imaginary line from the top left to the bottom right (i.e. sloping downwards).
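The three patterns described above (upward slope, downward slope, no pattern) can be put into numbers with a correlation coefficient. The Python sketch below computes the Pearson coefficient by hand for three small sets of scores. All of the data are hypothetical and were made up to show the three cases; they are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for six students (not taken from the study).
motivation = [10, 14, 18, 22, 26, 30]
rising = [2, 3, 5, 6, 8, 9]      # tends to increase with motivation
falling = [9, 8, 6, 5, 3, 2]     # tends to decrease with motivation
unrelated = [5, 9, 2, 2, 9, 5]   # no consistent trend

for name, ys in [("rising", rising), ("falling", falling), ("unrelated", unrelated)]:
    print(f"{name:>9}: r = {pearson_r(motivation, ys):.2f}")
```

A coefficient close to +1 corresponds to circles hugging an upward-sloping line, close to −1 to a downward-sloping line, and close to 0 to the kind of general swirl seen in Figure 12.2.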





DATA ANALYSIS – INFERENTIAL STATISTICS

Having explored the data and gained a sense of the relationships between different variables, it is now time to get to the crux of the matter and answer the original research questions. I was interested in the relationship between motivation and level of cognitive development. I predicted that students scoring highly on the task motivation scale (the desire to learn and master new things) would have overall higher scores on the test of cognitive development than students with low scores on the task motivation scale. Similarly, I would expect that students with high scores on the alienation scale (actively disrupting learning) would have overall lower scores on the test of cognitive development than students who have low scores on the alienation scale. I also thought that girls would have higher scores on the task motivation scale and lower scores on the alienation scale than boys, and that girls’ level of cognitive development would be higher.

In order to assess whether these hypotheses are correct, we need to run a number of statistical tests. Essentially, I am asking two different types of question:

1.  Is there a relationship between two variables (for instance, between task motivation and level of cognitive development)?

2.  Is there a difference between two groups on a given variable (for instance, between boys and girls on task motivation)?

Each type of question requires a specific statistical test, and these are outlined below.



1. Tests of correlation

The first type of question relates to relationships or correlations between variables, so the appropriate statistical test in this case is a test of correlation. The logic behind this type of test is that we look at the actual relationship found (so we need to calculate a particular entity to assess this) and then make a judgement as to how likely this result would be if in fact there wasn’t a relationship between the variables of interest. In essence, we are judging the likelihood that the results we found were a fluke, i.e. in reality, motivation and cognitive development aren’t related in any way; it just happened in this sample of schools that there was some type of relationship.

The reason we have to take this approach is that we simply don’t know whether there is in fact a relationship or not, and we have to make the best judgement we can, on the balance of probabilities, based on the data we have. These probabilities can only be calculated by starting from a position that the variables are not related. A slightly different starting point is taken when we are looking at differences between groups, which is described below. However, the calculations conducted in any statistical test involve calculating probabilities to enable the person running the test to make a judgement call. This is why the outcomes of the calculations made are referred to as inferential statistics and the tests themselves are often called significance tests.
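The ‘was it a fluke?’ logic can be made visible with a small simulation. The Python sketch below uses a permutation test, which is a different technique from the parametric test run in SPSS in this chapter. It is included only because it starts from the same position, namely that the variables are not related. The scores are hypothetical, and the function names, number of shuffles and random seed are assumptions made for the example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often shuffled data give a correlation at least as
    strong (in either direction) as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical scores for ten students (not taken from the study).
motivation = [12, 15, 17, 20, 21, 23, 25, 26, 28, 30]
cognitive = [3, 2, 4, 4, 6, 5, 7, 6, 8, 9]

r = pearson_r(motivation, cognitive)
p = permutation_p_value(motivation, cognitive)
print(f"observed r = {r:.2f}, estimated probability of a fluke = {p:.3f}")
```

Shuffling one variable simulates a world in which the two are unrelated. If a correlation as strong as the observed one almost never turns up in the shuffled data, a fluke is an unlikely explanation, which is the judgement a significance test formalizes.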

 

The entity calculated in a test of correlation to quantify the relationship between the two variables of interest is a correlation coefficient. The statistical test enables me to judge whether this is significant (i.e. the balance of probabilities is that there is a relationship between motivation and cognitive development). At this point, I have several choices of correlation coefficient to calculate, dependent on the measurement level and distribution of my variables. If you intend to use inferential statistics, you will need to read up on this in more detail, as all I am doing here is giving an introduction to this area. Specifically, you can choose to do either a parametric or a non-parametric test. The former is generally preferred because it is more powerful and sensitive to your data; however, it also makes certain assumptions about your data. For reasons there isn’t the space here to explain, my data are acceptable for a parametric test, hence I need to calculate the appropriate correlation coefficient, a Pearson correlation coefficient, and then look at the significance test results. In SPSS, this procedure is conducted in the ‘correlate’ option of the analyse menu. The results for the test of correlation between level of cognitive development and task motivation (science) scores are shown in Table 12.5.



