Data files

Throughout the SPSS Survival Manual you will see examples of research taken from a number of different data files: survey.zip, error.zip, experim.zip, manipulate.zip, depress.zip, sleep.zip and staffsurvey.zip. To use these files, which are available here, you will need to download them to your hard drive or memory stick. Once they are downloaded, you will need to unzip them. To do this, right-click on the downloaded zip file and select 'Extract all' from the menu. You can then open the files within SPSS.

(To do this, start SPSS, click on the Open an existing data source button from the opening screen and then on More Files. This will allow you to search through the various directories on your computer to find where you have stored your data files. Find the file you wish to use and click Open.)
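If you would rather script the unzip step than right-click each file, the following is a minimal Python sketch. It is not part of the Manual: the helper name extract_data_file and the folder name spss_data are hypothetical, and it assumes the zip file has already been downloaded to your computer.

```python
import zipfile
from pathlib import Path


def extract_data_file(zip_path, target_dir="spss_data"):
    """Extract every file in zip_path into target_dir and return the file names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Assumes survey.zip sits in the current working directory (hypothetical location).
    if Path("survey.zip").exists():
        print(extract_data_file("survey.zip"))
```

The extracted .sav file can then be opened from within SPSS in the usual way.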

Survey.sav

This is a real data file, condensed from a study that was conducted by my Graduate Diploma in Educational Psychology students. The study was designed to explore the factors that impact on respondents' psychological adjustment and wellbeing. The survey contained a variety of validated scales measuring constructs that the extensive literature on stress and coping suggests influence people's experience of stress. The scales measured self-esteem, optimism, perceptions of control, perceived stress, positive and negative affect, and life satisfaction. A scale was also included that measured people's tendency to present themselves in a favourable or socially desirable manner. The survey was distributed to members of the general public in Melbourne, Australia and surrounding districts. The final sample size was 439, consisting of 42 per cent males and 58 per cent females, with ages ranging from 18 to 82 (mean=37.4).

Download survey.zip
Download PDF of questionnaire and codebook used for survey.zip (Adobe Reader required)
Download PDF of full questionnaire for survey.zip (Adobe Reader required)
Download the syntax file used to compute scale scores.sps

Error.sav

The data in this file has been modified from the survey.zip file to incorporate some deliberate errors to be identified using the procedures covered in Chapter 5. For information on the variables etc. see details on survey.zip.

Download error.zip

Experim.sav

This is a manufactured data set that was created to provide suitable data for the demonstration of statistical techniques such as the t-test for repeated measures and one-way ANOVA for repeated measures. This data set refers to a fictitious study that involves testing the impact of two different types of interventions in helping students cope with their anxiety concerning a forthcoming statistics course. Students were divided into two equal groups and asked to complete a number of scales (Time 1). These included a Fear of Statistics test, a Confidence in Coping with Statistics scale and a Depression scale. One group (Group 1) was given a number of sessions designed to improve mathematical skills; the second group (Group 2) was subjected to a program designed to build confidence in the ability to cope with statistics. After the program (Time 2) they were again asked to complete the same scales that they completed before the program. They were also followed up three months later (Time 3). Their performance on a statistics exam was also measured.

Download experim.zip

Manipulate.sav

This file contains data extracted from hospital records which allows you to try using some of the SPSS data manipulation procedures covered in Chapter 8 Manipulating the data. This includes converting text data (Male, Female) to numbers (1, 2) that can be used in statistical analyses and manipulating dates to create new variables (e.g. length of time between two dates).

Download manipulate.zip

Depress.sav

This file has been included to allow the demonstration of some specific techniques in Chapter 16. It includes just a few of the key variables from a real study conducted by one of my postgraduate students on the factors impacting on wellbeing in first time mothers. It includes scores from a number of different psychological scales designed to assess depression (details in Chapter 16 on Kappa Measure of Agreement).

Download depress.zip

Sleep.sav

This is a real data file condensed from a study conducted to explore the prevalence and impact of sleep problems on various aspects of people's lives. Staff from a university in Melbourne, Australia were invited to complete a questionnaire containing questions about their sleep behaviour (e.g. hours slept per night), sleep problems (e.g. difficulty getting to sleep) and the impact that these problems have on aspects of their lives (work, driving, relationships). The sample consisted of 271 respondents (55% female, 45% male) ranging in age from 18 to 84 years (mean=44yrs).

Download sleep.zip
Download a PDF questionnaire and codebook used for sleep.zip (Adobe Reader required)

Staffsurvey.sav

This is a real data file condensed from a study conducted to assess the satisfaction levels of staff from an educational institution with branches in a number of locations across Australia. Staff were asked to complete a short, anonymous questionnaire (shown later in this Appendix) containing questions about their opinion of various aspects of the organisation and the treatment they have received as employees.

Download staffsurvey.zip
Download a PDF questionnaire and codebook used for staffsurvey.zip (Adobe Reader required)

Additional resources

Parallel analysis

In Chapter 15 on Factor Analysis I refer to the zipped file for the MonteCarlo PCA for Windows, which is available here.
To download the file for Mac, please visit
http://edpsychassociates.com/Watkins3.html

Practice exercises

Part One: Getting started

Before attempting these questions read through Chapters 1, 2 and 3 of the SPSS Survival Manual.

Designing a study

1.1 When choosing a scale for use in research, what are the two main characteristics you need to be aware of?

1.2 What are the two main types of reliability of a scale?

1.3 What measure is often used to indicate the internal consistency of a scale?

1.4 If you read that a scale had a Cronbach alpha value of .4 what would you think?

1.5 There are many different types of validity of a scale. Describe three.

1.6 If you were designing a questionnaire and wanted to measure respondents’ ages, which of the following formats, (a) or (b), would be better? Explain your choice.

(a) Please tick one of the following categories to indicate your age:
18-30 ___
31-45 ___
46-60 ___
61-80 ___
81+ ___

(b) Please indicate your age in years: _______

Preparing a codebook

1.7 There are a number of rules that must be obeyed when choosing variable names to use in SPSS (see Chapter 2). Use the following questions to review some of these rules.

(a) Can a variable name start with a number?
(b) What is the maximum number of characters that a variable name can have?
(c) Can variable names contain spaces?

1.8 For each of the following, indicate which is a suitable variable name. If not suitable, explain why.

(a) *q1
(b) and
(c) religion
(d) martialstatus
(e) a110q
(f) incom.hous
(g) 5optim
(h) optim5

Getting to know SPSS

1.9 In Chapter 3 of the SPSS Survival Manual you are taken on a guided tour of the basics of SPSS. The best way to learn this program is by using it. Open the data file survey.sav. To get you familiar with the program, try some of the activities below.

(a) Using the Data Editor window, go to the bottom of the file (using the scroll bar) and find out the ID number of the last case in the file.

(b) Explore the different menus available in SPSS.

  • Click on Graphs and find out what types of graphs are available.
  • Click on Analyze and discover the wide range of statistics available.

(c) Practise using the dialogue and sub-dialogue boxes by clicking on Analyze and then on Frequencies. Next, highlight the following variables and move them into the Variables box: sex, age, marital, educ, op1, op2, op3, op4, op5, op6. Then, move these variables out of the Variables box and use Cancel to escape from the dialogue box.

Download answers

Part Two: Preparing the data file

Before attempting these questions read through Chapters 4 and 5 of the SPSS Survival Manual.

Creating a data file and entering data

2.1 Before you start using SPSS to prepare a data file and run analyses, it is important to check the SPSS Options. This is covered at the start of Chapter 4 of the SPSS Survival Manual. Use the following questions to review this material.

(a) How do you change the order in which your variables are displayed (e.g. alphabetical order instead of file order)?
(b) How do you change the number of decimal places that are used as the default for new variables?
(c) If you wished to change the format used to display the output tables, where in the Options would you do this?

2.2 The best way to learn how to set up an SPSS data file is to actually work through each of the steps yourself. At the back of the SPSS Survival Manual you will find a codebook for survey.sav (the file that was used to generate some of the output throughout the book) and the questionnaire that this came from.

Use this codebook to set up a data file from scratch. Follow the procedures described in Chapter 4 to define each of the variables listed in the codebook. When you have finished, enter some pretend data—you can generate this data yourself by completing the survey presented in the appendix at the back of the Manual. Save the file as mydata.sav.

Screening and cleaning the data

It is very important that you check your data file for errors before beginning statistical analyses. These exercises give you practice with the process of screening your data and correcting errors.

2.3 Using the instructions provided in Chapter 5 of the SPSS Survival Manual, check the following categorical variables for out-of-range cases. You will need to open the survey.sav data file. Use the codebook provided in the appendix at the back of the Manual to guide you.

(a) sex
(b) marital status (marital)
(c) children (child)
(d) major source of stress (source)
(e) do you smoke? (smoke)

2.4 Using the instructions provided in Chapter 5 of the SPSS Survival Manual, check the following continuous variables for out-of-range cases. You will need to open the survey.sav data file. Use the codebook provided in the appendix at the back of the Manual to guide you.

(a) each of the items in the Optimism scale (op1 to op6)
(b) each of the items in the Life Satisfaction scale (lifsat1 to lifsat5)
(c) each of the items in the Perceived Stress scale (pss1 to pss10)
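The logic behind an out-of-range check can be sketched outside SPSS as well. The Python below is only an illustration: the function name out_of_range, the sample scores and the 1 to 5 range are all made up for the example, so take the valid minimum and maximum for each real variable from the codebook.

```python
def out_of_range(values, minimum, maximum):
    """Return (case_index, value) pairs that fall outside [minimum, maximum].

    None stands for a missing value and is skipped rather than flagged.
    """
    return [
        (index, value)
        for index, value in enumerate(values)
        if value is not None and not (minimum <= value <= maximum)
    ]


# Hypothetical responses to one item scored 1 to 5; 7 and 0 are entry errors.
item_scores = [3, 5, 7, 1, None, 0, 4]
print(out_of_range(item_scores, 1, 5))  # [(2, 7), (5, 0)]
```

The case index tells you which row to inspect, much as you would use the ID number to locate and correct the error in the Data Editor.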

Download answers

Part Three: Preliminary analyses

Before attempting these questions read through Chapters 6, 7, 8, 9 and 10 of the SPSS Survival Manual.

Descriptive statistics

The first step in the analysis of any data file is to obtain descriptive statistics on each of your variables. These can be used to check for out-of-range cases, to explore the distribution of the scores, and to describe your sample in the Method section of a report.

3.1 Use the instructions in Chapter 6 and Chapter 7 of the SPSS Survival Manual to answer the following questions concerning the variables included in the survey.sav data file.

(a) What is the mean age of the sample? What is the age range of the sample (minimum and maximum values)?
(b) What is the percentage of males and females in the sample? Did any of the sample fail to indicate their gender?
(c) What percentage of the sample were smokers?
(d) Inspect the distribution of scores on the Total Negative Affect scale. How normal is the distribution? Are there any cases that you would consider outliers?

Using graphs to describe and explore the data

3.2 Using the data file survey.sav, follow the instructions in Chapter 7 of the SPSS Survival Manual to obtain the following graphs.

(a) histogram of scores on the Total Self-esteem scale (tslfest)
(b) bar graph of scores on the Total Self-esteem scale (tslfest) for males and females (sex), across the three age groups (agegp3)
(c) scatterplot of scores on age and total scores on the Optimism scale (toptim)
(d) boxplot of scores on the Total Negative Affect scale (tnegaff) for males and females
(e) line graph of the Total Self-esteem scale (tslfest) for males and females (sex), across the three age groups (agegp3)

Manipulating the data

This section includes a number of activities to help you review, and to apply, the material covered in Chapter 8 of the SPSS Survival Manual. You should read through this chapter before attempting these questions.

3.3 One of the things that many students initially find difficult is being able to identify when items in a scale need to be ‘reversed’ before being added to give a total score. It is essential that this is done correctly, otherwise the values obtained for the total scale do not mean anything. To give you some practice at this we will use the Perceived Control of Internal States scale (Pallant, 2000). The scale is shown below.

Using the scale provided, decide how much you either agree or disagree with each statement. Next to each statement, write the number that best indicates how you feel.

strongly disagree 1 2 3 4 5 strongly agree

1. ______ I don't have much control over my emotional reactions to stressful situations.
2. ______ When I'm in a bad mood I find it hard to snap myself out of it.
3. ______ My feelings are usually fairly stable.
4. ______ I can usually talk myself out of feeling bad.
5. ______ No matter what happens to me in my life I am confident of my ability to cope emotionally.
6. ______ I have a number of good techniques that will help me cope with any stressful situation.
7. ______ I find it hard to stop myself from thinking about my problems.
8. ______ If I start to worry about something I can usually distract myself and think about something nicer.
9. ______ If I realize I am thinking silly thoughts I can usually stop myself.
10. ______ I am usually able to keep my thoughts under control.
11. ______ I imagine there will be many situations in the future where silly thoughts will get the better of me.
12. ______ I have a number of techniques which I am confident will help me think clearly and rationally in any situation I might find myself.
13. ______ Even when under pressure I can usually keep calm and relaxed.
14. ______ I have a number of techniques or tricks that I use to stay relaxed in stressful situations.
15. ______ When I'm anxious or uptight there does not seem to be much that I can do to help myself relax.
16. ______ There is not much I can do to relax when I get uptight.
17. ______ I have a number of ways of relaxing that I am confident will help me cope.
18. ______ If my stress levels get too high I know there are things I can do to help myself.

Pallant, J. (2000). Development and evaluation of a scale to measure perceived control of internal states. Journal of Personality Assessment, 75 (2), 308-337.

The aim of this exercise is to identify which items to reverse (not to actually carry out the reversals on the items in the survey.sav data file as these have already been correctly reversed).

(a) Identify which items in the scale would need to be reversed so that high total scores would indicate high levels of perceived control.
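As a reminder of what reversing does arithmetically, here is a small Python sketch. It assumes a response format running from 1 to 5, as in the scale above; the function name reverse_score and the sample responses are hypothetical, and the sketch deliberately does not say which items need reversing.

```python
def reverse_score(score, minimum=1, maximum=5):
    """Reverse a score on a response format running from minimum to maximum."""
    return (minimum + maximum) - score


# Hypothetical responses to a single negatively worded item.
responses = [1, 2, 3, 4, 5]
print([reverse_score(r) for r in responses])  # [5, 4, 3, 2, 1]
```

A response of 'strongly agree' (5) on a negatively worded item becomes 1 after reversal, so that high scores mean the same thing on every item.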

As discussed in Chapter 8 of the SPSS Survival Manual the next step is to calculate total scores by adding together the items that make up each scale. The following two exercises give you some practice with this process.

3.4 Use the procedures covered in Chapter 8 to create (compute) the following new total scale scores. Create new total subscale scores for the Perceived Control of Internal States scale (this scale is shown above).

(a) To calculate the Emotion subscale, add items pc1 to pc6. Call this new variable pcemot.
(b) To calculate the Thoughts subscale, add items pc7 to pc12. Call this new variable pcthou.
(c) To calculate the Physical subscale, add items pc13 to pc18. Call this new variable pcphys.

Check the descriptive statistics (mean, standard deviation, minimum, maximum) for your new subscales.
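The compute-and-check steps can be illustrated with a short Python sketch. The item names, the three-item scale and the scores are invented for the example (the real subscales have six items each), and treating a case with any missing item as missing is one common convention rather than the only option.

```python
import statistics


def total_score(case, item_names):
    """Add the named items for one case; return None if any item is missing."""
    values = [case.get(name) for name in item_names]
    if any(value is None for value in values):
        return None
    return sum(values)


# Hypothetical cases with three items each.
cases = [
    {"item1": 4, "item2": 5, "item3": 3},
    {"item1": 2, "item2": 2, "item3": 1},
    {"item1": 5, "item2": None, "item3": 4},
]
items = ["item1", "item2", "item3"]
totals = [total_score(case, items) for case in cases]
valid = [t for t in totals if t is not None]

print(totals)  # [12, 5, None]
# Descriptive statistics for the new total, using the valid cases only.
print(statistics.mean(valid), statistics.stdev(valid), min(valid), max(valid))
```

Checking the minimum and maximum against the possible range of the total (number of items multiplied by the lowest and highest response values) is a quick way to spot a mistake in the computation.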

3.5 In this exercise a variable with eight different responses will be recoded into another variable which has only two possible values. The variable you will be using is marital status. The question in the questionnaire used to collect this information is shown below.

What is your marital status? (please tick whichever applies)

[_] 1. single [_] 2. in a steady relationship [_] 3. living with partner
[_] 4. married for first time [_] 5. remarried [_] 6. separated
[_] 7. divorced [_] 8. widowed

Open the survey.sav data file.

(a) Run Frequencies on the variable marital status (marital) to find out how many people fall into each of the categories.

(b) Follow the instructions in Chapter 8 to create a new variable (relship) from the variable in the data file (marital). The new variable will only have two values, indicating whether a person is or is not in a relationship.

  • In the first group include people who are not in a relationship (single, separated, divorced, widowed). These will be coded 1.
  • In the second group include people who are in a relationship (steady relationship, living with partner, married for the first time, remarried). These will be coded 2.

(c) Run Frequencies on the new variable (relship) and compare this with the results of the Frequencies on the original variable (marital). Are there the correct number of cases in each of the new groups?
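The recoding rule described in (b) can be written out as a lookup table. The Python sketch below uses the coding given in the exercise; the names RELSHIP_CODES and recode_marital and the seven sample cases are hypothetical.

```python
from collections import Counter

# Mapping from the exercise: 1 = not in a relationship, 2 = in a relationship.
RELSHIP_CODES = {
    1: 1,  # single
    6: 1,  # separated
    7: 1,  # divorced
    8: 1,  # widowed
    2: 2,  # in a steady relationship
    3: 2,  # living with partner
    4: 2,  # married for first time
    5: 2,  # remarried
}


def recode_marital(marital):
    """Return the relship code, or None for missing or unexpected values."""
    return RELSHIP_CODES.get(marital)


# Hypothetical marital codes for seven cases; None marks a missing response.
marital = [1, 4, 4, 7, 2, None, 8]
relship = [recode_marital(code) for code in marital]

print(relship)  # [1, 2, 2, 1, 2, None, 1]
# Frequency counts for the old and new variables, for the comparison in (c).
print(Counter(marital))
print(Counter(relship))
```

The check in (c) amounts to confirming that the counts for the old categories 1, 6, 7 and 8 add up to the count for new value 1, and that the counts for 2, 3, 4 and 5 add up to the count for new value 2.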

Checking the reliability of a scale

If you use scales or standardised measures in your research (this is common in psychological research) it is important to assess the reliability (internal consistency) of the scores on the scale in your sample. The following exercise gives you some practice in this process.

3.6 Follow the procedure in Chapter 9 of the SPSS Survival Manual to assess the reliability of the following scales. You will need to refer to the codebook in the appendix to identify the items that make up each of the scales.

(a) Optimism scale (op1 to op6)
(b) Perceived Stress scale (pss1 to pss10)
(c) Self-esteem scale (sest1 to sest10)
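To see what the internal consistency statistic is built from, the sketch below computes Cronbach's alpha with the standard formula: alpha = k/(k - 1) x (1 - sum of the item variances / variance of the total score), where k is the number of items. The function name cronbach_alpha and the scores are made up for illustration, and the sketch assumes that any negatively worded items have already been reversed and that there are no missing values.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per case, no missing values)."""
    k = len(rows[0])
    item_variances = [
        statistics.variance([row[i] for row in rows]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a three-item scale.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(round(cronbach_alpha(scores), 3))
```

The value you calculate this way should correspond to the Cronbach's Alpha figure reported in the SPSS Reliability Statistics table for the same items.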

Choosing the right statistic

Many students find it difficult to identify which statistical technique to use to address their research questions. Chapter 10 of the SPSS Survival Manual will help you with this process.

3.7 For each of the following research situations identify which statistical technique could be used.

(a) Ann is interested in exploring the possibility of gender differences in levels of perceived stress.
(b) Ann would also like to explore the relationship between optimism and perceived stress. She suspects that higher levels of optimism would be associated with lower levels of perceived stress.
(c) Bill is interested in exploring the effect of both sex and age group on self-esteem scores. He is interested in the effect of each variable individually, and any interaction that may exist.
(d) Celia would like to know which is a better predictor of negative affect: optimism or self-esteem.
(e) If Celia were also concerned that age may be a confounding variable, how would she go about controlling for this variable in the analyses?
(f) David is interested in the question: Are younger people (18-29yrs) more likely to be smokers than older people (30-44yrs or 45+yrs)?
(g) Ellie conducts a study to find out if there is a significant change in depression levels across three time periods (prior to an intervention, after the intervention and at a three-month follow-up).
(h) Frank, a lecturer, wants to know who performed better on a statistics exam, males or females?
(i) Frank is also interested to find out if there was a relationship between age and exam score.

3.8 Review each of the situations listed in Exercise 3.7 and consider what non-parametric technique you would use if it was not appropriate to use a parametric test. (Hint: Not all will have a non-parametric alternative.)

Download answers

Part Four: Statistical techniques to explore relationships among variables

You should review the material in the introduction to Part Four and in Chapters 11, 12, 13, 14 and 15 of the SPSS Survival Manual before attempting these exercises.

Correlation

4.1 Using the data file survey.sav, follow the instructions in Chapter 11 to explore the relationship between the total mastery scale (tmast, measuring control) and life satisfaction (tlifesat). Present the results in a brief report.

4.2 Use the instructions in Chapter 11 to generate a full correlation matrix to check the intercorrelations among the following variables.

(a) age
(b) perceived stress (tpstress)
(c) positive affect (tposaff)
(d) negative affect (tnegaff)
(e) life satisfaction (tlifesat)

4.3 Gill, a researcher, is interested in exploring the impact of age on the experience of positive affect (tposaff), negative affect (tnegaff) and perceived stress (tpstress).

(a) Follow the instructions in Chapter 11 of the SPSS Survival Manual to generate a condensed correlation matrix which presents the correlations between age with positive affect, negative affect and perceived stress.

(b) Repeat the analysis in (a), but first split the sample by sex. Compare the pattern of correlations for males and females. Remember to turn off the Split File option after you have finished this analysis.

Partial correlation

4.4 Follow the procedures detailed in Chapter 12 of the SPSS Survival Manual to calculate the partial correlation between optimism (toptim) and perceived stress (tpstress) while controlling for the effects of age. Compare the zero order correlations with the partial correlation coefficients to see if controlling for age had any effect.
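The comparison between zero order and partial correlations can be made concrete with the standard first-order partial correlation formula, which removes the influence of one control variable z from the correlation between x and y. The Python below is a sketch only: the function names and the three short lists of scores are hypothetical and are not taken from survey.sav.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores for six people.
optimism = [10, 14, 18, 22, 25, 27]
stress = [30, 27, 25, 20, 18, 15]
age = [20, 35, 28, 50, 41, 60]

r_xy = pearson_r(optimism, stress)
r_xz = pearson_r(optimism, age)
r_yz = pearson_r(stress, age)
print(round(r_xy, 3), round(partial_r(r_xy, r_xz, r_yz), 3))
```

If the two printed values are similar, controlling for the third variable has had little effect; a large drop suggests that the third variable accounts for part of the original relationship.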

Multiple regression

4.5 There are three main types of multiple regression analyses. What are they? When would you use each approach?

4.6 As part of the preliminary screening process it is recommended that you inspect the Mahalanobis distances produced by SPSS. What do these tell you?

4.7 The example used in the SPSS Survival Manual to demonstrate the use of standard multiple regression compares two control measures (PCOISS and Mastery) in terms of their ability to predict perceived stress. Repeat this analysis, this time using life satisfaction (tlifesat) as your dependent variable. Use the output to answer the following questions.

(a) Overall, how much of the variance in life satisfaction is explained by these two variables?
(b) Which of the independent variables (tpcoiss, tmast) is the best predictor of life satisfaction?
(c) Do both variables make a statistically significant contribution to the prediction of life satisfaction?

4.8 Follow the instructions in the SPSS Survival Manual to perform a hierarchical multiple regression, this time using negative affect as the dependent variable.

Factor analysis

4.10 There is some controversy in the literature concerning the underlying factor structure of one of the scales included in the questionnaire presented in the appendix of the SPSS Survival Manual. The Optimism scale was originally designed as a one-dimension (factor) scale which included some positively worded items and some negatively worded items. Recent studies suggest that it may in fact consist of two factors representing optimism and pessimism.

Conduct a factor analysis using the instructions presented in Chapter 15 to explore the factor structure of the optimism scale (op1 to op6).

Download answers

Part Five: Statistical techniques to compare groups

Before attempting these questions read through the introduction to Part Five and Chapters 16-22 of the SPSS Survival Manual.

T-tests

5.1 Using the data file survey.sav follow the instructions in Chapter 17 of the SPSS Survival Manual to find out if there is a statistically significant difference in the mean score for males and females on the Total Life Satisfaction Scale (tlifesat). Present this information in a brief report.
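For readers who want to see where the t value comes from, here is a minimal Python sketch of the independent-samples t statistic under the equal-variances-assumed model. The function name independent_t and the scores are invented for illustration; the sketch returns only t and the degrees of freedom, so the significance level still needs to come from SPSS or from a t table.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for the equal-variances-assumed independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical scores for two small groups.
males = [22, 25, 19, 24, 21]
females = [23, 27, 22, 26, 25]
t_value, df = independent_t(males, females)
print(round(t_value, 3), df)
```

The sign of t simply reflects which group was entered first; it is the size of the value relative to the degrees of freedom that matters.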

5.2 Using the data file experim.sav, apply whichever of the t-test procedures covered in Chapter 17 of the SPSS Survival Manual you think is appropriate to answer each of the following questions.

(a) Who has the greatest fear of statistics at time 1, males or females?
(b) Was the intervention effective in increasing students’ confidence in their ability to cope with statistics? You will need to use the variables, confidence time1 (conf1) and confidence time2 (conf2). Write your results up in a report.
(c) What impact did the intervention have on students’ levels of depression?

One-way analysis of variance

For exercises 5.3 and 5.4, you will need to open the data file survey.sav.

5.3 Perform a one-way between-groups ANOVA to compare the levels of perceived stress (tpstress) for the five different age groups (agegp5), 18-24yrs, 25-32yrs, 33-40yrs, 41-49yrs and 50+yrs.

5.4 Perform post-hoc tests to compare the Self-esteem scores (tslfest) for people across the three different age groups (use the agegp3 variable).

For the following exercise you will need to open the data file experim.sav.

5.5 Use one-way repeated measures ANOVA to compare the Fear of Statistics scores for the three time periods (time1, time2 and time3). Inspect the means plots and describe the impact of the intervention and the subsequent follow-up three months later.

Two-way between-groups ANOVA

5.6 For this exercise you will need to open the data file survey.sav. Follow the instructions in Chapter 19 of the SPSS Survival Manual to conduct a two-way ANOVA to explore the impact of sex and age group on levels of perceived stress. The three variables you will need are sex, agegp5 and tpstress.

(a) Interpret the results. Is there a significant interaction effect? Are the two main effects significant?
(b) Write up this analysis and the results in a report. (Don’t forget to report the means and standard deviations for each group.)

Mixed between-within subjects analysis of variance

5.7 In Chapter 20 of the SPSS Survival Manual we explored the impact of two different intervention programs (maths skills/confidence building) on participants’ fear of statistics. We found that both interventions were equally effective in reducing participants’ fear—that is, we found no differences between groups—but a significant difference across the three time periods. Repeat these analyses, but this time use confidence scores as the dependent variable. You will need to use the following variables: group, conf1, conf2 and conf3.

(a) Is there a significant interaction effect between type of intervention (group) and time?
(b) Is there a significant main effect for the within-subjects independent variable, time?
(c) Is there a significant main effect for the between-subjects independent variable, group (maths skills/confidence building)?

Multivariate analysis of variance

5.8 How does MANOVA differ from ANOVA?

5.9 In Chapter 21 of the SPSS Survival Manual it is recommended that you check the Mahalanobis distances before proceeding with MANOVA. What does this allow you to check for?

5.10 Which assumption is Box’s Test used to assess?

5.11 Follow the procedure detailed in Chapter 21 of the SPSS Survival Manual to perform a MANOVA to explore positive and negative affect scores for the three age groups (18-29yrs, 30-44yrs, 45+yrs). The three variables you will need are tposaff, tnegaff, agegp3. Remember to check your assumptions first.

Analysis of covariance

5.12 Under what circumstances would you want to consider using analysis of covariance?

5.13 What issues do you need to consider when you are selecting possible covariates?

5.14 Using the experim.sav data file, perform the appropriate analyses (including assumption testing) to compare the confidence scores for the two groups (maths skills, confidence building) at time 2, while controlling for confidence scores at time 1. The variables you will need are group, conf1, conf2.

5.15 Perform a two-way analysis of covariance to explore the question: Does gender influence the effectiveness of the two intervention programs designed to increase participants’ confidence in being able to cope with statistics training? You will need to assess the impact of sex and type of intervention (group) on confidence at time 2, controlling for confidence scores at time 1.

Non-parametric statistics

5.16 What is the difference between parametric techniques and non-parametric techniques?

5.17 What factors would you consider when choosing whether to use a parametric or a non-parametric technique?

5.18 For each of the following parametric techniques indicate the non-parametric alternative (if one exists).

(a) one-way between-groups ANOVA
(b) Pearson’s product-moment correlation
(c) independent samples t-test
(d) multivariate analysis of variance
(e) one-way repeated measures ANOVA
(f) paired samples t-test
(g) partial correlation
(h) one-way repeated measures ANOVA

5.19 Choose and perform the appropriate non-parametric test to address each of the following research questions.

(a) Using the survey.sav data file find out whether smokers are significantly more stressed than non-smokers. The variables you will need are smoke and total perceived stress (tpstress).

(b) Using the survey.sav data file compare the self-esteem scores across the three different age groups (18-29yrs, 30-44yrs, 45+yrs). The variables you will need are tslfest and agegp3.

(c) Using the survey.sav data file explore the relationship between optimism and negative affect. The variables you will need are toptim and tnegaff.

(d) Using the survey.sav data file explore the association between education level and smoking. The variables you will need are educ2 and smoke. Check the codebook and the questionnaire in the appendix of the SPSS Survival Manual for details on these two variables.

(e) Using the experim.sav data file compare the depression scores at time 1 and the depression scores at time 2. Did the intervention result in a significant change in depression scores? The variables you will need are depress1 and depress2.

(f) Using the experim.sav data file compare the depression scores for the three time periods involved in the study (before the intervention, after the intervention and at the three-month follow-up). The variables you will need are depress1, depress2 and depress3.

Download answers