SPSS 2 spring 2011

Report
By Wendiann Sethi
Spring 2011

The second stage of using SPSS is data
analysis. We will review descriptive statistics
and then move on to other methods of data
exploration: crosstabulations, inferences on
the mean, regression, and ANOVA. Students
are encouraged to bring data that they are
analyzing in class or in projects to discuss
which methods would be best to use.




Descriptive statistics
Identifying outliers
Missing data – by variable, by respondent
Manipulating data –
◦ Reversing the scale
◦ Collapsing a continuous variable into groups

Choosing the right statistic
Data Analysis:
◦ Cross tabulations
◦ Correlations and regression
◦ Tests about mean and proportions
◦ ANOVA

Multiple Responses

Measures of central tendency
Measures of spread

Analyze> Descriptive Statistics

What do you use for each type of variable and
why?
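As a rough illustration of the measures listed above, here is a minimal Python sketch using only the standard library. The scores are made up for illustration and are not from the workshop data. Broadly, the mean and standard deviation suit scale variables, the median suits ordinal or skewed data, and the mode is the only option for nominal data.

```python
import statistics

# Hypothetical scores for one scale variable (not from the workshop data).
scores = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(scores)      # best for roughly symmetric scale data
median = statistics.median(scores)  # robust to extreme values; suits ordinal data
mode = statistics.mode(scores)      # usable even for nominal (category) data

# Measures of spread
stdev = statistics.stdev(scores)    # sample standard deviation (divides by n - 1)
value_range = max(scores) - min(scores)

print(mean, median, mode, round(stdev, 3), value_range)
```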


Look at the mean, median, standard deviation
and skewness to determine whether there
might be outliers
A box plot can also be used to see outliers
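The box plot idea can be sketched in Python as follows. The data are hypothetical, and the 1.5 x IQR fence is the common box plot convention. The quartile method used here (Python's default) may differ slightly from the one SPSS applies, so treat the exact fences as an assumption.

```python
import statistics

# Hypothetical values with one suspiciously large observation.
values = [12, 14, 15, 15, 16, 17, 18, 19, 20, 58]

# A mean well above the median hints at a high outlier or right skew.
mean = statistics.mean(values)
median = statistics.median(values)

# Box plot fences: anything beyond 1.5 * IQR from the quartiles is flagged.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(f"mean={mean}, median={median}, fences=({lower_fence}, {upper_fence})")
print("possible outliers:", outliers)
```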
Two potential problems with missing data:
1. Large amount of missing data – number of
valid cases decreases – drops the statistical
power
2. Nonrandom missing data – either related to
respondent characteristics and/or to
respondent attitudes – may create a bias
Examine missing data:
◦ By variable
◦ By respondent
◦ By analysis
If no problem is found, go directly to your analysis
If a problem is found:
◦ Delete the cases with missing data
◦ Try to estimate the value of the missing data



Use Analyze > Descriptive Statistics >
Frequencies
Look at the frequency tables to see how much
data is missing
If more than 5% is missing, that is too
much and needs further analysis.
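The by-variable check can be sketched in Python as below. The records and variable names are hypothetical, `None` stands in for a missing value, and the 5% threshold is the rule of thumb stated above.

```python
# Hypothetical survey records; None marks a missing answer.
rows = [
    {"gender": 1, "age": 34,   "income": None,  "satisfaction": 4},
    {"gender": 2, "age": None, "income": None,  "satisfaction": 5},
    {"gender": 2, "age": 29,   "income": 52000, "satisfaction": 3},
    {"gender": 1, "age": 41,   "income": 61000, "satisfaction": None},
]

THRESHOLD = 5.0  # percent missing above which a variable needs a closer look


def percent_missing(rows, variable):
    """Percentage of cases with no valid value for the given variable."""
    missing = sum(1 for row in rows if row[variable] is None)
    return 100.0 * missing / len(rows)


report = {name: percent_missing(rows, name) for name in rows[0]}
flagged = [name for name, pct in report.items() if pct > THRESHOLD]

print(report)
print("needs further analysis:", flagged)
```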
1. Use transform>count
2. Create NMISS in the target variable
3. Pick a set of variables that have more
missing data
4. Click on define values
5. Click on system- or user-missing
6. Click add
7. Click continue and then ok
8. Use the frequency table to show you the
results of NMISS
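The counting steps above build a per-respondent tally of missing answers. A minimal Python sketch of the same idea follows. The data are hypothetical, and `nmiss` simply mirrors the NMISS target variable named in the steps.

```python
from collections import Counter

# Hypothetical survey records; None marks a missing answer.
rows = [
    {"age": 34,   "income": None,  "satisfaction": 4},
    {"age": None, "income": None,  "satisfaction": 5},
    {"age": 29,   "income": 52000, "satisfaction": 3},
    {"age": 41,   "income": 61000, "satisfaction": None},
]

# The set of variables to check, chosen because they have more missing data.
variables_to_check = ["age", "income", "satisfaction"]

# NMISS: for each respondent, how many of the chosen variables are missing.
nmiss = [sum(1 for name in variables_to_check if row[name] is None)
         for row in rows]

# Frequency table of NMISS: how many respondents miss 0, 1, 2, ... answers.
frequency_table = Counter(nmiss)

for count in sorted(frequency_table):
    print(f"{count} missing: {frequency_table[count]} respondent(s)")
```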

Use Analyze>descriptive statistics>crosstabs

Look to see if there is a correlation between
NMISS (row) and another variable (column)

Use column percents to view the % of missing
for the value of the variable


Proceed anyway
Estimate (impute) the missing data by
substituting the mean or median value
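Mean and median substitution can be sketched in Python as follows. The incomes are hypothetical. The example includes one very large value to show why the median is often the safer substitute when the data are skewed.

```python
import statistics

# Hypothetical incomes; None marks a missing value.
incomes = [48000, None, 52000, 61000, None, 250000]

observed = [x for x in incomes if x is not None]
mean_fill = statistics.mean(observed)      # pulled upward by the 250000 value
median_fill = statistics.median(observed)  # less affected by the extreme value

# Keep the original list untouched and build new, imputed versions.
imputed_with_mean = [mean_fill if x is None else x for x in incomes]
imputed_with_median = [median_fill if x is None else x for x in incomes]

print("mean fill:", mean_fill)
print("median fill:", median_fill)
```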



Recoding
Calculating
When to recode into the same variable versus
creating a new one.
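Two common recodes, reversing a scale and collapsing a continuous variable into groups, can be sketched in Python as below. The data, the 1 to 5 scale, and the age cut points are all assumptions chosen for illustration. The results go into new lists, which corresponds to creating a new variable and keeping the original values available for checking.

```python
# Hypothetical survey answers: a 1-5 agreement item and respondent ages.
agreement = [1, 2, 5, 4, 3]
ages = [19, 30, 47, 50, 72]


def reverse_scale(value, low=1, high=5):
    """Reverse a scale item so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - value


def age_group(age):
    """Collapse a continuous age into three groups (cut points are assumed)."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"


# New variables; the originals are left unchanged.
agreement_reversed = [reverse_scale(v) for v in agreement]
age_groups = [age_group(a) for a in ages]

print(agreement_reversed)
print(age_groups)
```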


What do you want to explore?
What do you need in your data to do that
exploration?


Analyze>Descriptive Statistics>Crosstabs
Good for categorical data to see the
relationship between two or more variables

Statistics: correlation, Chi Square, association
Cells: Percentages – row or column
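To show what the crosstab cells and the chi-square statistic contain, here is a minimal Python sketch. The counts are hypothetical. It computes column percentages and the Pearson chi-square statistic without a continuity correction. No p-value is calculated, since that would need a chi-square distribution function that the standard library does not provide.

```python
from collections import Counter

# Hypothetical (answer, gender) pairs for 100 respondents.
pairs = ([("yes", "female")] * 30 + [("no", "female")] * 20
         + [("yes", "male")] * 20 + [("no", "male")] * 30)

counts = Counter(pairs)
row_labels = sorted({row for row, _ in counts})
col_labels = sorted({col for _, col in counts})

row_totals = {r: sum(counts[(r, c)] for c in col_labels) for r in row_labels}
col_totals = {c: sum(counts[(r, c)] for r in row_labels) for c in col_labels}
n = sum(counts.values())

# Column percentages: share of each answer within each gender.
col_pct = {(r, c): 100.0 * counts[(r, c)] / col_totals[c]
           for r in row_labels for c in col_labels}

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for r in row_labels:
    for c in col_labels:
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (counts[(r, c)] - expected) ** 2 / expected

df = (len(row_labels) - 1) * (len(col_labels) - 1)
print(col_pct)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```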

Cluster bar charts


Finding the relationship between two scale or
ordinal variables.

Analyze > Correlate > bivariate

Analyze > regression > linear

Aim:
Find out whether a relationship exists
and determine its magnitude and direction
Two correlation coefficients:
◦ Pearson product moment correlation coefficient (r) – interval or ratio scale variables
◦ Spearman rank order correlation coefficient (rho) – ordered or ranked data
Assumptions:
◦ Related pairs of scores
◦ Relationship of the variables is linear
◦ Variables are measured at least at the ordinal level
◦ Homoscedasticity – variability of y variable should
remain constant at all values of x variable
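The difference between the two coefficients can be seen in a small Python sketch. The data are hypothetical and deliberately curved, so the relationship is perfectly monotonic but not perfectly linear. The helper names are illustrative, and Spearman's rho is computed here as Pearson's r applied to average ranks.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical related pairs of scores: y rises with x, but along a curve.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Both coefficients are positive, which gives the direction. Rho reaches 1 because the ordering is perfectly preserved, while r stays slightly below 1 because the points do not fall on a straight line.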




Aim:
Use after finding that there is a correlation, to
find an appropriate linear model to predict the
results of the DV based on one or more IVs
Assumptions:
◦ Related pairs of scores
◦ Relationship of the variables is linear
◦ Variables are measured at least at the ordinal level
◦ Homoscedasticity – variability of y variable should remain
constant at all values of x variable
Procedure: Linear Regression
◦ One IV to one DV
◦ ANALYZE>REGRESSION>LINEAR
◦ After placing the appropriate DV and IV, click STATISTICS
◦ Click CONTINUE and then OK to run the analysis
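For the one IV to one DV case, the fitted line can be sketched in Python with ordinary least squares. The data and the function name `fit_line` are hypothetical. The sketch reports only the slope, intercept and R squared, not the full set of tables that SPSS produces.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R squared: share of the DV's variation explained by the line.
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


# Hypothetical data: x is the IV, y is the DV.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(x, y)
predicted_at_6 = intercept + slope * 6
print(f"y = {intercept:.2f} + {slope:.2f} * x, R^2 = {r_squared:.2f}")
```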

Comparing the means of a scale (or ordinal) variable
when grouped by a category

Analyze > Compare means
◦ Means – simplest form: the DV (scale) is compared across
the categories of the IV
◦ One-sample t-test: tests the mean of the variable against a
set value.
◦ Independent samples t-test: looks at the difference of
two means of the variable given a grouping variable (two groups only)
◦ Paired-samples t-test: looks at the difference of the
means when there is paired data (pre-test vs post-test)
◦ One-way ANOVA: compares the means of dependent
variables (scale or ordinal) given a factor (one IV-category)

Aim: Testing the differences between the means of two
independent samples or groups
Requirements:
◦ Only one independent (grouping) variable IV (ex. Gender)
◦ Only two levels for that IV (ex. Male or Female)
◦ Only one dependent variable (DV)
Assumptions:
◦ Sampling distribution of the difference between the means is normally
distributed
◦ Homogeneity of variances – Tested by Levene’s Test for Equality of
Variances
Procedure:
◦ ANALYZE>COMPARE MEANS>INDEPENDENT SAMPLES T-TEST
◦ Test variable – DV
◦ Grouping variable – IV
◦ DEFINE GROUPS (need to remember your coding of the IV)
◦ Can also divide a range by using a cut point
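The calculation behind the independent samples t-test can be sketched in Python as below. The scores and the group codes 1 and 2 are hypothetical. The sketch uses the pooled-variance form, which relies on the homogeneity of variances assumption. It stops at the t statistic and degrees of freedom, because a p-value would need a t distribution function that the standard library lacks.

```python
import math
import statistics

# Hypothetical records: (group code, test score). The codes 1 and 2 stand in
# for however the grouping variable (IV) was coded in the data file.
records = [(1, 5), (1, 6), (1, 7), (1, 8), (1, 9),
           (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)]


def independent_t(records, code_a, code_b):
    """Pooled-variance t statistic for two independent groups."""
    a = [score for code, score in records if code == code_a]
    b = [score for code, score in records if code == code_b]
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / std_error
    return t, na + nb - 2


# Defining the groups means naming the two codes to compare.
t_stat, df = independent_t(records, code_a=1, code_b=2)
print(f"t = {t_stat:.3f}, df = {df}")
```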


Aim: Used in repeated measures or correlated
groups designs, where each subject is tested twice
on the same variable; also used for matched pairs
Requirements:
◦ Looking at two sets of data – (ex. pre-test vs. post-test)
◦ Two sets of data must be obtained from the same subjects
or from two matched groups of subjects

Assumptions:
◦ Sampling distribution of the means is normally distributed
◦ Sampling distribution of the difference scores should be
normally distributed

Procedure:
◦ ANALYZE>COMPARE MEANS>PAIRED SAMPLES T-TEST
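The paired samples t-test works on the difference scores, which a short Python sketch makes visible. The pre-test and post-test scores are hypothetical, and only the t statistic and degrees of freedom are computed.

```python
import math
import statistics

# Hypothetical pre-test and post-test scores for the same five subjects.
pre_test = [10, 12, 9, 14, 11]
post_test = [12, 15, 10, 18, 12]


def paired_t(before, after):
    """t statistic for paired data, based on the difference scores."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    std_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / std_error, n - 1


t_stat, df = paired_t(pre_test, post_test)
print(f"t = {t_stat:.3f}, df = {df}")
```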

Aim: Looks at the means from several independent
groups; an extension of the independent samples t-test
Requirements:
◦ Only one IV
◦ More than two levels for that IV
◦ Only one DV
Assumptions:
◦ The populations from which the samples are drawn are normally
distributed
◦ Homogeneity of variances
◦ Observations are all independent of one another
Procedure:
◦ ANALYZE>COMPARE MEANS>One-Way ANOVA
◦ Dependent List – DV
◦ Factor – IV
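The F statistic in a one-way ANOVA compares the variation between group means with the variation within groups. Here is a minimal Python sketch with three hypothetical groups. The function name is illustrative, and no p-value is computed.

```python
import statistics


def one_way_anova(groups):
    """F statistic and degrees of freedom for several independent groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)

    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Within-groups: how far each value sits from its own group mean.
    ss_within = sum((value - statistics.mean(g)) ** 2
                    for g in groups for value in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical DV scores for three levels of one IV (factor).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```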

How to deal with questions where the
participant can choose several answers.

ANALYZE>MULTIPLE RESPONSE
◦ Define sets
◦ Frequencies
◦ Crosstabs

Example data: survey_sample.sav
◦ Eth1, 2, 3 – multiple response method
◦ News 1, 2, 3 – multiple dichotomy method
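The two coding methods can be contrasted in a short Python sketch. The values below are invented for illustration and are not taken from survey_sample.sav. The lowercase keys only echo the Eth and News variable names above. In the multiple response (category) method each variable holds one chosen category, while in the multiple dichotomy method each variable is one option marked as selected or not.

```python
from collections import Counter

# Multiple response (category) method: each variable stores a chosen
# category code; None means no further choice was made. Hypothetical data.
eth_answers = [
    {"eth1": "A", "eth2": "B", "eth3": None},
    {"eth1": "B", "eth2": None, "eth3": None},
    {"eth1": "A", "eth2": "C", "eth3": "B"},
]
category_counts = Counter(
    value for row in eth_answers for value in row.values() if value is not None
)

# Multiple dichotomy method: each variable is one option, 1 = selected.
news_answers = [
    {"news1": 1, "news2": 0, "news3": 1},
    {"news1": 1, "news2": 1, "news3": 0},
    {"news1": 0, "news2": 0, "news3": 1},
]
dichotomy_counts = {
    name: sum(1 for row in news_answers if row[name] == 1)
    for name in news_answers[0]
}

# Percent of cases can exceed 100 in total, because one respondent
# may give several answers.
cases = len(eth_answers)
percent_of_cases = {code: 100.0 * count / cases
                    for code, count in category_counts.items()}

print(category_counts)
print(dichotomy_counts)
print(percent_of_cases)
```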
Wendiann Sethi
[email protected]
AS 202C or SC 128
