Chi-square Test of Independence
Presentation 10.2
Another Significance Test for Proportions
• But this time we want to test multiple variables.
• With this test we can determine whether two variables are independent or not.
• This is sometimes called inference for two-way tables.
Chi-square Test of Independence Formulas
The null and alternate hypotheses are always the same with a Test of Independence.
Null Hypothesis (assumes independent): $H_0$: Observed = Expected
Alternate Hypothesis (not independent): $H_a$: Observed $\neq$ Expected
Test Statistic (that symbol is called "Chi-squared"):
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
$$df = (\#\text{ of rows} - 1)(\#\text{ of columns} - 1)$$
P-Value = χ²cdf(χ², 9999, df)
Instead of a normal or t distribution, we now have a chi-squared distribution.
O is the observed count for each cell in the table and E is the expected count for each cell in the table.
The Titanic
• Look at the data of the passengers, their ticket, and whether or not they survived.

Type of Ticket    Rescued    Died
First Class       203        123
Second Class      118        167
Third Class       528        178
Conditions for the Test of Independence
• None of the expected counts should be less than 1
• No more than 20% of the expected counts should be less than 5
– Same as for the Goodness of Fit test
• These are simple checks to make sure that the sample size is sufficient.
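The two checks above can be sketched in a few lines of Python. This is only an illustration: the function name `conditions_met` and its parameters are made up for this example, and it is shown with the rounded expected counts for the Titanic table on the assumption that expected counts are what the check is meant for. The second list is a hypothetical table, included only to show a failing case.

```python
def conditions_met(counts, floor=1, small=5, max_small_share=0.20):
    """Return True when both sample-size conditions hold.

    counts: flat list of cell counts, one number per cell of the table.
    """
    # Condition 1: no cell count may be less than 1.
    if min(counts) < floor:
        return False
    # Condition 2: no more than 20% of the cells may have a count below 5.
    share_small = sum(1 for c in counts if c < small) / len(counts)
    return share_small <= max_small_share


# Rounded expected counts for the Titanic table (rescued, died for each class).
titanic_expected = [210, 116, 184, 101, 455, 251]
print(conditions_met(titanic_expected))

# Hypothetical counts: 2 of the 6 cells are below 5 (33%), so the check fails.
print(conditions_met([12, 9, 4, 3, 20, 15]))
```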
The Titanic
• Check the conditions
– Since all counts are much greater than 5, we are OK to conduct the test
• Write Hypotheses (these are always the same!)
– Null: H0: Observed = Expected
• That is, what we observed should be the same as what we expected given the variables are independent
– Alternate: Ha: Observed ≠ Expected
• That is, the observed data are just too different from what is expected to be attributed to random chance.
The Titanic Calculations
• Find the expected values (assume independence)

Observed
Type of Ticket    Rescued    Died    Totals
First Class       203        123     326
Second Class      118        167     285
Third Class       528        178     706
Totals            849        468     1317

Expected
Type of Ticket    Rescued                 Died                    Totals
First Class       326*849/1317 = 210      326*468/1317 = 116      326
Second Class      285*849/1317 = 184      285*468/1317 = 101      285
Third Class       706*849/1317 = 455      706*468/1317 = 251      706
Totals            849                     468                     1317
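To find an expected count, start from the overall rate: 849 out of 1317 total passengers were rescued (64.46%). If survival were independent of ticket type, 849/1317 or 64.46% of the 326 first class passengers (about 210) should have been rescued. This logic follows for each cell in the table: expected count = row total × column total / grand total.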
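The expected-count logic can be sketched in Python as follows. The function name `expected_counts` and the nested-list layout are assumptions made for this illustration; the numbers are the observed Titanic counts from the table.

```python
def expected_counts(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: First, Second, Third class. Columns: Rescued, Died.
observed = [[203, 123], [118, 167], [528, 178]]
expected = expected_counts(observed)

# Print each row rounded to whole passengers, as in the Expected table.
for row in expected:
    print([round(e) for e in row])
```

Rounded to whole numbers, the rows should match the Expected table: 210 and 116, 184 and 101, 455 and 251.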
The Titanic Calculations
• Then, do the sum of $\frac{(O - E)^2}{E}$ over every cell, just like with the Goodness of Fit Test:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
• Our degrees of freedom are:
$$df = (\text{rows} - 1)(\text{columns} - 1) = (3 - 1)(2 - 1) = 2$$
• Finally, use chi-square cdf: χ²cdf(99.69, 99999, 2)
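As a cross-check on the hand calculation, here is a minimal Python sketch that sums (O − E)²/E over every cell and computes the degrees of freedom. The function name `chi_square_statistic` is an assumption for this example; the expected counts are computed unrounded inside the loop.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count for this cell, assuming independence.
            e = row_totals[i] * col_totals[j] / grand_total
            # This cell's contribution: (O - E)^2 / E.
            statistic += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Rows: First, Second, Third class. Columns: Rescued, Died.
observed = [[203, 123], [118, 167], [528, 178]]
statistic, df = chi_square_statistic(observed)
print(round(statistic, 2), df)
```

For the Titanic table this should give a statistic of about 99.69 with 2 degrees of freedom, the same values passed to the chi-square cdf.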
The Titanic Calculations
• Using the calculator
• First go to the Matrix menu (2nd, x⁻¹)
• Go to Edit and press Enter
• Enter the number of rows × columns
– Your matrix should fit the look of your table
• Enter the data
– Make the calculator match the table
• Then go to your stats tests and choose the chi-square test
The Titanic Calculations
• Using the calculator
• Since you entered the data into matrix [A], you can just go right to:
– Calculate
– Draw
• Leave the expected matrix alone, as the calculator will calculate those values for you (see next slide)
The Titanic Calculations
• Using the calculator
• Let’s go check out the expected table
– Go back to the Matrix menu
– Edit [B] to see the values
• How cool is that!
The Titanic Calculations
• Conclusions
– The p-value represents the chance of seeing data at least this far from the expected counts, given the variables are independent.
– For the Titanic, this was a 0.00000000000000000002% chance
– REJECT THE NULL!
– There is a ton of evidence to suggest that there is an association between survival and the type of ticket.
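The size of that p-value can be checked with a short Python sketch. It relies on a special case: for exactly 2 degrees of freedom, the area to the right of x under the chi-squared curve is exp(−x/2). Other degrees of freedom need a general chi-squared routine. The function name and the significance level `alpha = 0.05` are assumptions for this example; the slides do not state a level.

```python
import math


def chi_square_upper_tail_df2(statistic):
    """Upper-tail area of a chi-squared distribution with df = 2.

    Valid only for 2 degrees of freedom, where the area to the right
    of x is exactly exp(-x / 2).
    """
    return math.exp(-statistic / 2)


statistic = 99.69  # chi-square statistic from the Titanic table
alpha = 0.05       # assumed significance level (not given in the slides)

p_value = chi_square_upper_tail_df2(statistic)
print(f"p-value = {p_value:.2e} ({p_value * 100:.2e}%)")
print("reject the null" if p_value < alpha else "fail to reject the null")
```

The result is on the order of 2 × 10⁻²² as a probability, or 2 × 10⁻²⁰ as a percentage, which is in line with the tiny percentage quoted in the conclusions.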
Chi-square Test of Independence
This concludes this presentation.