test statistic

```BS704 Class 7
Hypothesis Testing Procedures
HW Set #6
Chapter 7
Problems 4, 7, 8, 20 and 25
R Problem Set 6 (on Blackboard)
Due October 26
Please complete Quiz 8 Before Oct 26
Objectives
- Define null and research hypotheses, test statistic,
  level of significance and decision rule
- Understand Type I and Type II errors
- Differentiate hypothesis testing procedures based on
  type of outcome variable and number of samples
Hypothesis Testing
- Research hypothesis is generated
- Sample data are analyzed and determined to support
  or refute the research hypothesis
Example: A large, national study was conducted
in 2007 and found that the mean systolic blood
pressure for males aged 50 was 130. In 2008,
an investigator hypothesizes systolic blood
pressures have increased.
To test this hypothesis we set up two competing
hypotheses
Null:     H0: μ = 130
Research: H1: μ > 130
To test the hypotheses a random sample is
selected from the population of interest.
Suppose a sample of n=108 males age 50
in 2008 is selected and their systolic
blood pressures are analyzed. Of
primary interest is the mean systolic
blood pressure in the sample (X̄).
If the sample mean is 130, which is
more likely true?
1. H0
2. H1
If the sample mean is 150, which is
more likely true?
1. H0
2. H1
If the sample mean is 135, which is
more likely true?
1. H0
2. H1
We must determine a critical value such
that if our sample mean is less than the
critical value we will conclude that H0 is
true (i.e., μ = 130), and if our sample
mean is greater than the critical value
we will conclude that H1 is true (i.e., μ >
130).
Instead of determining critical values for
the sample mean, which would be specific to
each application (since they depend on the
unit of measurement), we appeal to the
Central Limit Theorem.
For large n, X̄ is approximately normally
distributed. Assuming that H0 is true (or
"under the null hypothesis") we can
standardize X̄, producing a Z score (test
statistic):
Z = (X̄ - μ0) / (s/√n)
If X̄ ≈ 130 then Z ≈ 0 → H0 probably true.
If X̄ > 130 then Z > 0 → H1 probably true.
What value of Z is considered “large” ?
In the example, suppose that s = 15 and n = 108.
We must select a level of significance,
denoted α, which is defined as the
probability of rejecting H0 when H0 is
true.
The level of significance is generally in
the range of 0.01 to 0.10.
Once a level of significance is selected, a
decision rule is formulated.
Decision rule (upper-tailed test, α = 0.05):
Reject H0 if Z > 1.645
Do Not Reject H0 if Z ≤ 1.645
Once the decision rule is in place, we compute the
value of the test statistic.
Suppose that in the example, X̄ = 135.
Z = (X̄ - μ0) / (s/√n) = (135 - 130) / (15/√108) = 3.46
The final step - draw a conclusion.
The test statistic falls in the rejection
region - we reject H0 (3.46 > 1.645).
We have significant evidence, at α =
0.05, to show that the mean systolic
blood pressure for males aged 50 in
2008 has increased from 130.
Hypothesis Testing Procedures
1. Set up null and research hypotheses, select α
2. Select test statistic
3. Set up decision rule
4. Compute test statistic
5. Draw conclusion & summarize significance (p-value)
P-values
- P-values represent the exact significance of the data
- Estimate p-values when rejecting H0 to summarize
  significance of the data (can approximate with
  statistical tables, can get exact value with a
  statistical computing package)
- P-value is the smallest α where we still reject H0
Hypothesis Testing for μ
- Continuous outcome
- 1 Sample
H0: μ = μ0
H1: μ > μ0, μ < μ0, or μ ≠ μ0
Test Statistic
n > 30:  Z = (X̄ - μ0) / (s/√n)              (find critical value in Table 1C)
n < 30:  t = (X̄ - μ0) / (s/√n), df = n - 1  (find critical value in Table 2)
Hypothesis Testing for μ
The National Center for Health Statistics
(NCHS) reports the mean total
cholesterol for adults is 203. Is the
mean total cholesterol in Framingham
Heart Study participants significantly
different?
In 3310 participants the mean is 200.3
with a standard deviation of 36.8.
1. H0: μ = 203
   H1: μ ≠ 203
   α = 0.05
2. Test statistic
   Z = (X̄ - μ0) / (s/√n)
3. Decision rule
   Reject H0 if Z > 1.96 or if Z < -1.96
4. Compute test statistic
   Z = (X̄ - μ0) / (s/√n) = (200.3 - 203) / (36.8/√3310) = -4.22
5. Conclusion. Reject H0 because -4.22 < -1.96. We have
   statistically significant evidence at α = 0.05 to show
   that the mean total cholesterol is different in the
   Framingham Heart Study participants.
Significance of the findings: Z = -4.22.
Table 1C. Critical Values for Two-Sided Tests
α         Z
0.20      1.282
0.10      1.645
0.05      1.960
0.010     2.576
0.001     3.291
0.0001    3.891
Since |Z| = 4.22 exceeds 3.891, p < 0.0001.
Interpreting P-Values
If p < α then reject H0
Errors in Hypothesis Tests
              Conclusion of Statistical Test
              Do Not Reject H0    Reject H0
H0 true       Correct             Type I error
H0 false      Type II error       Correct
Practice Example – Is social
networking a health risk?
Hypertexting (>120 text messages per
day) has been associated with health
risks (Frank et al, Nov 2010). In 2010,
the mean number of texts per day was
55. Is texting increasing in 2012? A
sample of 75 teens report sending a
mean of 61 texts per day (SD = 15). Is
there evidence of an increase in texting
in 2012?
1. H0: μ = 55
   H1: μ > 55
   α = 0.05
2. Test statistic
   Z = (X̄ - μ0) / (s/√n)
3. Decision rule
   Reject H0 if Z > 1.645
4. Compute test statistic
   Z = (61 - 55) / (15/√75) = 3.46
5. Reject H0 (3.46 > 1.645).
In an upper-tailed test with α = 0.05,
if Z = -2.5 would you reject H0: μ = 50?
1. Yes
2. No
We run a test and do not reject H0.
Which is most likely…
1. We made the correct decision
2. We committed a Type I error
3. We committed a Type II error
4. 1 or 2
5. 1 or 3
New Scenario
- Outcome is dichotomous (p = population proportion)
  - Result of surgery (success, failure)
  - Cancer remission (yes/no)
- One study sample
- Data
  - On each participant, measure outcome (yes/no)
  - n, x = # positive responses, p̂ = x/n
Hypothesis Testing for p
- Dichotomous outcome
- 1 Sample
H0: p = p0
H1: p > p0, p < p0, or p ≠ p0
Test Statistic (requires min[n·p0, n(1 - p0)] ≥ 5)
Z = (p̂ - p0) / √(p0(1 - p0)/n)
(Find critical value in Table 1C)
Hypothesis Testing for p
The NCHS reports that the
prevalence of cigarette smoking
among adults in 2002 is 21.1%. Is
the prevalence of smoking lower
among participants in the
Framingham Heart Study?
In 3536 participants, 482 reported
smoking (482/3536=0.136).
1. H0: p = 0.211
   H1: p < 0.211
   α = 0.05
2. Test statistic
   Z = (p̂ - p0) / √(p0(1 - p0)/n)
3. Decision rule
   Reject H0 if Z < -1.645
4. Compute test statistic
   Z = (0.136 - 0.211) / √(0.211(1 - 0.211)/3536) = -10.93
5. Conclusion. Reject H0 because -10.93 < -1.645.
   We have statistically significant evidence at
   α = 0.05 to show that the prevalence of smoking
   is lower among the Framingham Heart Study
   participants (p < 0.0001).
Practice Example – Is social
networking a health risk?
Hypertexting (>120 text messages per
day) has been associated with health
risks (Frank et al, Nov 2010). In 2010,
19% of teens were hypertexting. Is
hypertexting increasing in 2012? In a
sample of 75 teens, 16 report sending
more than 120 texts per day. Is hypertexting increasing in 2012?
1. H0: p = 0.19
   H1: p > 0.19
   α = 0.05
   Sample data: n = 75, x = 16, p̂ = 16/75 = 0.21
2. Test statistic
   Z = (p̂ - p0) / √(p0(1 - p0)/n)
3. Decision rule
   Reject H0 if Z > 1.645
4. Compute test statistic
   Z = (0.21 - 0.19) / √(0.19(1 - 0.19)/75) = 0.44
5. Do not reject H0 (0.44 < 1.645).
New Scenario
- Outcome is continuous
  - SBP, Weight, cholesterol
- Two independent study samples
- Data
  - On each participant, identify group and
    measure outcome
  - n1, X̄1, s1 (or s1²), n2, X̄2, s2 (or s2²)
Two Independent Samples
RCT: Set of Subjects Who Meet Study Eligibility Criteria
  → Randomize
      → Treatment 1 → Mean Trt 1
      → Treatment 2 → Mean Trt 2
Cohort Study: Set of Subjects Who Meet Study Inclusion Criteria
      → Group 1 → Mean Group 1
      → Group 2 → Mean Group 2
Hypothesis Testing for (μ1 - μ2)
- Continuous outcome
- 2 Independent Samples
H0: μ1 = μ2 (μ1 - μ2 = 0)
H1: μ1 > μ2, μ1 < μ2, or μ1 ≠ μ2
An RCT is planned to show the efficacy of
a new drug vs. placebo to lower total
cholesterol.
What are the hypotheses?
1. H0: μP = μN  H1: μP > μN
2. H0: μP = μN  H1: μP < μN
3. H0: μP = μN  H1: μP ≠ μN
Hypothesis Testing for (μ1 - μ2)
Test Statistic
n1 > 30 and n2 > 30:  Z = (X̄1 - X̄2) / (Sp √(1/n1 + 1/n2))
                      (find critical value in Table 1C)
n1 < 30 or n2 < 30:   t = (X̄1 - X̄2) / (Sp √(1/n1 + 1/n2)), df = n1 + n2 - 2
                      (find critical value in Table 2)
Pooled Estimate of Common Standard Deviation, Sp
- Previous formulas assume equal variances (σ1² = σ2²)
- If 0.5 < s1²/s2² < 2, assumption is reasonable
Sp = √( [(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2) )
Hypothesis Testing for (μ1 - μ2)
A clinical trial is run to assess the
effectiveness of a new drug in lowering
cholesterol. Patients are randomized to
receive the new drug or placebo and
total cholesterol is measured after 6
weeks on the assigned treatment.
Is there evidence of a statistically
significant reduction in cholesterol for
patients on the new drug?
              New Drug   Placebo
Sample Size   15         15
Mean          195.9      227.4
Std Dev       28.7       30.3
1. H0: μ1 = μ2
   H1: μ1 < μ2
   α = 0.05
2. Test statistic
   t = (X̄1 - X̄2) / (Sp √(1/n1 + 1/n2))
3. Decision rule, df = n1 + n2 - 2 = 28
   Reject H0 if t < -1.701
Assess Equality of Variances
- Ratio of sample variances: 28.7²/30.3² = 0.90
Sp = √( [(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2) )
   = √( [(15 - 1)28.7² + (15 - 1)30.3²] / (15 + 15 - 2) )
   = √870.89 = 29.5
4. Compute test statistic
   t = (X̄1 - X̄2) / (Sp √(1/n1 + 1/n2))
     = (195.9 - 227.4) / (29.5 √(1/15 + 1/15)) = -2.92
5. Conclusion. Reject H0 because -2.92 < -1.701.
   We have statistically significant evidence at
   α = 0.05 to show that the mean cholesterol level is
   lower in patients on treatment as compared to
   placebo (p < 0.005).
A two sided test for the equality of
means produces p=0.20. Reject H0?
1. Yes
2. No
3. Maybe
New Scenario
- Outcome is continuous
  - SBP, Weight, cholesterol
- Two matched study samples
- Data
  - On each participant, measure outcome
    under each experimental condition
  - Compute differences (D = X1 - X2)
  - n, X̄d, sd
Two Dependent/Matched Samples
Subject ID   Measure 1   Measure 2
1            55          70
2            42          60
.
.
Measures taken serially in time or under
different experimental conditions
Crossover Trial
Eligible Participants → R → Treatment, then Placebo
                          → Placebo, then Treatment
Each participant measured on Treatment and placebo
Hypothesis Testing for μd
- Continuous outcome
- 2 Matched/Paired Samples
H0: μd = 0
H1: μd > 0, μd < 0, or μd ≠ 0
Test Statistic
n > 30:  Z = (X̄d - μd) / (sd/√n)              (find critical value in Table 1C)
n < 30:  t = (X̄d - μd) / (sd/√n), df = n - 1  (find critical value in Table 2)
Hypothesis Testing for μd
Is there a statistically significant difference
in mean systolic blood pressures (SBPs)
measured at exams 6 and 7 (approximately
4 years apart) in the Framingham Offspring
Study?
Among n=15 randomly selected
participants, the mean difference was -5.3
units and the standard deviation was 12.8
units. Differences were computed by
subtracting the exam 6 value from the
exam 7 value.
1. H0: μd = 0
   H1: μd ≠ 0
   α = 0.05
2. Test statistic
   t = (X̄d - μd) / (sd/√n)
3. Decision rule, df = n - 1 = 14
   Reject H0 if t > 2.145 or if t < -2.145
4. Compute test statistic
   t = (X̄d - μd) / (sd/√n) = (-5.3 - 0) / (12.8/√15) = -1.60
5. Conclusion. Do not reject H0 because -2.145
   < -1.60 < 2.145. We do not have
   statistically significant evidence at α = 0.05 to
   show that there is a difference in systolic
   blood pressures over time.
New Scenario
- Outcome is dichotomous
  - Result of surgery (success, failure)
  - Cancer remission (yes/no)
- Two independent study samples
- Data
  - On each participant, identify group and
    measure outcome (yes/no)
  - n1, p̂1, n2, p̂2
Hypothesis Testing for (p1-p2)
- Dichotomous outcome
- 2 Independent Samples
H0: p1 = p2
H1: p1 > p2, p1 < p2, or p1 ≠ p2
Test Statistic (requires min[n1·p̂1, n1(1 - p̂1), n2·p̂2, n2(1 - p̂2)] ≥ 5)
Z = (p̂1 - p̂2) / √( p̂(1 - p̂)(1/n1 + 1/n2) )
(Find critical value in Table 1C)
Hypothesis Testing for (p1-p2)
Is the prevalence of CVD different in smokers as
compared to nonsmokers in the Framingham
Offspring Study?
                 Free of CVD   History of CVD   Total
Nonsmoker        2757          298              3055
Current smoker   663           81               744
Total            3420          379              3799
1. H0: p1 = p2
   H1: p1 ≠ p2
   α = 0.05
2. Test statistic
   Z = (p̂1 - p̂2) / √( p̂(1 - p̂)(1/n1 + 1/n2) )
3. Decision rule
   Reject H0 if Z < -1.96 or if Z > 1.96
4. Compute test statistic
   p̂1 = 81/744 = 0.1089, p̂2 = 298/3055 = 0.0975
   p̂ = (81 + 298) / (744 + 3055) = 379/3799 = 0.0998
   Z = (p̂1 - p̂2) / √( p̂(1 - p̂)(1/n1 + 1/n2) )
     = (0.1089 - 0.0975) / √( 0.0998(1 - 0.0998)(1/744 + 1/3055) ) = 0.93
5. Conclusion. Do not reject H0 because -1.96
   < 0.93 < 1.96. We do not have statistically
   significant evidence at α = 0.05 to show that
   there is a difference in prevalent CVD
   between smokers and nonsmokers.
Study of Single versus Weekly
Antenatal Corticosteroids
 What do p-values mean in Table 1?
 What do p-values mean in Table 3?
Study of Single versus Weekly
Antenatal Corticosteroids
Did randomization
work?
Primary outcome
Is trial a success?
Practice Problem
Run Test for Primary Outcome: Composite Morbidity
Group         Weekly   Single
Sample Size   256      246
# Events      56       66
Solution
1. H0: p1 = p2
   H1: p1 ≠ p2
   α = 0.05
2. Test statistic
   Z = (p̂1 - p̂2) / √( p̂(1 - p̂)(1/n1 + 1/n2) )
3. Decision rule
   Reject H0 if Z < -1.96 or if Z > 1.96
4. Compute test statistic
   p̂1 = 56/256 = 0.219, p̂2 = 66/246 = 0.268
   p̂ = (56 + 66) / (256 + 246) = 0.243
   Z = (p̂1 - p̂2) / √( p̂(1 - p̂)(1/n1 + 1/n2) )
     = (0.219 - 0.268) / √( 0.243(1 - 0.243)(1/256 + 1/246) ) = -1.294
   (computed from the unrounded proportions)
5. Do not reject H0 since -1.96 < -1.294 < 1.96.
```
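
The test statistics used in the slides can be checked with a few lines of code. The following is a minimal sketch in Python using only the standard library; the function names (`mean_test_statistic`, `pooled_sd`, `pooled_mean_test_statistic`, `one_proportion_z`, `two_proportion_z`, `z_p_value`) are hypothetical helpers chosen for illustration and are not part of the course materials. The standard library has no t distribution, so for small samples the statistic still has to be compared against a critical value from a t table.

```python
from math import sqrt
from statistics import NormalDist


def mean_test_statistic(xbar, mu0, s, n):
    """One-sample (or paired-difference) statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of the common standard deviation, Sp."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_mean_test_statistic(xbar1, s1, n1, xbar2, s2, n2):
    """Two independent samples: (xbar1 - xbar2) / (Sp * sqrt(1/n1 + 1/n2))."""
    sp = pooled_sd(s1, n1, s2, n2)
    return (xbar1 - xbar2) / (sp * sqrt(1 / n1 + 1 / n2))


def one_proportion_z(x, n, p0):
    """One-sample proportion test; needs min(n*p0, n*(1-p0)) >= 5."""
    if min(n * p0, n * (1 - p0)) < 5:
        raise ValueError("sample too small for the normal approximation")
    return (x / n - p0) / sqrt(p0 * (1 - p0) / n)


def two_proportion_z(x1, n1, x2, n2):
    """Two independent proportions, using the pooled proportion p-hat."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))


def z_p_value(z, tail):
    """P-value for a Z statistic; tail is 'upper', 'lower' or 'two-sided'."""
    cdf = NormalDist().cdf
    if tail == "upper":
        return 1 - cdf(z)
    if tail == "lower":
        return cdf(z)
    return 2 * (1 - cdf(abs(z)))


# Systolic blood pressure example: upper-tailed test, reject H0 if Z > 1.645
z = mean_test_statistic(135, 130, 15, 108)
print(round(z, 2), z > 1.645)  # 3.46 True

# New drug vs. placebo: lower-tailed t test, df = 28, reject H0 if t < -1.701
t = pooled_mean_test_statistic(195.9, 28.7, 15, 227.4, 30.3, 15)
print(round(t, 2), t < -1.701)  # -2.92 True

# Weekly vs. single corticosteroids: two-sided test, reject H0 if |Z| > 1.96
z2 = two_proportion_z(56, 256, 66, 246)
print(round(z2, 3), abs(z2) > 1.96)  # -1.294 False
```

Because these helpers work from unrounded proportions, results for the proportion examples can differ slightly from hand calculations that round p̂ first.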