PowerPoint for Section 2.8

Assignments 2.5-2.6
Due Friday 14, 2014
Assignments 2.7-2.8
Chapter 2 Review Quiz
Chapter 2 Review Homework*
Due Wednesday 19, 2014
Unit 1 Test (Over Ch. 1 and Ch. 2)
Thursday 20, 2014
Sta220 - Statistics
Mr. Smith
Room 310
Class #6
Section 2.8
Lesson Objectives
You will be able to:
1. Determine and interpret the interquartile
range (2.8)
2. Draw and interpret Boxplots (2.8)
3. Check a set of data for outliers (2.8)
Lesson Objective #1: Determine and interpret
the interquartile range (2.8)
An observation that is unusually large or small
relative to the data values we want to describe
is called an outlier.
Outliers typically are attributable to one of
several causes. Here are a few examples:
– The measurement is observed, recorded, or
entered into the computer incorrectly.
– The measurement comes from a different
population.
– The measurement is correct, but represents a rare
(chance) event.
Two useful methods for detecting outliers, one
graphical and one numerical, are box plots and
z-scores.
A box plot is based on the quartiles of a data set,
specifically on the interquartile range (IQR), the
distance between the lower and upper quartiles:
IQR = Q3 − Q1
The IQR measures the spread of the ‘middle’ 50% of
the observations, which fall between these two quartiles.
PREVIOUS EXAMPLE
Determining and Interpreting the
Interquartile Range
Determine and interpret the interquartile range of the
speed data.
Q1 = 28
Q3 = 38
IQR = Q3 − Q1
    = 38 − 28
    = 10
The range of the middle 50% of the speeds of cars
traveling through the construction zone is 10 miles per
hour.
Lesson Objective #2: Draw and interpret Boxplots
(2.8)
Once we have the IQR, we can construct two sets
of limits, called the inner fences and the outer
fences. The inner fences are called the whiskers
of the boxplot: vertical lines extending away from the box.
Values that are beyond the inner fences are
deemed potential outliers.
As for the outer fences, they are called
imaginary fences and are marked with *.
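The slides do not state how far the fences sit from the box. A minimal sketch, assuming the common textbook convention of 1.5 × IQR for the inner fences and 3 × IQR for the outer fences (both multipliers are assumptions, as is the helper name `fences`), applied to the speed quartiles Q1 = 28 and Q3 = 38 from the previous example:

```python
def fences(q1, q3, inner_k=1.5, outer_k=3.0):
    """Return ((lower_inner, upper_inner), (lower_outer, upper_outer)).

    inner_k and outer_k are assumed multipliers (a common convention),
    not values given in the slides.
    """
    iqr = q3 - q1  # interquartile range
    inner = (q1 - inner_k * iqr, q3 + inner_k * iqr)
    outer = (q1 - outer_k * iqr, q3 + outer_k * iqr)
    return inner, outer


# Quartiles of the speed data from the previous example
inner, outer = fences(28, 38)
print(inner)  # (13.0, 53.0)
print(outer)  # (-2.0, 68.0)
```

Under these assumptions, a speed below 13 or above 53 miles per hour would be flagged as a potential outlier.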
Reasons for a Boxplot
• Helps identify outliers in a data set
• Helps give evidence for the shape of a
data set
• Useful when comparing multiple data
sets
Boxplots to Determine Shape
StatCrunch
Consider the following horizontal box plot:
a. What is the median of the data set (approximately)? 4
b. What are the lower and upper quartiles of the data set (approximately)? 3; 6
c. What is the interquartile range of the data set (approximately)? IQR = 3
d. Is the data set skewed to the left, skewed to the right, or symmetric? Right
e. What percentage of the measurements in the data set lie to the right of the
median? To the left of the upper quartile? 50%; 75%
f. Identify any outliers in the data. 12; 13; 16
EXAMPLE
Obtaining the Five-Number Summary
Every six months, the United States Federal Reserve
Board conducts a survey of credit card plans in the U.S.
The following data are the interest rates charged by 10
credit card issuers randomly selected for the July 2005
survey. Determine the five-number summary of the data.
Institution                               Rate
Pulaski Bank and Trust Company            6.5%
Rainier Pacific Savings Bank              12.0%
Wells Fargo Bank NA                       14.4%
Firstbank of Colorado                     14.4%
Lafayette Ambassador Bank                 14.3%
Infibank                                  13.0%
United Bank, Inc.                         13.3%
First National Bank of The Mid-Cities     13.9%
Bank of Louisiana                         9.9%
Bar Harbor Bank and Trust Company         14.5%

The smallest number is 6.5%. The largest number is
14.5%. The first quartile is 12.0%. The second quartile
is 13.6%. The third quartile is 14.4%.
Five-number Summary:
6.5% 12.0% 13.6% 14.4% 14.5%
Source:
http://www.federalreserve.gov/pubs/SHOP/survey.htm
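The five-number summary above can be checked with a short sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data (one common convention; statistical software may use a different rule and give slightly different quartiles). The function name `five_number_summary` is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd number of observations, leave the median out of both halves
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Interest rates (%) for the 10 credit card issuers in the table
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]
summary = five_number_summary(rates)
print([round(x, 1) for x in summary])  # [6.5, 12.0, 13.6, 14.4, 14.5]
```

The values are rounded to one decimal place for display only, since averaging 13.3 and 13.9 in floating point may not give exactly 13.6.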
Comparing Groups
• Boxplots offer an ideal balance of information and simplicity,
hiding the details while displaying the overall summary
information.
• We often plot them side by side for groups or categories we
wish to compare.
Lesson Objective #3: Check a set of data for
outliers (2.8)
Copyright © 2013 Pearson Education, Inc. All
rights reserved.
Example:
Suppose a female bank employee believes that
her salary is low as a result of sex discrimination.
To substantiate her belief, she collects
information on the salaries of her male
counterparts in the banking business. She finds
that their salaries have a mean of $64,000 and a
standard deviation of $2,000. Her salary is
$57,000. Does this information support her
claim of sex discrimination?
Solution
First, calculate the z-score for the woman’s salary
with respect to those of her male counterparts.
Thus,
z = ($57,000 − $64,000) / $2,000 = −3.5
The implication is that the woman’s salary is 3.5
standard deviations below the mean of the
male salary distribution.
Clearly, a z-score of −3.5 represents an outlier.
Either her salary is from a distribution different
from the male salary distribution, or it is a very
unusual (highly improbable) measurement from
a salary distribution no different from the male
distribution.
Statistical thinking would lead us to conclude
that her salary does not come from the male
salary distribution, lending support to the
female bank employee's claim of sex
discrimination.
However, the careful investigator should require
more information before inferring that sex
discrimination is the cause.
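The z-score calculation in the solution above can be sketched as follows. The cutoff of 3 standard deviations used to flag an outlier is an assumption (a common rule of thumb); the slides state only that −3.5 clearly represents an outlier. The function name `z_score` is illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


# Her salary compared with the male salary distribution from the example
z = z_score(57_000, 64_000, 2_000)
print(z)  # -3.5

# Assumed rule of thumb: |z| greater than 3 flags an outlier
print(abs(z) > 3)  # True
```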
