Assignments 2.5-2.6
Due Friday 14, 2014
Assignments 2.7-2.8
Chapter 2 Review Quiz
Chapter 2 Review Homework*
Due Wednesday 19, 2014
Unit 1 Test (Over Ch. 1 and Ch. 2)
Thursday 20, 2014

Sta220 - Statistics
Mr. Smith
Room 310
Class #6
Section 2.8

Lesson Objectives
You will be able to:
1. Determine and interpret the interquartile range (2.8)
2. Draw and interpret boxplots (2.8)
3. Check a set of data for outliers (2.8)

Lesson Objective #1: Determine and interpret the interquartile range (2.8)

An observation that is unusually large or small relative to the data values we want to describe is called an outlier. Outliers typically are attributable to one of several causes. Here are a few examples:
– The measurement is observed, recorded, or entered into the computer incorrectly.
– The measurement comes from a different population.
– The measurement is correct, but represents a rare (chance) event.

Two useful methods for detecting outliers, one graphical and one numerical, are boxplots and z-scores.

The boxplot is based on the quartiles of a data set, specifically on the interquartile range (IQR), the distance between the lower and upper quartiles:
IQR = Q3 − Q1
The IQR tells us that the middle 50% of the observations fall between these two quartiles.

PREVIOUS EXAMPLE: Determining and Interpreting the Interquartile Range
Determine and interpret the interquartile range of the speed data.
Q1 = 28, Q3 = 38
IQR = Q3 − Q1 = 38 − 28 = 10
The range of the middle 50% of the speeds of cars traveling through the construction zone is 10 miles per hour.

Lesson Objective #2: Draw and interpret boxplots (2.8)

Once we have the IQR, we can construct two sets of limits, called the inner fences and the outer fences. The whiskers of the boxplot are lines that extend away from the box toward the inner fences. Values that are beyond the inner fences are deemed potential outliers. The outer fences are called imaginary fences and are marked with *.
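The slides do not say how far the fences sit from the box. As a minimal sketch, assuming the common textbook convention of 1.5 × IQR for the inner fences and 3 × IQR for the outer fences, the limits for the speed data (Q1 = 28, Q3 = 38) could be computed as follows. The function name and the multiplier parameters are hypothetical, chosen only for illustration.

```python
def iqr_and_fences(q1, q3, inner_mult=1.5, outer_mult=3.0):
    """Return the IQR plus the (lower, upper) inner and outer fences.

    The 1.5 and 3.0 multipliers are an assumed textbook convention,
    not values stated in these slides.
    """
    iqr = q3 - q1
    inner = (q1 - inner_mult * iqr, q3 + inner_mult * iqr)
    outer = (q1 - outer_mult * iqr, q3 + outer_mult * iqr)
    return iqr, inner, outer


# Quartiles from the speed-data example
iqr, inner, outer = iqr_and_fences(28, 38)
print(iqr)    # 10
print(inner)  # (13.0, 53.0)
print(outer)  # (-2.0, 68.0)
```

Under these assumed multipliers, a speed below 13 or above 53 miles per hour would be flagged as a potential outlier.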
Reasons for a Boxplot
• Helps identify outliers in a data set
• Helps give evidence for the shape of a data set
• Useful when comparing multiple data sets

Boxplots to Determine Shape

StatCrunch
Consider the following horizontal box plot:
a. What is the median of the data set (approximately)? 4
b. What are the upper and lower quartiles of the data set (approximately)? Lower: 3; upper: 6
c. What is the interquartile range of the data set (approximately)? IQR = 3
d. Is the data set skewed to the left, skewed to the right, or symmetric? Skewed to the right
e. What percentage of the measurements in the data set lie to the right of the median? To the left of the upper quartile? 50%; 75%
f. Identify any outliers in the data. 12; 13; 16

EXAMPLE: Obtaining the Five-Number Summary
Every six months, the United States Federal Reserve Board conducts a survey of credit card plans in the U.S. The following data are the interest rates charged by 10 credit card issuers randomly selected for the July 2005 survey. Determine the five-number summary of the data.

Institution                              Rate
Pulaski Bank and Trust Company           6.5%
Rainier Pacific Savings Bank             12.0%
Wells Fargo Bank NA                      14.4%
Firstbank of Colorado                    14.4%
Lafayette Ambassador Bank                14.3%
Infibank                                 13.0%
United Bank, Inc.                        13.3%
First National Bank of The Mid-Cities    13.9%
Bank of Louisiana                        9.9%
Bar Harbor Bank and Trust Company        14.5%

Sorted from smallest to largest, the rates are 6.5, 9.9, 12.0, 13.0, 13.3, 13.9, 14.3, 14.4, 14.4, 14.5.
The smallest number is 6.5%. The largest number is 14.5%. The first quartile is 12.0%. The second quartile (the median, the average of 13.3 and 13.9) is 13.6%. The third quartile is 14.4%.
Five-number summary: 6.5%, 12.0%, 13.6%, 14.4%, 14.5%
Source: http://www.federalreserve.gov/pubs/SHOP/survey.htm

Comparing Groups
• Boxplots offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information.
• We often plot them side by side for groups or categories we wish to compare.

Lesson Objective #3: Check a set of data for outliers (2.8)

Copyright © 2013 Pearson Education, Inc.
All rights reserved.

Example: Suppose a female bank employee believes that her salary is low as a result of sex discrimination. To substantiate her belief, she collects information on the salaries of her male counterparts in the banking business. She finds that their salaries have a mean of $64,000 and a standard deviation of $2,000. Her salary is $57,000. Does this information support her claim of sex discrimination?

Solution
First, calculate the z-score for the woman's salary with respect to those of her male counterparts:
z = ($57,000 − $64,000) / $2,000 = −3.5
The implication is that the woman's salary is 3.5 standard deviations below the mean of the male salary distribution.

Clearly, a z-score of −3.5 represents an outlier. Either her salary is from a distribution different from the male salary distribution, or it is a very unusual (highly improbable) measurement from a salary distribution no different from the male distribution. Statistical thinking would lead us to conclude that her salary does not come from the male salary distribution, lending support to the female bank employee's claim of sex discrimination. However, the careful investigator should require more information before inferring that sex discrimination is the cause.
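The z-score arithmetic in this example can be checked with a short sketch. The function name is hypothetical, and the cutoff of 3 standard deviations used to flag an outlier is an assumed rule of thumb rather than a threshold stated in these slides.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


# Values from the bank salary example
z = z_score(57_000, 64_000, 2_000)
print(z)  # -3.5

# Assumed rule of thumb: |z| greater than 3 suggests an outlier
is_outlier = abs(z) > 3
print(is_outlier)  # True
```

A flag like this only says the salary is unusual relative to the male salary distribution; as the solution notes, it does not by itself establish the cause.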