IB Math Studies – Topic 6

IB Course Guide Description

Continuous and Discrete Data
Continuous: Numerical data that takes values within a continuous range and has been measured.
Discrete: Numerical data that takes whole-number values and has been counted.

Presenting and Interpreting Data

Stem-and-leaf Plots
A stem-and-leaf plot, sometimes called a stemplot, is an easy way of writing down the data in groups.
• Used for small data sets.
• For numbers with two digits, the first digit forms part of the stem and the second digit forms a leaf.

Frequency Polygon
• A line graph, used much like a histogram, that gives a visual appreciation of the shape of the frequency distribution.
• The midpoint of each bar is used to represent the whole interval.
• Lines are then drawn between these midpoints.

Histograms
A histogram is a vertical column graph used to represent continuous grouped data. There are no gaps between the columns in a histogram because the data is continuous.

Box and Whisker Plot
A box-and-whisker plot is a visual display of some of the descriptive statistics of a data set.
• Outliers are extraordinary data values that are usually separated from the main body of the data.
• The upper boundary = upper quartile + 1.5 × IQR
• The lower boundary = lower quartile – 1.5 × IQR
• A data value above the upper boundary or below the lower boundary is treated as an outlier.

Summarizing the Data
• Mean: the arithmetic average, obtained by adding all the scores and dividing by the total number of scores.
• Mode: the score that occurs most frequently.
• Median: the middle score after the scores have been placed in order.

Grouped Discrete Data

Grouped Continuous Data

Measure of Dispersion
Range: the difference between the maximum data value and the minimum data value.
Range = maximum data value – minimum data value
Interquartile Range: the range of the middle half (50%) of the data. The data set is divided into quarters by the lower quartile (Q1), the median (Q2) and the upper quartile (Q3).
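The summary measures and outlier boundaries described above can be checked with a short Python sketch. The data set `scores` is made up for illustration, and the function names are hypothetical. One assumption to note: Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data (leaving out the middle value when the number of scores is odd); other quartile conventions exist and can give slightly different values.

```python
from statistics import mean, median, multimode


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the ordered data, excluding the middle value when n is odd.
    """
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


def outlier_boundaries(q1, q3):
    """Return (lower boundary, upper boundary) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Made-up data set, for illustration only.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 20]

minimum, q1, q2, q3, maximum = five_number_summary(scores)
lower, upper = outlier_boundaries(q1, q3)
outliers = [x for x in scores if x < lower or x > upper]

print("mean:", mean(scores))
print("mode(s):", multimode(scores))
print("median:", q2)
print("range:", maximum - minimum)
print("Q1, Q3:", q1, q3)
print("boundaries:", lower, upper)
print("outliers:", outliers)
```

For this made-up data the only value beyond the boundaries is 20, so it would be marked separately from the whisker on a box-and-whisker plot.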
IQR = Q3 – Q1

Standard Deviation
Standard Deviation: measures the deviation between scores and the mean.

Ungrouped Data

Grouped Discrete Data

Correlation
• A correlation refers to the relationship or association between two variables.

Line of Best Fit
A scatter diagram indicates the relationship between two variables. If there is a relationship, we can draw in the “line of best fit”.

Drawing a Line of Best Fit
• Calculate the mean of the x values, $\bar{x}$, and the mean of the y values, $\bar{y}$.
• Mark the mean point $(\bar{x}, \bar{y})$ on the scatter plot.
• Draw a line through the mean point that passes through the middle of the data, with an equal number of points above and below the line.

Regression Line
The line of best fit on a scatter diagram is called a “regression line” and it can be calculated from the data pairs.

$y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$

• The regression line is used for prediction purposes.
• The regression line is less reliable when extended far beyond the region of the data.

Correlation Coefficient
• -1 indicates perfect negative correlation.
• 0 indicates no correlation.
• +1 indicates perfect positive correlation.
• 0.25 ≤ r < 0.5 = weak correlation
• 0.5 ≤ r < 0.75 = moderate correlation
• 0.75 ≤ r < 1 = strong correlation

The Chi-Squared Test
How many people are in the sample? How many males? How many females?
This is called a 2 × 2 contingency table.

1) Write the null hypothesis (H0) and the alternate hypothesis (H1).
2) Create contingency tables for observed and expected values.
3) Calculate the chi-squared statistic and degrees of freedom.
4) Find the chi-squared critical value (booklet).
• Depends on the level of significance (p) and the degrees of freedom (v).
5) Determine whether or not to accept the null hypothesis.

$\chi^2_{calc} = \sum \frac{(f_{obs} - f_{exp})^2}{f_{exp}}$

On the calculator:
• Put your contingency table in matrix A
• STAT, then TESTS, then C: χ2 Test
• Observed: [A]
• Expected: [B] (this is where you want the expected values to go)
• Calculate
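The chi-squared steps above can also be followed by hand or in a short Python sketch. The 2 × 2 table `observed` is made up for illustration, and the function names are hypothetical. The sketch assumes each expected value is (row total × column total) ÷ grand total and that the degrees of freedom are (rows – 1) × (columns – 1). The critical value still has to be read from the booklet; 3.841 is the value for 1 degree of freedom at the 5% level of significance.

```python
def expected_frequencies(observed):
    """Expected table: (row total x column total) / grand total per cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (f_obs - f_exp)^2 / f_exp over every cell of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (f_obs - f_exp) ** 2 / f_exp
        for obs_row, exp_row in zip(observed, expected)
        for f_obs, f_exp in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(number of rows - 1) x (number of columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Made-up observed table, for illustration only:
# rows = male, female; columns = two categories of the other variable.
observed = [[30, 20],
            [20, 30]]

chi2_calc = chi_squared(observed)
df = degrees_of_freedom(observed)
critical_value = 3.841  # from the booklet: 5% significance, 1 degree of freedom

print("expected:", expected_frequencies(observed))
print("chi-squared:", chi2_calc, "degrees of freedom:", df)
if chi2_calc > critical_value:
    print("Reject H0: the variables are not independent.")
else:
    print("Accept H0: the variables are independent.")
```

Every expected value in this made-up table is 25, so each cell contributes (5)² ÷ 25 = 1 to the statistic. The calculator's χ2 Test should report the same statistic and store the same expected table in matrix B.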