Review Topic 6 PowerPoint I

IB Math Studies – Topic 6
IB Course Guide Description
Continuous and Discrete Data
 Continuous: numerical data that can take any value within
a continuous range; it is measured.
 Discrete: numerical data that takes whole-number values;
it is counted.
Presenting and Interpreting Data
Stem-and-leaf Plots
 A stem-and-leaf plot, sometimes called a
stemplot, is an easy way of
writing down the data in
 Used for small data sets
 For numbers with two digits, the
first digit forms part of the stem
and the second digit forms a
leaf
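The stem and leaf rule for two-digit numbers can be sketched in a few lines of Python. The function name `stem_and_leaf` and the scores are made up for illustration; this is one possible way to build the plot, not a required method.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit whole numbers: tens digit is the stem, units digit is the leaf."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 38 -> stem 3, leaf 8
        plot[stem].append(leaf)
    return dict(plot)

scores = [38, 23, 47, 25, 31, 38, 42, 49, 50]   # hypothetical data
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row shows one stem followed by its leaves in increasing order.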
Frequency Polygon
• A line graph, utilized much like a histogram, that gives a visual
 appreciation of the shape of the frequency distribution.
• The midpoint of each bar is used to represent the whole interval.
• Lines are then drawn between these midpoints.
 A histogram is a vertical column graph used to represent
continuous grouped data.
 There are no gaps between the columns in a histogram, as the
data is continuous.
Box and Whisker Plot
 A box-and-whisker plot is a visual display of some of
the descriptive statistics of a data set.
• Outliers are extreme data values that
are usually separated from the main
body of the data.
• The upper boundary = upper
quartile + 1.5 × IQR
• The lower boundary = lower
quartile – 1.5 × IQR
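As a quick check of the two boundary formulas, here is a minimal Python sketch. The function names and the data are hypothetical, and the quartiles are assumed to have been found already.

```python
def outlier_boundaries(q1, q3):
    """Return (lower boundary, upper boundary) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values lying outside the two boundaries."""
    lower, upper = outlier_boundaries(q1, q3)
    return [v for v in values if v < lower or v > upper]

data = [2, 3, 4, 5, 6, 7, 8, 30]         # made-up data with Q1 = 3.5 and Q3 = 7.5
print(outlier_boundaries(3.5, 7.5))      # (-2.5, 13.5)
print(find_outliers(data, 3.5, 7.5))     # [30]
```

Any value below the lower boundary or above the upper boundary is treated as an outlier, so 30 is flagged here.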
Summarizing the Data
• Mean: is the arithmetic
average obtained by adding
all the scores and dividing by
the total number of scores.
• Mode: is the score that
occurs most frequently.
• Median: is the middle score
after the scores have been placed
in order.
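The three measures can be checked with Python's standard `statistics` module. The scores below are made up for illustration.

```python
from statistics import mean, median, multimode

scores = [3, 5, 5, 6, 8, 9]      # hypothetical scores, already in order

print(mean(scores))              # sum 36 divided by 6 scores
print(median(scores))            # average of the two middle scores, 5 and 6
print(multimode(scores))         # list of the most frequent score(s)
```

`multimode` returns a list because a data set can have more than one mode.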
Grouped Discrete Data
Grouped Continuous Data
Measures of Dispersion
 Range: is the difference between the maximum data value
and the minimum data value.
Range = maximum data value – minimum data value
 Interquartile Range: is the range of the middle half (50%) of
the data.
 The data set is divided into quarters by the lower
quartile (Q1), the median (Q2) and the upper quartile (Q3).
IQR = Q3 – Q1
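A short Python sketch of the range and interquartile range follows. It assumes the "median of each half" convention for quartiles (the middle value is left out of both halves when the number of values is odd); other conventions can give slightly different quartiles. The helper name `quartiles` and the data are made up, and the function expects at least two values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2                      # size of each half
    lower = data[:half]                        # values below the median position
    upper = data[len(data) - half:]            # values above the median position
    return median(lower), median(data), median(upper)

data = [1, 3, 4, 6, 7, 9, 12]                  # hypothetical data
q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)             # maximum minus minimum
iqr = q3 - q1                                  # spread of the middle half
print(data_range, q1, q2, q3, iqr)             # 11 3 6 9 6
```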
Standard Deviation
 Standard Deviation: measures the deviation between
scores and the mean.
 Ungrouped Data
 Grouped Discrete Data
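The formulas for these two cases are not reproduced here, so the sketch below is an assumption: it uses the population form, dividing by the total number of scores n, and weights each value by its frequency for grouped discrete data. The function name `std_dev` and the data are hypothetical.

```python
from math import sqrt

def std_dev(values, freqs=None):
    """Population standard deviation; freqs gives the frequency of each value (default 1)."""
    if freqs is None:
        freqs = [1] * len(values)              # ungrouped data: every score counted once
    n = sum(freqs)                             # total number of scores
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return sqrt(variance)

# Ungrouped data
print(std_dev([2, 4, 4, 4, 5, 5, 7, 9]))             # 2.0

# The same data as a frequency table (value, frequency)
print(std_dev([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))     # 2.0
```

Both calls describe the same eight scores, so they give the same result.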
• A correlation refers to the relationship or association between two variables
Line of Best Fit
 A scatter diagram indicates the relationship between two variables.
 If there is a relationship, we can draw in the “line of best fit”
• Drawing a Line of Best Fit
• Calculate the mean of the x values, $\bar{x}$, and the
mean of the y values, $\bar{y}$
• Mark the mean point $(\bar{x}, \bar{y})$ on
the scatter plot
• Draw a line through the mean point
that passes through the middle of the data
• Aim for an equal number of points above
and below the line
Regression Line
 The line of best fit on a scatter diagram is called a “regression line”
and it can be calculated from the data pairs.
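$$y - \bar{y} = \frac{s_{xy}}{s_x^2}\,(x - \bar{x})$$
where $s_{xy}$ is the covariance of x and y, and $s_x^2$ is the variance of x.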
• The regression line is used for prediction purposes.
• The regression line is less reliable when extended far beyond
the region of the data.
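A minimal Python sketch of calculating a regression line from data pairs and using it for prediction. It assumes the usual least-squares form, where the gradient is the covariance of x and y divided by the variance of x and the line passes through the mean point. The function name and the data pairs are made up.

```python
def regression_line(xs, ys):
    """Return (gradient, y-intercept) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n   # covariance
    s_x2 = sum((x - x_bar) ** 2 for x in xs) / n                         # variance of x
    gradient = s_xy / s_x2
    intercept = y_bar - gradient * x_bar      # forces the line through the mean point
    return gradient, intercept

xs = [1, 2, 3, 4, 5]                          # hypothetical data pairs
ys = [2, 4, 5, 4, 5]
m, c = regression_line(xs, ys)
print(f"y = {m:.2f}x + {c:.2f}")              # y = 0.60x + 2.20
print(f"predicted y at x = 3.5: {m * 3.5 + c:.2f}")
```

The prediction at x = 3.5 lies inside the range of the data; a prediction at, say, x = 50 would be an extrapolation and much less reliable.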
Correlation Coefficient
• -1 indicates perfect
negative correlation.
• 0 indicates no
correlation.
• +1 indicates perfect
positive correlation.
• 0.25 ≤ r < 0.5 = weak
correlation
• 0.5 ≤ r < 0.75 =
moderate correlation
• 0.75 ≤ r < 1 = strong
correlation
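The sketch below computes Pearson's correlation coefficient r and labels its strength using the bands listed above. Two assumptions are made: the bands are applied to the size of r (so negative values are classified the same way), and anything below 0.25 is labelled "very weak or none", which the list does not state. The function names and data are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson's r: sum of products of deviations over the root of the summed squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

def strength(r):
    """Label the size of r using the 0.25 / 0.5 / 0.75 bands."""
    size = abs(r)
    if size >= 0.75:
        return "strong"
    if size >= 0.5:
        return "moderate"
    if size >= 0.25:
        return "weak"
    return "very weak or none"                 # assumed label for |r| < 0.25

r = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # made-up data pairs
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.3f}: {strength(r)} {direction} correlation")   # r = 0.775: strong positive correlation
```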
The Chi-Squared Test
 How many people are in the
 How many males?
 How many females?
 This is called a 2 × 2
contingency table.
1) Write the null hypothesis (H0) and the alternate
hypothesis (H1).
2) Create contingency tables for observed and
expected values.
3) Calculate the chi-square statistic and degrees
of freedom.
4) Find the chi-squared critical value (booklet).
Depends on the level of significance (p) and the degrees
of freedom (v).
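5) Determine whether or not to accept the null
hypothesis.
$$\chi^2_{calc} = \sum \frac{(f_{obs} - f_{exp})^2}{f_{exp}}$$
where $f_{obs}$ are the observed frequencies and $f_{exp}$ are the expected frequencies.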
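The steps above can be sketched in Python for a small contingency table. The observed counts are made up, and the expected values are assumed to come from the usual rule (row total × column total ÷ grand total). The critical value 3.841 is the standard table value for a 5% significance level with 1 degree of freedom.

```python
def chi_squared(observed):
    """Return (expected table, chi-squared statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected frequency for each cell: row total x column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df

observed = [[20, 30],      # hypothetical 2 x 2 table, e.g. males
            [30, 20]]      # e.g. females
expected, statistic, df = chi_squared(observed)
print(expected)            # [[25.0, 25.0], [25.0, 25.0]]
print(statistic, df)       # 4.0 1

critical_value = 3.841     # 5% significance level, 1 degree of freedom
if statistic > critical_value:
    print("Reject H0: the variables appear to be dependent.")
else:
    print("Accept H0: the variables appear to be independent.")
```

Here the statistic 4.0 exceeds 3.841, so this made-up table would lead to rejecting the null hypothesis at the 5% level.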
On the calculator:
Put your contingency table in matrix A
 C: χ²-Test
 Observed: [A]
 Expected: [B] (this is where the expected values will be stored)
 Calculate
