Regression Analysis
In regression analysis we analyze the relationship between two or
more variables.
The relationship between two or more variables can be linear or
nonlinear.
This week we first discuss the simplest case, Simple Linear
Regression : Linear Regression Between Two Variables
How can we use available data to investigate such a relationship?
How can we use this relationship to forecast the future?
Our interest is to investigate the relationship between demand
(y) and time (x), but the concept is general. For example,
advertising could be the independent variable and sales the
dependent variable.
Simple Linear Relationship
A linear relationship between two variables is stated as
y = b0 + b1 x
This is the general equation for a line
b0 : The intercept (intersection with the y axis)
b1 : The slope
x : The independent variable
y : The dependent variable
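b1 > 0 : y increases as x increases (upward-sloping line)
b1 < 0 : y decreases as x increases (downward-sloping line)
b1 = 0 : y does not change with x (horizontal line)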
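As a minimal sketch, the line can be evaluated in Python to produce a forecast. The function names and the values b0 = 60 and b1 = 5 are hypothetical, chosen only for illustration.

```python
def predict(b0, b1, x):
    """Return y = b0 + b1 * x, the value of the line at x."""
    return b0 + b1 * x


def slope_direction(b1):
    """Describe how y changes as x increases, based on the sign of b1."""
    if b1 > 0:
        return "increasing"
    if b1 < 0:
        return "decreasing"
    return "constant"


# Hypothetical coefficients, not taken from any example in these notes.
b0, b1 = 60, 5
forecast = predict(b0, b1, x=10)  # 60 + 5 * 10 = 110
print(forecast, slope_direction(b1))
```

With x as time and y as demand, predict gives the demand the line implies for a future period.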
Scatter Diagram
Graphical - Judgmental Solution
SSE : Pictorial Representation
y10 - ŷ10
SST : Pictorial Representation
y10 - ȳ
SST : A measure of how well the observations cluster around ȳ
SSE : A measure of how well the observations cluster around ŷ
If x played no role in the value of y, then we would have
SSE = SST
If x fully determines the value of y, then
SSE = 0
SSR : Sum of the squares due to regression
SSR is explained portion of SST
SSE is unexplained portion of SST
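The three sums of squares can be sketched in Python as follows. The data are made up for illustration. The line is fitted with the usual least-squares formulas, which is an assumption here because these notes do not state how b0 and b1 are obtained. For a least-squares line, SST = SSR + SSE.

```python
def least_squares_line(x, y):
    """Fit y = b0 + b1 * x by least squares (assumed fitting method)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sums_of_squares(x, y, b0, b1):
    """Return (SST, SSE, SSR) for the data and the line y = b0 + b1 * x."""
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)              # total variation around the mean
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained: around the line
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained by the regression
    return sst, sse, ssr


# Hypothetical data, not the Pizza or Auto Sales data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)
sst, sse, ssr = sums_of_squares(x, y, b0, b1)
print(round(sst, 4), round(sse, 4), round(ssr, 4))
```

For this data the fitted line is y = 2.2 + 0.6 x, and SST = 6 splits into SSR = 3.6 and SSE = 2.4, up to floating-point rounding.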
Coefficient of Determination for Goodness of Fit
The largest value for SSE is
SSE = SST =======> SSR = 0
SSR/SST = 0 =====> the worst fit
SSR/SST = 1 =====> the best fit
Coefficient of Determination for Pizza example
In the Pizza example,
SST = 15730
SSE = 1530
SSR = 15730 - 1530 = 14200
r² = SSR/SST : Coefficient of Determination
0 ≤ r² ≤ 1
r² = 14200/15730 = .9027
In other words, about 90% of the variation in y can be explained by the
regression line.
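The arithmetic for the Pizza example can be checked with a short Python sketch. The function name is hypothetical. The SST and SSE values are the ones given above.

```python
def coefficient_of_determination(sst, sse):
    """Return SSR/SST, where SSR = SST - SSE."""
    ssr = sst - sse
    return ssr / sst


# Pizza example: SST = 15730, SSE = 1530.
r2_pizza = coefficient_of_determination(sst=15730, sse=1530)
print(round(r2_pizza, 4))  # 0.9027
```

The two extremes follow directly: SSE = SST gives 0 (the worst fit) and SSE = 0 gives 1 (the best fit).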
Example : Read Auto Sales
• Coefficient of Determination
r² = SSR/SST = 100/114 = .88
The regression relationship is very strong since
88% of the variation in number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.
The Correlation Coefficient
Correlation Coefficient = (Sign of b1) times the Square Root of the
Coefficient of Determination
rxy = (sign of b1) √(r²)
Correlation coefficient is a measure of the strength of a linear
association between two variables. It has a value between -1
and +1
rxy = +1 : the two variables are perfectly related through a line with
positive slope.
rxy = -1 : the two variables are perfectly related through a line with
negative slope.
rxy = 0 : the two variables are not linearly related.
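A minimal Python sketch of this rule is shown below. The function name is hypothetical. The Pizza example's SSR and SST are reused, and a positive slope is assumed because these notes do not state the sign of b1 for that example.

```python
import math


def correlation_from_r2(b1, r2):
    """Return (sign of b1) * sqrt(r2)."""
    sign = (b1 > 0) - (b1 < 0)  # +1, -1, or 0
    return sign * math.sqrt(r2)


r2_pizza = 14200 / 15730  # SSR / SST from the Pizza example
# b1 = 1.0 is a placeholder: only its sign matters, and a positive slope is assumed.
r_xy = correlation_from_r2(b1=1.0, r2=r2_pizza)
print(round(r_xy, 4))  # 0.9501
```

With a negative slope and the same r², the result would be -0.9501 instead.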
Correlation Coefficient and Coefficient of Determination
The Coefficient of Determination and the Correlation Coefficient are both
measures of association between variables.
Correlation Coefficient : for a linear relationship between two
variables.
Coefficient of Determination : for linear and nonlinear
relationships between two or more variables.
