Regression Analysis

In regression analysis we analyze the relationship between two or more variables. The relationship may be linear or nonlinear. This week we first discuss the simplest case.

Simple Linear Regression: Linear Regression Between Two Variables

How can we use available data to investigate such a relationship? How can we use this relationship to forecast the future?

Our interest here is the relationship between demand (y) and time (x), but the concept is general. For example, advertising could be the independent variable and sales the dependent variable.

Simple Linear Relationship

A linear relationship between two variables is stated as

y = b0 + b1 x

This is the general equation of a line.
b0 : the intercept with the y axis
b1 : the slope
x : the independent variable
y : the dependent variable

[Figure: three lines, with b1 > 0, b1 < 0, and b1 = 0]

Scatter Diagram

[Figure: scatter diagram]

Graphical - Judgmental Solution

[Figure: a line drawn by judgment through the scatter diagram, with intercept b0 and slope b1 marked]

SSE : Pictorial Representation

[Figure: the deviation y10 - ŷ10 of an observation from the regression line]

SST : Pictorial Representation

[Figure: the deviation y10 - ȳ of an observation from the mean of y]

SSE, SST and SSR

SST : a measure of how well the observations cluster around ȳ
SSE : a measure of how well the observations cluster around ŷ

If x played no role in the value of y, we would have SST = SSE.
If x fully determined the value of y, we would have SSE = 0.

SST = SSE + SSR
SSR : sum of squares due to regression
SSR is the explained portion of SST.
SSE is the unexplained portion of SST.

Coefficient of Determination for Goodness of Fit

SSE = SST - SSR
The largest possible value of SSE is SSE = SST.
SSE = SST  =>  SSR = 0
SSR/SST = 0  =>  the worst fit
SSR/SST = 1  =>  the best fit

Coefficient of Determination for Pizza Example

In the pizza example,
SST = 15730
SSE = 1530
SSR = 15730 - 1530 = 14200

r² = SSR/SST : coefficient of determination
0 ≤ r² ≤ 1
r² = 14200/15730 = .9027

In other words, about 90% of the variation in y can be explained by the regression line.
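The decomposition SST = SSE + SSR and the ratio r² = SSR/SST can be checked numerically. The sketch below is only an illustration: it assumes the line is fitted by ordinary least squares (the slides do not spell out the fitting method at this point), the function names `fit_line` and `sums_of_squares` are made up for this example, and the five (x, y) pairs are hypothetical, not the pizza data. Only the last line uses numbers from the slides (SST = 15730, SSE = 1530).

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept with the y axis
    return b0, b1


def sums_of_squares(x, y):
    """Return (SST, SSE, SSR) for the least-squares line through (x, y)."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]                      # fitted values
    sst = sum((yi - y_bar) ** 2 for yi in y)                # spread around the mean
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # spread around the line
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line
    return sst, sse, ssr


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sst, sse, ssr = sums_of_squares(x, y)
r2 = ssr / sst
print(f"SST={sst:.2f}  SSE={sse:.2f}  SSR={ssr:.2f}  r2={r2:.2f}")

# Summary figures quoted in the slides for the pizza example.
pizza_r2 = (15730 - 1530) / 15730
print(f"pizza r2 = {pizza_r2:.4f}")
```

For a least-squares line the three sums always satisfy SST = SSE + SSR, so r² can be computed either as SSR/SST or as 1 - SSE/SST.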
Example: Read Auto Sales

Coefficient of Determination
r² = SSR/SST = 100/114 = .88

The regression relationship is very strong, since 88% of the variation in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.

The Correlation Coefficient

Correlation coefficient = (sign of b1) times the square root of the coefficient of determination:

r_xy = (sign of b1) √(r²)

The correlation coefficient is a measure of the strength of the linear association between two variables. Its value lies between -1 and +1.
r_xy = +1 : the two variables are perfectly related through a line with positive slope.
r_xy = -1 : the two variables are perfectly related through a line with negative slope.
r_xy = 0 : the two variables are not linearly related.

Correlation Coefficient and Coefficient of Determination

The coefficient of determination and the correlation coefficient are both measures of association between variables.
The correlation coefficient applies to a linear relationship between two variables.
The coefficient of determination applies to linear and nonlinear relationships between two or more variables.
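As a rough check of the rule r_xy = (sign of b1) √(r²), the sketch below computes the correlation coefficient two ways: from r² and the sign of the slope, and directly from the usual sample correlation formula. The data values are an assumption: the slides give only SSR = 100 and SST = 114 for this example, so the five observations here are illustrative values picked to reproduce those two sums, not necessarily the original data set. The least-squares fit is likewise assumed.

```python
import math

# Illustrative observations (an assumption, not taken from the slides):
x = [1, 3, 2, 1, 3]        # number of TV ads
y = [14, 24, 18, 17, 27]   # number of cars sold

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least-squares line y = b0 + b1 x
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Sums of squares and coefficient of determination
y_hat = [b0 + b1 * xi for xi in x]
sst = syy
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
r2 = ssr / sst

# Correlation coefficient via the rule stated above:
# the magnitude is sqrt(r2), the sign is the sign of the slope b1.
r_xy = math.copysign(math.sqrt(r2), b1)

# The same quantity from the direct sample correlation formula.
pearson = sxy / math.sqrt(sxx * syy)

print(f"b0={b0:.1f}  b1={b1:.1f}  SSR={ssr:.0f}  SST={sst:.0f}")
print(f"r2={r2:.2f}  r_xy={r_xy:.4f}  direct={pearson:.4f}")
```

The two routes agree only for simple linear regression with one independent variable; with several independent variables there is no single slope whose sign could be attached to √(r²).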