
Introduction to the multiple linear regression model

Regression models with more than one predictor (or term)

Example 1: Are brain size and body size predictive of intelligence?

• Sample of n = 38 college students.
• Response ($Y$): intelligence, based on performance IQ (PIQ) scores from the (revised) Wechsler Adult Intelligence Scale.
• Potential predictor ($x_1$): brain size, based on MRI scans (given as count/10,000).
• Potential predictor ($x_2$): height in inches.
• Potential predictor ($x_3$): weight in pounds.

Example 1: Scatter plot matrix

[Figure: scatter plot matrix of PIQ, MRI, Height, and Weight]

Scatter plot matrix

• Tells us about the 2D marginal relationships between each pair of variables, without regard to the other variables.
• The challenge is working out how these 2D relationships relate to how the response y depends on all 3 predictors simultaneously.

Example 1: Marginal response plots

[Figure: marginal response plots of PIQ against MRI, Height, and Weight]

Marginal response plots

• Scatter plots of the response y vs. each predictor.
• Suggest how the response y depends on each predictor, without regard to the other predictors.
• Provide a visual lower bound for the goodness-of-fit that can be achieved by the full regression model.

Example 1: A potential multiple linear regression model

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$$

where …

• $Y_i$ is intelligence (PIQ) of student i
• $x_{i1}$ is brain size (MRI) of student i
• $x_{i2}$ is height (Height) of student i
• $x_{i3}$ is weight (Weight) of student i

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Example 1: Potential research questions

• Which predictors explain some of the variation in PIQ?
• What is the effect of brain size on PIQ?
• What is the PIQ of an individual with a given brain size, height, and weight?

Predictors

• As before, the x variable.
• Also called explanatory variables or independent variables.
• Most often numerical measurements, such as age, weight, length, and temperature.
• But can be categorical, such as gender, race, and species.

Terms

Terms are functions of the predictor variables, such as:

$$u_1 = x_1 x_2, \quad u_2 = x_1^2, \quad u_3 = \log_e x_2, \quad u_4 = x_1$$

Linear regression model as a function of terms:

$$Y_i = \beta_0 + \beta_1 u_1 + \beta_2 u_2 + \beta_3 u_3 + \beta_4 u_4 + \varepsilon_i$$

$$Y_i = \beta_0 + \beta_1 x_1 x_2 + \beta_2 x_1^2 + \beta_3 \log_e x_2 + \beta_4 x_1 + \varepsilon_i$$

Types of terms

• The predictors themselves.
• Powers of predictors.
• Transformations of predictors.
• Interactions.
• Binary (or categorical) predictors.

Simple linear regression model with a transformed predictor

$$Y_i = \beta_0 + \beta_1 \log_{10} x_i + \varepsilon_i$$

where …

• $Y_i$ is the proportion of items correctly recalled by person i
• $x_i$ is the time since person i initially memorized the list

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Visualizing a simple linear regression model with a transformed predictor

Regression plot: prop = 0.846415 - 0.182427 log10time
S = 0.0233881, R-Sq = 99.0%, R-Sq(adj) = 98.9%

[Figure: fitted line of prop against log10time]

A first order model with two predictors

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$$

where …

• $Y_i$ is the life of power cell i (number of cycles)
• $x_{i1}$ is the charge rate of power cell i (amperes)
• $x_{i2}$ is the ambient temperature of power cell i (Celsius)

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Visualizing a first order model with two predictors

A first order model with more than 2 predictors

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$$

where …

• $Y_i$ is intelligence (PIQ) of student i
• $x_{i1}$ is brain size (MRI) of student i
• $x_{i2}$ is height (Height) of student i
• $x_{i3}$ is weight (Weight) of student i

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.
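A first order model of this kind is commonly fit by ordinary least squares. The following is a minimal sketch in plain Python, not the method used for the slides' examples. It assumes made-up, noise-free data (not the PIQ study data) and a hypothetical helper named `fit_least_squares`, which solves the normal equations $(X^\top X)\,b = X^\top y$ by Gaussian elimination.

```python
def fit_least_squares(rows, y):
    """Return least-squares estimates [b0, b1, ..., bp] for a first order model.

    rows: list of predictor tuples (x_i1, ..., x_ip); y: list of responses.
    Hypothetical helper for illustration; it solves the normal equations.
    """
    X = [[1.0, *r] for r in rows]  # prepend 1 for the intercept term
    k = len(X[0])
    # Augmented normal-equation system [X'X | X'y]
    A = [[sum(xr[a] * xr[b] for xr in X) for b in range(k)]
         + [sum(xr[a] * yi for xr, yi in zip(X, y))]
         for a in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = sum(A[r][c] * b[c] for c in range(r + 1, k))
        b[r] = (A[r][k] - s) / A[r][r]
    return b


# Made-up, noise-free data generated from y = 2 + 3*x1 - 1*x2 + 0.5*x3
rows = [(1, 2, 3), (2, 1, 0), (3, 5, 1), (4, 0, 2), (5, 3, 7), (6, 2, 1)]
y = [2 + 3 * x1 - 1 * x2 + 0.5 * x3 for x1, x2, x3 in rows]
coef = fit_least_squares(rows, y)
print([round(c, 6) for c in coef])
```

Because the data here contain no error term, the estimates should recover the generating coefficients up to floating-point rounding. With real data, a statistics package would normally be used instead, since it also reports standard errors and fit summaries such as S and R-Sq.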
Visualizing a first order model with more than 2 predictors

A second order polynomial model with one predictor

$$Y_i = \beta_0 + \beta_1 x_i + \beta_{11} x_i^2 + \varepsilon_i$$

where …

• $Y_i$ is the length of bluegill (fish) i (in mm)
• $x_i$ is the age of bluegill (fish) i (in years)

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Visualizing a second order polynomial model with one predictor

Regression plot: length = 13.6224 + 54.0493 age - 4.71866 age**2
S = 10.9061, R-Sq = 80.1%, R-Sq(adj) = 79.6%

[Figure: fitted curve of length against age]

A second order polynomial model with 2 predictors

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_{11} x_{i1}^2 + \beta_{22} x_{i2}^2 + \beta_{12} x_{i1} x_{i2} + \varepsilon_i$$

where …

• $Y_i$ is the grade point average of student i
• $x_{i1}$ is the verbal test score of student i
• $x_{i2}$ is the math test score of student i

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Visualizing a second order polynomial model with 2 predictors

A first order model with one binary predictor

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$$

where …

• $Y_i$ is the birth weight of baby i
• $x_{i1}$ is the length of gestation of baby i
• $x_{i2} = 1$ if the mother smokes, and $x_{i2} = 0$ if not

and … the independent error terms $\varepsilon_i$ follow a normal distribution with mean 0 and equal variance $\sigma^2$.

Visualizing a first order model with one binary predictor

The regression equation is
Weight = - 2390 + 143 Gest - 245 Smoking

[Figure: Weight (grams) against Gestation (weeks), plotted separately for Smoking = 0 and Smoking = 1]
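One way to read the fitted equation with the binary predictor is as two parallel lines, one for each smoking group, separated vertically by the Smoking coefficient. The sketch below evaluates the equation at a single gestation length. It assumes the rounded coefficients exactly as printed (-2390, 143, -245), with weight in grams and gestation in weeks as labelled on the plot axes; the function name `predicted_birth_weight` is hypothetical.

```python
def predicted_birth_weight(gestation_weeks, smokes):
    """Predicted birth weight in grams from the printed regression equation.

    Uses the rounded coefficients as shown: -2390, 143 (Gest), -245 (Smoking).
    """
    smoking = 1 if smokes else 0  # binary indicator x_i2
    return -2390 + 143 * gestation_weeks - 245 * smoking


nonsmoker = predicted_birth_weight(40, smokes=False)
smoker = predicted_birth_weight(40, smokes=True)
print(nonsmoker, smoker, smoker - nonsmoker)  # 3330 3085 -245
```

The difference between the two predictions is -245 grams at any gestation length, because this first order model has no interaction term between gestation and smoking.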