MSOR 221 Statistical Inference

Chapter 18 Multiple Regression

Model and Required Conditions

For k independent variables (predicting variables) x1, x2, … , xk, the multiple linear regression model is represented by the following equation:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon $$

where $\beta_1, \beta_2, \ldots, \beta_k$ are the population regression coefficients of x1, x2, … , xk respectively, $\beta_0$ is the constant term, and $\varepsilon$ (the Greek letter epsilon) represents the random term (also called the error variable), that is, the difference between the actual value of y and the estimated value of y based on the values of the independent variables. The random term thus accounts for all other independent variables that are not included in the model.
55 |
| |3 |100 |100 |300 |10000 |300 |9 |10000 |95.92 |
| |10 |600 |400 |4000 |240000 |6000 |100 |360000 |379.61 |
|Total |33 |2800 |1500 |10300 |880000 |18800 |235 |1700000 |1500 |

n = 6

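$$
\begin{aligned}
\sum y &= n b_0 + b_1 \sum x_1 + b_2 \sum x_2 &\quad\Rightarrow\quad 1500 &= 6 b_0 + 33 b_1 + 2800 b_2 \\
\sum x_1 y &= b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 &\quad\Rightarrow\quad 10300 &= 33 b_0 + 235 b_1 + 18800 b_2 \\
\sum x_2 y &= b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2 &\quad\Rightarrow\quad 880000 &= 2800 b_0 + 18800 b_1 + 1700000 b_2
\end{aligned}
$$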

By solving the above system of normal equations, we should find the following:

b0 = 6.397 b1 = 20.492 b2 = 0.280

Therefore, the sample multiple linear regression equation is:
$$ \hat{y} = 6.397 + 20.492 x_1 + 0.280 x_2 $$
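The coefficients can be checked numerically. The sketch below is a minimal illustration, not part of the original notes. It assumes the totals row of the Example 1 table reads, from left to right, Σx1 = 33, Σx2 = 2800, Σy = 1500, Σx1y = 10300, Σx2y = 880000, Σx1x2 = 18800, Σx1² = 235, Σx2² = 1700000, and it uses NumPy to solve the resulting three normal equations. The names A and rhs are illustrative.

```python
import numpy as np

# Coefficient matrix of the normal equations for two independent variables,
# built from the column totals of Example 1 (n = 6). The mapping of totals
# to sums is an assumption stated in the lead-in.
A = np.array([
    [6.0,    33.0,    2800.0],     # n,      sum x1,    sum x2
    [33.0,   235.0,   18800.0],    # sum x1, sum x1^2,  sum x1*x2
    [2800.0, 18800.0, 1700000.0],  # sum x2, sum x1*x2, sum x2^2
])

# Right-hand side: sum y, sum x1*y, sum x2*y
rhs = np.array([1500.0, 10300.0, 880000.0])

# Solve A @ [b0, b1, b2] = rhs
b0, b1, b2 = np.linalg.solve(A, rhs)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

Under these assumptions the solution agrees, to three decimal places, with the values of b0, b1 and b2 quoted above.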

Interpretation of the Regression Coefficients

b1: the approximate change in y if x1 is increased by 1 unit and x2 is held constant.
b2: the approximate change in y if x2 is increased by 1 unit and x1 is held constant.

In Example 1, if x1 is increased by 1 unit and x2 is held constant, y changes by approximately 20.492 units. Likewise, if x2 is increased by 1 unit and x1 is held constant, y changes by approximately 0.280 units.

Point Estimate

In Example 1, suppose x1 = 4 and x2 = 500. The point estimate of y is then:
$$ \hat{y} = 6.397 + 20.492(4) + 0.280(500) = 228.365 $$
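The same calculation can be sketched in a few lines of Python, using the rounded coefficients from Example 1. The function name predict is illustrative and not from the notes.

```python
# Estimated coefficients from Example 1 (rounded values quoted in the notes)
b0, b1, b2 = 6.397, 20.492, 0.280

def predict(x1, x2):
    """Point estimate of y from the sample regression equation."""
    return b0 + b1 * x1 + b2 * x2

# Point estimate at x1 = 4, x2 = 500
y_hat = predict(4, 500)
print(round(y_hat, 3))

# Raising x1 by 1 unit while holding x2 fixed changes the estimate by b1
change = predict(5, 500) - predict(4, 500)
print(round(change, 3))
```

The second calculation ties back to the interpretation of b1: with x2 held at 500, moving x1 from 4 to 5 changes the estimate by b1.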

The Standard Error of Estimate in Multiple Regression Model

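$$ s_\varepsilon = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - k - 1}} $$

where $y_i$ = the observed y value in the sample
$\hat{y}_i$ = the estimated y value calculated from the multiple regression equation
n = the sample size, and k = the number of independent variables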

|In Example 1, |$(y_i - \hat{y}_i)^2$ |
| |$(17.01)^2$ |
| |$(-5.2)^2$ |
| |$(5.27)^2$ |
| |$(-41.55)^2$ |
| |$(4.08)^2$ |
| |$(20.39)^2$ |
|Total |2502.954 |

$$ s_\varepsilon = \sqrt{\frac{2502.954}{6 - 2 - 1}} = \sqrt{834.318} \approx 28.88 $$

Note: $s_\varepsilon$ is the point estimate of $\sigma_\varepsilon$ (the standard deviation of the error variable $\varepsilon$).
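As a check on the arithmetic, the standard error of estimate for Example 1 can be computed from the six residuals listed above. This is a minimal sketch that assumes the usual definition, the square root of the sum of squared residuals divided by n - k - 1. The names residuals, sse and s_eps are illustrative.

```python
import math

# Residuals y_i - y_hat_i for the six observations in Example 1
residuals = [17.01, -5.2, 5.27, -41.55, 4.08, 20.39]

n = len(residuals)  # 6 observations
k = 2               # two independent variables, x1 and x2

# Sum of squared residuals
sse = sum(e ** 2 for e in residuals)

# Standard error of estimate, assuming the n - k - 1 divisor
s_eps = math.sqrt(sse / (n - k - 1))

print(f"Sum of squared residuals = {sse:.3f}")
print(f"Standard error of estimate = {s_eps:.3f}")
```

With n = 6 and k = 2, the divisor is n - k - 1 = 3.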

Testing the Validity of the Model – The Analysis of Variance (ANOVA) Test

Let’s consider a simple linear regression model:

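[Figure: scatter plot of the data points, with y on the vertical axis and x on the horizontal axis, and a horizontal line at $\bar{y} = \sum y / n$, the mean of y.]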

$$ y_i - \bar{y} = (\hat{y}_i - \bar{y}) + (y_i - \hat{y}_i) $$

$y_i - \bar{y}$ = total deviation
$\hat{y}_i - \bar{y}$ = deviation of the estimated value from the mean
$y_i - \hat{y}_i$ = error deviation = $e_i$
$e_i$ = the residual of the ith data point

$$ \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2 $$
that is, SST = SSR + SSE
SST = total sum of squared deviations = total variation
SSR = sum of squares resulting from regression = explained variation
SSE = sum of squares...

