An R Introduction to Statistics

Estimated Multiple Regression Equation

If we choose the parameters α and βk (k = 1, 2, ..., p) in the multiple linear regression model so as to minimize the sum of squares of the error term ϵ, we obtain the so-called estimated multiple regression equation, in which a and bk denote the resulting estimates of α and βk. It allows us to compute fitted values of y from a set of values of xk (k = 1, 2, ..., p).

$$\hat{y} = a + \sum_{k} b_k x_k$$
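
As a rough illustration of what minimizing the sum of squares amounts to, the sketch below solves the normal equations directly for R's built-in stackloss data and compares the result with the coefficients estimated by lm. The names X, y, b and fit are chosen here for illustration only, and lm itself uses a QR decomposition rather than this direct approach, so the two results should agree only up to rounding error.

```r
# Design matrix: a column of ones for the intercept, then the three predictors
X = cbind(Intercept = 1,
          as.matrix(stackloss[, c("Air.Flow", "Water.Temp", "Acid.Conc.")]))
y = stackloss$stack.loss

# Solve the normal equations (X'X) b = X'y for the least-squares estimates
b = solve(crossprod(X), crossprod(X, y))

# Fit the same model with lm and compare the two sets of estimates
fit = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
cbind(normal.equations = as.vector(b), lm = unname(coef(fit)))
```

The first row of the printed table corresponds to a, and the remaining rows to the bk of the three predictors.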

Problem

Apply the multiple linear regression model to the data set stackloss, and predict the stack loss if the air flow is 72, the water temperature is 20 and the acid concentration is 85.

Solution

We apply the lm function to a formula that describes the variable stack.loss by the variables Air.Flow, Water.Temp and Acid.Conc. (the trailing period is part of the variable name), and we save the linear regression model in a new variable stackloss.lm.

> stackloss.lm = lm(stack.loss ~ 
+     Air.Flow + Water.Temp + Acid.Conc., 
+     data=stackloss)

We also wrap the given values of the predictor variables inside a new data frame named newdata.

> newdata = data.frame(Air.Flow=72,  # wrap the parameters 
+     Water.Temp=20, 
+     Acid.Conc.=85)
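
The predict function looks up the predictors in the new data frame by name, so the column names have to match the variable names in the formula, including the trailing period of Acid.Conc. A minimal check might look like the following; the vector name predictors is introduced here only for illustration.

```r
newdata = data.frame(Air.Flow = 72, Water.Temp = 20, Acid.Conc. = 85)

# Predictor names as they appear in the model formula
predictors = c("Air.Flow", "Water.Temp", "Acid.Conc.")

# Any formula variable missing from newdata would be listed here
setdiff(predictors, names(newdata))
```

An empty result indicates that every predictor in the formula has a matching column.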

Lastly, we apply the predict function to stackloss.lm and newdata.

> predict(stackloss.lm, newdata) 
     1 
24.582
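
As a cross-check, the same value can be obtained by plugging the predictor values into the estimated multiple regression equation by hand. The sketch below refits the model so that it runs on its own; the names fit, b, x and yhat are illustrative and not part of the original solution.

```r
fit = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)

# Named vector of estimates: (Intercept), Air.Flow, Water.Temp, Acid.Conc.
b = coef(fit)

# Predictor values from the problem statement
x = c(Air.Flow = 72, Water.Temp = 20, Acid.Conc. = 85)

# yhat = a + sum over k of b_k * x_k
yhat = unname(b["(Intercept)"] + sum(b[names(x)] * x))
round(yhat, 3)
```

The rounded result should agree with the value returned by predict.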

Answer

Based on the multiple linear regression model and the given predictor values, the predicted stack loss is 24.582.