An R Introduction to Statistics

Confidence Interval for Linear Regression

Assume that the error term ϵ in the linear regression model is independent of x, and is normally distributed, with zero mean and constant variance. For a given value of x, the interval estimate for the mean of the dependent variable, ȳ, is called the confidence interval.
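For reference, the standard textbook form of this interval in simple linear regression can be written as follows. The symbols are assumptions introduced here and do not appear in the original text: x_p is the given value of x, ŷ_p is the fitted value at x_p, n is the sample size, s is the residual standard error, and t_{α/2, n−2} is the upper α/2 quantile of the Student t distribution with n − 2 degrees of freedom.

```latex
\hat{y}_p \;\pm\; t_{\alpha/2,\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}}
```

The interval is narrowest at x_p = x̄ and widens as x_p moves away from the sample mean of x.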

Problem

In the data set faithful, develop a 95% confidence interval of the mean eruption duration for the waiting time of 80 minutes.

Solution

We apply the lm function to a formula that describes the variable eruptions by the variable waiting, and save the linear regression model in a new variable eruption.lm.

> attach(faithful)     # attach the data frame 
> eruption.lm = lm(eruptions ~ waiting)
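As an alternative sketch, the same model can be fitted without attaching the data frame by passing it through the data argument of lm. This is a stylistic variation rather than the approach used in this tutorial, and it makes the later detach call unnecessary.

```r
# Fit eruptions on waiting, looking both variables up in the faithful data frame
eruption.lm = lm(eruptions ~ waiting, data = faithful)

# Estimated intercept and slope of the fitted line
coef(eruption.lm)
```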

Then we create a new data frame that sets the waiting time value.

> newdata = data.frame(waiting=80)

We now apply the predict function, passing the new data frame as the newdata argument. We also set the interval type to "confidence", and keep the default 0.95 confidence level.

> predict(eruption.lm, newdata, interval="confidence") 
     fit    lwr    upr 
1 4.1762 4.1048 4.2476 
> detach(faithful)     # clean up
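The confidence level can be changed through the level argument of predict. The following is a minimal sketch that compares the default 95% interval with a 99% interval; the names ci95 and ci99 are illustrative, and the model is refitted with the data argument so that the block runs on its own.

```r
# Refit the model so this block is self-contained
eruption.lm = lm(eruptions ~ waiting, data = faithful)
newdata = data.frame(waiting = 80)

# Default level is 0.95; request 0.99 explicitly for comparison
ci95 = predict(eruption.lm, newdata, interval = "confidence")
ci99 = predict(eruption.lm, newdata, interval = "confidence", level = 0.99)

# Stack the two intervals; the 99% interval is the wider one
rbind(ci95, ci99)
```

Both rows share the same fitted value, since the level only affects the width of the interval.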

Answer

The 95% confidence interval of the mean eruption duration for the waiting time of 80 minutes is between 4.1048 and 4.2476 minutes.
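As a cross-check, the interval can be reproduced by hand from the standard formula: fitted value plus or minus the t quantile times the standard error of the estimated mean. The sketch below is illustrative only; the variable names (x0, se, tcrit, manual, builtin) are assumptions and not part of the original solution.

```r
# Fit the model using the data argument so this block is self-contained
eruption.lm = lm(eruptions ~ waiting, data = faithful)

x0 = 80                       # waiting time of interest
x  = faithful$waiting
n  = length(x)

# Point estimate: intercept + slope * x0
y0 = sum(coef(eruption.lm) * c(1, x0))

# Residual standard error and standard error of the estimated mean at x0
s  = summary(eruption.lm)$sigma
se = s * sqrt(1 / n + (x0 - mean(x))^2 / sum((x - mean(x))^2))

# Two-sided 95% critical value with n - 2 degrees of freedom
tcrit = qt(0.975, df = n - 2)

manual  = c(fit = y0, lwr = y0 - tcrit * se, upr = y0 + tcrit * se)
builtin = predict(eruption.lm, data.frame(waiting = x0), interval = "confidence")

# Compare the hand computation with the result of predict
all.equal(unname(manual), unname(builtin[1, ]))
```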

Note

Further details on the predict function for linear regression models can be found in the R documentation.

> help(predict.lm)