An R Introduction to Statistics

Significance Test for Linear Regression

Assume that the error term ϵ in the simple linear regression model y = α + βx + ϵ is independent of x, and is normally distributed, with zero mean and constant variance. We can decide whether there is any significant relationship between x and y by testing the null hypothesis that the slope β = 0.

Problem

Decide whether there is a significant relationship between the variables in the linear regression model of the data set faithful at the .05 significance level.

Solution

We apply the lm function to a formula that describes the variable eruptions by the variable waiting, and save the linear regression model in a new variable eruption.lm.

> eruption.lm = lm(eruptions ~ waiting, data=faithful)

Then we print out the F-statistic of the significance test with the summary function.

> summary(eruption.lm) 
 
Call: 
lm(formula = eruptions ~ waiting, data = faithful) 
 
Residuals: 
    Min      1Q  Median      3Q     Max 
-1.2992 -0.3769  0.0351  0.3491  1.1933 
 
Coefficients: 
            Estimate Std. Error t value Pr(>|t|) 
(Intercept) -1.87402    0.16014   -11.7   <2e-16 *** 
waiting      0.07563    0.00222    34.1   <2e-16 *** 
--- 
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 
Residual standard error: 0.497 on 270 degrees of freedom 
Multiple R-squared: 0.811,      Adjusted R-squared: 0.811 
F-statistic: 1.16e+03 on 1 and 270 DF,  p-value: <2e-16
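
The printed summary only reports the p-value as <2e-16. As a minimal sketch, the F-statistic can also be read from the fstatistic component of the summary object and its p-value recomputed with the pf function. The variable names eruption.sum, f and f.pvalue are chosen here for illustration and do not appear in the original text.

```r
# Refit the model from the text; faithful is a built-in R data set.
eruption.lm = lm(eruptions ~ waiting, data=faithful)
eruption.sum = summary(eruption.lm)

# fstatistic is a named vector with elements value, numdf and dendf.
f = eruption.sum$fstatistic

# The p-value is the upper tail probability of the F distribution.
f.pvalue = unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail=FALSE))

# With a single predictor, the F-statistic equals the squared t value
# of the slope coefficient.
t.slope = coef(eruption.sum)["waiting", "t value"]

print(f)
print(f.pvalue)
print(t.slope^2)
```

With only one predictor, the F-test and the t-test on the slope are the same test, which is why both lines of the summary report the same p-value.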

Answer

As the p-value is much less than 0.05, we reject the null hypothesis that β = 0. Hence there is a significant relationship between the variables in the linear regression model of the data set faithful.
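
The decision can also be made in code rather than by reading the printed output. The following is a sketch only, assuming the significance level is stored in a variable alpha; the names alpha, coefs, p.slope and reject are illustrative and not part of the original solution.

```r
alpha = .05    # significance level from the problem statement

# Refit the model from the text; faithful is a built-in R data set.
eruption.lm = lm(eruptions ~ waiting, data=faithful)

# Coefficient table: one row per term, with columns for the estimate,
# standard error, t value and two-sided p-value.
coefs = coef(summary(eruption.lm))

# p-value for the null hypothesis that the slope of waiting is zero.
p.slope = coefs["waiting", "Pr(>|t|)"]

# Reject the null hypothesis when the p-value is below alpha.
reject = p.slope < alpha
print(reject)
```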

Note

Further details of the summary function for linear regression models can be found in the R documentation.

> help(summary.lm)