An R Introduction to Statistics

Estimated Logistic Regression Equation

Using the generalized linear model, an estimated logistic regression equation can be formulated as below. The coefficients a and b_k (k = 1, 2, ..., p) are determined by maximum likelihood, and the equation lets us estimate the probability that the dependent variable y takes the value 1 for given values of x_k (k = 1, 2, ..., p).

Estimate of $P(y = 1 \mid x_1, \ldots, x_p) = \dfrac{1}{1 + e^{-(a + \sum_{k=1}^{p} b_k x_k)}}$
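The equation can be sketched directly in R to see how the coefficients turn predictor values into a probability. The function name logistic.prob and the coefficient values passed to it are hypothetical, chosen only for illustration; they are not fitted estimates.

```r
# Minimal sketch of the estimated logistic regression equation.
# a: intercept, b: vector of slope coefficients, x: vector of predictor values.
# logistic.prob is a hypothetical helper name, not part of base R.
logistic.prob = function(a, b, x) {
    stopifnot(length(b) == length(x))
    eta = a + sum(b * x)      # linear predictor a + sum of b_k * x_k
    1 / (1 + exp(-eta))       # map the linear predictor to a probability
}

# Made-up coefficients: the linear predictor is 0 + 1*2 - 1*2 = 0,
# so the estimated probability is 0.5.
logistic.prob(a=0, b=c(1, -1), x=c(2, 2))
```

Base R provides the same transformation as plogis, so plogis(eta) could replace the last line of the function.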


Using the logistic regression equation of vehicle transmission in the data set mtcars, estimate the probability of a vehicle being fitted with a manual transmission if it has a 120hp engine and weighs 2800 lbs.


We apply the function glm to a formula that describes the transmission type (am) by the horsepower (hp) and weight (wt). This creates a generalized linear model (GLM) in the binomial family.

> am.glm = glm(formula=am ~ hp + wt, 
+              data=mtcars, 
+              family=binomial)

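We then wrap the test parameters inside a data frame newdata. The variable wt in mtcars is recorded in units of 1000 lbs, so a weight of 2800 lbs is entered as 2.8.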

> newdata = data.frame(hp=120, wt=2.8)

Now we apply the function predict to the generalized linear model am.glm along with newdata. We have to set the prediction type to "response" in order to obtain the predicted probability; the default type, "link", returns the value on the log-odds scale instead.

> predict(am.glm, newdata, type="response") 
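To illustrate what the type argument changes, the sketch below requests the prediction on both scales and checks that applying the inverse logit to the "link" value gives the "response" value. The variable names p.response and log.odds are introduced here for illustration only.

```r
# Refit the model so that this sketch is self-contained.
am.glm = glm(formula=am ~ hp + wt,
             data=mtcars,
             family=binomial)
newdata = data.frame(hp=120, wt=2.8)

# Predicted probability of a manual transmission.
p.response = predict(am.glm, newdata, type="response")

# Linear predictor a + b_hp*hp + b_wt*wt, i.e. the log-odds.
log.odds = predict(am.glm, newdata, type="link")

# plogis is the inverse logit, 1/(1 + exp(-x)); the two values should agree.
all.equal(unname(plogis(log.odds)), unname(p.response))
```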


For an automobile with a 120hp engine and a weight of 2800 lbs, the probability of it being fitted with a manual transmission is about 64%.
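As a cross-check, the same probability can be reproduced by hand by plugging the fitted coefficients into the estimated logistic regression equation. This is only a sketch; the names b, eta and p.manual are introduced here for illustration.

```r
# Refit the model so that this sketch is self-contained.
am.glm = glm(formula=am ~ hp + wt,
             data=mtcars,
             family=binomial)

# Named vector of estimates: "(Intercept)" is a, "hp" and "wt" are the b_k.
b = coef(am.glm)

# Linear predictor for a 120hp vehicle weighing 2800 lbs (wt is in 1000 lbs).
eta = b[["(Intercept)"]] + b[["hp"]] * 120 + b[["wt"]] * 2.8

# Estimated probability of a manual transmission.
p.manual = 1 / (1 + exp(-eta))
p.manual
```

The result should agree with the value returned by predict with type="response" for the same vehicle.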


Further details of the function predict for generalized linear models can be found in the R documentation.

> help(predict.glm)