# Interval Estimate of Population Mean with Unknown Variance

After finding a point estimate of the population mean, we need a way to quantify its accuracy. Here, we discuss the case where the population variance is unknown and must be estimated from the sample.

Let us denote the 100(1 − α∕2) percentile of the Student t distribution with n − 1 degrees of freedom as tα∕2. For a random sample of sufficiently large size n, with sample mean x̄ and sample standard deviation s, the end points of the interval estimate at the (1 − α) confidence level are given as follows:

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$

#### Problem

Without assuming a known population standard deviation for the student heights in survey, find the margin of error and the interval estimate at the 95% confidence level.

#### Solution

We first filter out missing values in survey\$Height with the na.omit function, and save it in height.response.

> library(MASS)                  # load the MASS package
> height.response = na.omit(survey\$Height)
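
As a small illustration of why the missing values are removed first, the sketch below uses a toy vector (hypothetical values, not taken from survey) to show that a single NA propagates through mean unless it is dropped.

```r
# Toy vector (hypothetical values, not from the survey data set)
x <- c(160, NA, 175, 180, NA)

mean(x)                 # NA: a single missing value propagates
x.clean <- na.omit(x)   # drop the missing entries
length(x.clean)         # 3 observations remain
mean(x.clean)           # mean of the observed values only
```

The same applies to sd, which is why the length, standard deviation, and mean in this solution are all computed from the filtered vector.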

Then we compute the sample standard deviation and, from it, the standard error estimate of the mean.

> n = length(height.response)
> s = sd(height.response)        # sample standard deviation
> SE = s/sqrt(n); SE             # standard error estimate
[1] 0.68117

Since there are two tails of the Student t distribution, the 95% confidence level would imply the 97.5th percentile of the Student t distribution at the upper tail. Therefore, tα∕2 is given by qt(.975, df=n-1). We multiply it with the standard error estimate SE and get the margin of error.
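
The link between the confidence level and the percentile can be sketched as follows. The variable names alpha, t.crit and z.crit are illustrative choices, and n = 209 is taken from the 208 degrees of freedom that appear in this example.

```r
alpha <- 0.05          # 1 - confidence level
n <- 209               # number of non-missing heights (df = 208 in this example)

t.crit <- qt(1 - alpha/2, df = n - 1)   # upper critical value, same as qt(.975, df = 208)
z.crit <- qnorm(1 - alpha/2)            # standard normal counterpart, for comparison

t.crit > z.crit                               # TRUE: the t distribution has heavier tails
all.equal(qt(alpha/2, df = n - 1), -t.crit)   # TRUE: the lower tail mirrors the upper tail
```

Because the distribution is symmetric, only the upper critical value is needed: the interval extends the same distance on either side of the sample mean.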

> E = qt(.975, df=n-1)*SE; E     # margin of error
[1] 1.3429

We then subtract it from, and add it to, the sample mean to find the end points of the confidence interval.

> xbar = mean(height.response)   # sample mean
> xbar + c(-E, E)
[1] 171.04 173.72

Without assuming the population standard deviation, the margin of error for the student height survey at the 95% confidence level is 1.3429 centimeters. The confidence interval is between 171.04 and 173.72 centimeters.
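
The steps of this solution can be collected into a single function. The following is a minimal sketch; interval.estimate is a hypothetical helper name, not a function provided by R or MASS, and it assumes a numeric vector that may contain missing values.

```r
library(MASS)   # provides the survey data set used in this section

# Hypothetical helper (not part of base R or MASS):
# t-based interval estimate of a population mean
interval.estimate <- function(x, conf.level = 0.95) {
  x <- x[!is.na(x)]                       # drop missing values
  n <- length(x)
  SE <- sd(x) / sqrt(n)                   # standard error estimate
  alpha <- 1 - conf.level
  E <- qt(1 - alpha/2, df = n - 1) * SE   # margin of error
  c(lower = mean(x) - E, upper = mean(x) + E, margin = E)
}

interval.estimate(survey$Height)                      # 95% interval for the heights
interval.estimate(survey$Height, conf.level = 0.99)   # wider interval at 99%
```

Passing the confidence level as an argument makes it easy to see how the margin of error grows as the required confidence increases.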

#### Alternative Solution

Instead of using the textbook formula, we can apply the t.test function in the built-in stats package.

> t.test(height.response)

One Sample t-test

data:  height.response
t = 253.07, df = 208, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
171.04 173.72
sample estimates:
mean of x
172.38
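
When only the interval is needed, it can be read from the object that t.test returns rather than from the printed report. The sketch below assumes the same height.response vector; the variable name result is an illustrative choice.

```r
library(MASS)
height.response <- na.omit(survey$Height)

result <- t.test(height.response, conf.level = 0.95)
result$conf.int             # interval end points, with a "conf.level" attribute
result$estimate             # sample mean
diff(result$conf.int) / 2   # half the interval width, i.e. the margin of error

# A different confidence level only requires changing conf.level
t.test(height.response, conf.level = 0.99)$conf.int
```

The hypothesis test in the printed report (true mean not equal to 0) is incidental here; only the confidence interval is relevant to the problem.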