An R Introduction to Statistics

Kurtosis

The kurtosis of a univariate population is defined by the following formula, where $\mu_2$ and $\mu_4$ are the second and fourth central moments, respectively.

$$\gamma_2 = \frac{\mu_4}{\mu_2^2} - 3$$

Intuitively, the kurtosis is a measure of the peakedness of the data distribution. Negative kurtosis indicates a flat data distribution, which is said to be platykurtic. Positive kurtosis indicates a peaked distribution, which is said to be leptokurtic. Because of the subtracted 3 in the formula, the normal distribution has zero kurtosis and is said to be mesokurtic.
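
To make the three cases concrete, the definition can be applied to simulated samples. The following is a minimal sketch in base R, not part of the original tutorial: excess_kurtosis is a hypothetical helper name, the seed and sample size are arbitrary, and the sample moments only approximate the population moments in the formula.

```r
# Hypothetical helper: plug-in estimate of the kurtosis defined above,
# using central moments that divide by n.
excess_kurtosis <- function(x) {
  m  <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m4 <- mean((x - m)^4)   # fourth central moment
  m4 / m2^2 - 3
}

set.seed(1)                        # arbitrary seed, for reproducibility only
excess_kurtosis(rnorm(100000))     # near 0: normal, mesokurtic
excess_kurtosis(runif(100000))     # negative: uniform, platykurtic (population value -1.2)
excess_kurtosis(rexp(100000))      # positive: exponential, leptokurtic (population value 6)
```

The values vary slightly from run to run without a fixed seed, but the signs should match the three cases described above.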

Problem

Find the kurtosis of eruption duration in the data set faithful.

Solution

We apply the function kurtosis from the e1071 package to compute the kurtosis of eruptions. As the package is not part of the base R distribution, it has to be installed first and then loaded into the R session.

> library(e1071)                    # load e1071 
> duration = faithful$eruptions     # eruption durations 
> kurtosis(duration)                # apply the kurtosis function 
[1] -1.5116

Answer

The kurtosis of eruption duration is -1.5116, which indicates that the eruption duration distribution is platykurtic. This is consistent with the fact that its histogram is not bell-shaped.
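
The shape of the distribution can be checked visually. The following is a minimal sketch using base R graphics; the number of bins, the title, and the axis label are arbitrary choices and not part of the original solution.

```r
duration <- faithful$eruptions     # eruption durations, in minutes

# Draw the histogram; breaks = 20 is only a suggestion to hist(),
# which may choose a nearby number of bins.
hist(duration,
     breaks = 20,
     main   = "Eruption duration",
     xlab   = "Duration (minutes)")
```

The plot should show two separate clusters of durations rather than a single central peak.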

Exercise

Find the kurtosis of eruption waiting period in faithful.

Note

The default algorithm of the function kurtosis in e1071 is based on the formula $g_2 = m_4 / s^4 - 3$, where $m_4$ and $s$ are the fourth sample central moment and the sample standard deviation, respectively. See the R documentation for selecting other types of kurtosis algorithm.

> library(e1071)                    # load e1071 
> help(kurtosis)
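
As a check on the formula in this note, the default result can be reproduced by hand. The following is a sketch that assumes e1071 is installed; by_hand is an illustrative variable name, and the type argument is the one described in the kurtosis help page.

```r
library(e1071)                             # assumes e1071 is installed

duration <- faithful$eruptions
m4 <- mean((duration - mean(duration))^4)  # fourth sample central moment (divides by n)
s  <- sd(duration)                         # sample standard deviation (divides by n - 1)

by_hand <- m4 / s^4 - 3                    # formula from the note above

# Compare with the package default (type = 3)
all.equal(by_hand, kurtosis(duration))

# Other algorithms are selected with the type argument
kurtosis(duration, type = 1)
kurtosis(duration, type = 2)
```

Because the types differ only in how they adjust for sample size, their results should be close to one another for a sample of this size.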