An R Introduction to Statistics

Kurtosis

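The kurtosis of a univariate population is defined by the following formula, where $\mu_2$ and $\mu_4$ are respectively the second and fourth central moments. Because 3 is subtracted, this quantity is also known as the excess kurtosis.

$$\gamma_2 = \frac{\mu_4}{\mu_2^2} - 3$$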

Intuitively, the kurtosis describes the tail shape of the data distribution. The normal distribution has zero kurtosis and serves as the reference tail shape; it is said to be mesokurtic. Negative kurtosis indicates a thin-tailed distribution, which is said to be platykurtic. Positive kurtosis indicates a fat-tailed distribution, which is said to be leptokurtic.
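
As a rough illustration of the three cases, the sketch below computes the kurtosis directly from the definition using base R only. The helper name excess_kurtosis, the seed, and the sample size are assumptions made for this example and are not part of any package. The simulated values vary with the seed, so only approximate targets are given in the comments.

```r
# Hypothetical helper: excess kurtosis from the definition mu4 / mu2^2 - 3,
# with both central moments estimated using divisor n.
excess_kurtosis <- function(x) {
  d  <- x - mean(x)      # deviations from the mean
  m2 <- mean(d^2)        # second central moment
  m4 <- mean(d^4)        # fourth central moment
  m4 / m2^2 - 3
}

set.seed(1)              # arbitrary seed, for reproducibility only
n <- 1e5                 # arbitrary sample size

excess_kurtosis(rnorm(n))   # near 0: mesokurtic
excess_kurtosis(runif(n))   # near -1.2: platykurtic (theoretical value -6/5)
excess_kurtosis(rexp(n))    # clearly positive: leptokurtic (theoretical value 6)
```

The estimate for the exponential sample is the noisiest of the three, since fourth moments of heavy-tailed data converge slowly.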

Problem

Find the kurtosis of eruption duration in the data set faithful.

Solution

We apply the function kurtosis from the e1071 package to compute the kurtosis of eruptions. As the package is not in the core R library, it has to be installed and loaded into the R workspace.

> library(e1071)                    # load e1071 
> duration = faithful$eruptions     # eruption durations 
> kurtosis(duration)                # apply the kurtosis function 
[1] -1.5116

Answer

The kurtosis of eruption duration is -1.5116, which indicates that the eruption duration distribution is platykurtic. This is consistent with the fact that its histogram is not bell-shaped: it has two separate peaks.

Exercise

Find the kurtosis of eruption waiting period in faithful.

Note

The default algorithm of the function kurtosis in e1071 is based on the formula $g_2 = m_4 / s^4 - 3$, where $m_4$ and $s$ are the fourth sample central moment and the sample standard deviation respectively. See the R documentation for selecting other types of kurtosis algorithm through the type argument.

> library(e1071)                    # load e1071 
> help(kurtosis)
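
The formula in this note can also be checked by hand with base R, without loading e1071. The following is a minimal sketch; the variable names g2_default and g2_moment are chosen for this example only. The first value should reproduce the -1.5116 reported in the solution, assuming the default algorithm is as described above.

```r
x <- faithful$eruptions            # eruption durations from the built-in data set

d  <- x - mean(x)                  # deviations from the mean
m4 <- mean(d^4)                    # fourth central moment (divisor n)
s  <- sd(x)                        # sample standard deviation (divisor n - 1)

g2_default <- m4 / s^4 - 3         # formula from this note
g2_default

# Variant with the second central moment (divisor n) in the denominator,
# matching the population definition at the top of the page.
g2_moment <- m4 / mean(d^2)^2 - 3
g2_moment
```

The two values differ only through the divisor used for the variance, so they are close for a sample of this size.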