An R Introduction to Statistics

Relative Frequency Distribution of Quantitative Data

The relative frequency distribution of a data variable summarizes the proportion of observations that fall in each of a collection of non-overlapping categories.

The relationship of frequency and relative frequency is:

\[ \text{Relative Frequency} = \frac{\text{Frequency}}{\text{Sample Size}} \]
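
As a quick illustration of the formula, here is a minimal sketch using a small made-up vector of class counts. The names freq, n and relfreq and the counts themselves are hypothetical and are not taken from faithful.

```r
# Hypothetical class frequencies, for illustration only
freq = c(a=2, b=5, c=3)

# The sample size is the total number of observations
n = sum(freq)

# Relative frequency = frequency / sample size
relfreq = freq / n
relfreq

# The relative frequencies of all classes add up to 1
sum(relfreq)
```

Because every observation belongs to exactly one class, the proportions must total 1, which is a handy sanity check on any relative frequency table.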

Example

In the data set faithful, the relative frequency distribution of the eruptions variable shows the frequency proportion of the eruptions according to a duration classification.

Problem

Find the relative frequency distribution of the eruption durations in faithful.

Solution

We first find the frequency distribution of the eruption durations as follows. Further details can be found in the Frequency Distribution tutorial.

> duration = faithful$eruptions 
> breaks = seq(1.5, 5.5, by=0.5) 
> duration.cut = cut(duration, breaks, right=FALSE) 
> duration.freq = table(duration.cut)

Then we find the sample size of faithful with the nrow function, and divide the frequency distribution by it. The result is the relative frequency distribution:

> duration.relfreq = duration.freq / nrow(faithful)
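
As a cross-check on this step, the sketch below rebuilds the frequency table and compares the division by nrow(faithful) with the base R function prop.table, which divides each count by the table total. The name duration.relfreq2 is introduced here for illustration only. The two approaches agree only because every eruption duration falls inside one of the chosen intervals.

```r
# Rebuild the frequency distribution so this block runs on its own
duration = faithful$eruptions
breaks = seq(1.5, 5.5, by=0.5)
duration.freq = table(cut(duration, breaks, right=FALSE))

# Every observation falls in some interval, so the counts
# add up to the sample size
sum(duration.freq) == nrow(faithful)

# prop.table divides each count by the total of the table
duration.relfreq2 = prop.table(duration.freq)

# Compare with the manual division by the sample size
all.equal(as.vector(duration.relfreq2),
          as.vector(duration.freq / nrow(faithful)))
```

If the breaks did not cover the whole data range, cut would produce NA for the uncovered values, table would drop them, and the two results would differ.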

Answer

The relative frequency distribution of the eruptions variable is:

> duration.relfreq 
duration.cut 
 [1.5,2)  [2,2.5)  [2.5,3)  [3,3.5)  [3.5,4)  [4,4.5) 
0.187500 0.150735 0.018382 0.025735 0.110294 0.268382 
 [4.5,5)  [5,5.5) 
0.224265 0.014706

Enhanced Solution

We can print with fewer digits and make it more readable by setting the digits option.

> old = options(digits=1) 
> duration.relfreq 
duration.cut 
[1.5,2) [2,2.5) [2.5,3) [3,3.5) [3.5,4) [4,4.5) [4.5,5) 
   0.19    0.15    0.02    0.03    0.11    0.27    0.22 
[5,5.5) 
   0.01 
> options(old)    # restore the old option
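
An alternative that avoids changing a global option is sketched below. It either rounds a copy of the table with round, or passes digits directly to print. The name duration.rounded is introduced here for illustration only, and this is one possible approach rather than the tutorial's own method.

```r
# Rebuild the relative frequency distribution so this block runs on its own
duration = faithful$eruptions
breaks = seq(1.5, 5.5, by=0.5)
duration.freq = table(cut(duration, breaks, right=FALSE))
duration.relfreq = duration.freq / nrow(faithful)

# Round a copy to two decimal places; duration.relfreq itself
# and the global options are left unchanged
duration.rounded = round(duration.relfreq, 2)
duration.rounded

# Or limit the significant digits for a single print call
print(duration.relfreq, digits=1)
```

Rounding changes the stored values in the copy, whereas the digits argument only affects how the numbers are displayed.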

We then apply the cbind function to print both the frequency distribution and relative frequency distribution in parallel columns.

> old = options(digits=1) 
> cbind(duration.freq, duration.relfreq) 
        duration.freq duration.relfreq 
[1.5,2)            51             0.19 
[2,2.5)            41             0.15 
[2.5,3)             5             0.02 
[3,3.5)             7             0.03 
[3.5,4)            30             0.11 
[4,4.5)            73             0.27 
[4.5,5)            61             0.22 
[5,5.5)             4             0.01 
> options(old)    # restore the old option
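
Because cbind returns a numeric matrix, both columns share one print format. If the counts and proportions need to be handled separately, a data frame is one option. The sketch below assumes nothing beyond base R, and the name duration.summary and its column names are hypothetical.

```r
# Rebuild both distributions so this block runs on its own
duration = faithful$eruptions
breaks = seq(1.5, 5.5, by=0.5)
duration.freq = table(cut(duration, breaks, right=FALSE))
duration.relfreq = duration.freq / nrow(faithful)

# One row per interval: label, count, and rounded proportion
duration.summary = data.frame(
    interval = names(duration.freq),
    freq     = as.vector(duration.freq),
    relfreq  = round(as.vector(duration.relfreq), 2))
duration.summary
```

Each column of a data frame keeps its own type, so the counts stay whole numbers while only the relfreq column is rounded.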

Exercise

Find the relative frequency distribution of the eruption waiting periods in faithful.