An R Introduction to Statistics

Qualitative Data

A data sample is called qualitative, also known as categorical, if its values belong to a collection of known, non-overlapping classes. Common examples include student letter grades (A, B, C, D or F), commercial bond ratings (AAA, AA, A, ...) and consumer shoe sizes (1, 2, 3, ...).
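As a minimal sketch of how such data is usually represented in R, the letter-grade example can be stored as a factor whose levels are the known classes. The grades vector and the variable names here are made up for illustration and are not part of any data set used in this tutorial.

```r
# Hypothetical letter grades for seven students (made-up data).
grades <- c("B", "A", "C", "A", "F", "B", "A")

# Declare the full set of allowed classes, including "D",
# which no student in this sample received.
grade.factor <- factor(grades, levels = c("A", "B", "C", "D", "F"))

is.factor(grade.factor)   # TRUE
levels(grade.factor)      # the five known classes
table(grade.factor)       # count per class; D has a count of 0
```

Declaring the levels explicitly keeps a class in the collection even when no value in the sample falls into it.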

The tutorials in this section are based on an R data frame named painters. It compiles the assessments of 54 classical painters on four technical characteristics, as made by the eighteenth-century art critic de Piles. The data set belongs to the MASS package, which has to be loaded into the R workspace before the data can be used.

> library(MASS)      # load the MASS package 
> painters 
              Composition Drawing Colour Expression School 
Da Udine               10       8     16          3      A 
Da Vinci               15      16      4         14      A 
Del Piombo              8      13     16          7      A 
Del Sarto              12      16      9          8      A 
Fr. Penni               0      15      8          0      A 
Guilio Romano          15      16      4         14      A 
                    .................
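Printing the whole data frame produces a long listing. A hedged alternative, assuming the MASS package is installed, is to inspect only the first few rows together with the overall shape and the column types:

```r
library(MASS)            # painters ships with the MASS package

head(painters, 3)        # first three rows only
dim(painters)            # number of rows and columns
sapply(painters, class)  # storage class of each column
```

The School column should be reported as a factor, which is how R stores qualitative variables.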

The last column, School, contains the school classification of each painter. The schools are labelled A, B, ..., H, and the School variable is qualitative.

> painters$School 
 [1] A A A A A A A A A A B B B B B B C C C C C C D D D D 
[27] D D D D D D E E E E E E E F F F F G G G G G G G H H 
[53] H H 
Levels: A B C D E F G H
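The Levels line in the output above indicates that School is stored as a factor. As a short sketch, the following calls confirm this and list the classes; the variable name school is only a local convenience.

```r
library(MASS)

school <- painters$School

is.factor(school)   # TRUE: School is stored as a factor
levels(school)      # the eight class labels, "A" through "H"
nlevels(school)     # 8
length(school)      # 54, one value per painter
```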

For further details of the painters data set, please consult the R documentation.

> help(painters)