A data frame is used for storing data tables. It is a list of vectors of equal length. For example, the following variable df is a data frame containing three vectors n, s, b.
> s = c("aa", "bb", "cc")
> b = c(TRUE, FALSE, TRUE)
> df = data.frame(n, s, b) # df is a data frame
We use built-in data frames in R for our tutorials. For example, here is a built-in data frame in R, called mtcars.
mpg cyl disp hp drat wt ...
Mazda RX4 21.0 6 160 110 3.90 2.62 ...
Mazda RX4 Wag 21.0 6 160 110 3.90 2.88 ...
Datsun 710 22.8 4 108 93 3.85 2.32 ...
The top line of the table, called the header, contains the column names. Each horizontal line afterward denotes a data row, which begins with the name of the row, and then followed by the actual data. Each data member of a row is called a cell.
To retrieve data in a cell, we would enter its row and column coordinates in the single square bracket "" operator. The two coordinates are separated by a comma. In other words, the coordinates begins with row position, then followed by a comma, and ends with the column position. The order is important.
Here is the cell value from the first row, second column of mtcars.
Moreover, we can use the row and column names instead of the numeric coordinates.
Lastly, the number of data rows in the data frame is given by the nrow function.
And the number of columns of a data frame is given by the ncol function.
Further details of the mtcars data set is available in the R documentation.
Instead of printing out the entire data frame, it is often desirable to preview it with the head function beforehand.