Tabulated datasets sometimes have missing values, and one of the features of statistical languages is that they can handle such situations.
In Julia, the DataFrames package was developed to treat such cases, and it is the subject of this chapter.
The package extends the Julia base by adding three new types:
NA is introduced in order to represent a missing value. This type has only one value, NA.
DataArray is a type that emulates Julia's standard Array type, but is able to store missing values in the array.
DataFrame is a type that is capable of representing tabular datasets such as those found in typical databases or spreadsheets. The concept of the data frame is most evident in the R language and is one of the cornerstones of its popularity.
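The three types can be seen together in a short session. This is only a minimal sketch: it assumes a legacy release of DataFrames (the 0.x series, in which NA, DataArray, the @data macro, isna, and dropna were still available), and the variable and column names (x, da, df, id, score) are made up for illustration.

```julia
# Assumption: a legacy DataFrames release that still provides NA and DataArray.
using DataFrames

# NA is the single value used to mark a missing entry.
x = NA
isna(x)                      # true

# @data builds a DataArray, which can hold NA alongside ordinary values.
da = @data([1.0, NA, 3.0])
isna(da)                     # element-wise test for missing entries
dropna(da)                   # ordinary Array with the NA entries removed

# A DataFrame is a table of named columns; a column may contain NA.
df = DataFrame(id = [1, 2, 3], score = da)
```

Later releases of the package replaced this machinery, so the sketch applies only to versions matching the description in this chapter.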
Except for its ability to store NA values, the DataArray type is meant to behave exactly like Julia's standard Array type. In particular, DataArray...