Loading data
In Rattle, you have to explicitly declare the role of each variable. A variable can have five different roles:
Input: The prediction process will use input variables to predict the value of the target variable.
Target: The target variable is the output of our model.
Risk: The risk variable is a measure of the size of the risk associated with the target variable.
Ident or Identifier: An identifier is a variable that identifies a unique occurrence of an object. In our preceding example, the variable Person is an identifier that identifies a unique person.
Ignore: A variable marked Ignore is excluded from the model. We'll come back to this role later; some variables create noise and decrease the performance of your predictive model.
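To make the five roles concrete, here is a minimal Python sketch of the bookkeeping that role assignment involves. It is an illustration only and does not use Rattle or its API. The column names other than Person are hypothetical, and the `Role` enum and `columns_with` helper are names invented for this example.

```python
from enum import Enum


class Role(Enum):
    """The five variable roles described above."""
    INPUT = "Input"
    TARGET = "Target"
    RISK = "Risk"
    IDENT = "Ident"
    IGNORE = "Ignore"


# Hypothetical dataset columns; only Person comes from the text's example.
roles = {
    "Person": Role.IDENT,     # identifies a unique person, not used to predict
    "Age": Role.INPUT,        # used to predict the target
    "Income": Role.INPUT,     # used to predict the target
    "Comment": Role.IGNORE,   # excluded from the model
    "Approved": Role.TARGET,  # the value the model should predict
}


def columns_with(role, assignment):
    """Return the column names assigned to the given role, in declaration order."""
    return [name for name, assigned in assignment.items() if assigned is role]


inputs = columns_with(Role.INPUT, roles)
target = columns_with(Role.TARGET, roles)[0]  # this sketch assumes one target
ignored = columns_with(Role.IGNORE, roles)

print("Inputs:", inputs)
print("Target:", target)
print("Ignored:", ignored)
```

Only the columns marked as inputs and the target would be passed to a model. The identifier and ignored columns stay in the dataset but play no part in the prediction.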
Rattle can load data from many data sources. Here are some options:
Use the Spreadsheet option to load data from a comma-separated values (CSV) file.
Open Database Connectivity (ODBC) is a standard API for connecting to databases. Using this standard, you can load data from most common databases...
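As a rough illustration of what loading a CSV file involves, the following Python sketch reads a header row and the records beneath it using only the standard library. It is not what Rattle runs internally. The sample data and the `load_csv` helper are hypothetical, and an in-memory string stands in for a file on disk.

```python
import csv
import io

# Hypothetical CSV content standing in for a file on disk.
SAMPLE_CSV = """Person,Age,Income,Approved
1,34,52000,yes
2,45,61000,no
"""


def load_csv(handle):
    """Read CSV text from a file-like object.

    Returns the column names from the header row and a list of rows,
    each row being a dict that maps column name to its string value.
    """
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows


columns, rows = load_csv(io.StringIO(SAMPLE_CSV))

print("Columns:", columns)
print("Number of rows:", len(rows))
```

To read a real file, pass `open(path, newline="")` in place of the `io.StringIO` object. Every value arrives as a string, so numeric columns such as Age need converting before modeling.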