Datasource
Datasource is a term used for all the technology related to the extraction and storage of data. A datasource can be anything from a simple text file to a large database. The raw data can come from observation logs, sensors, transactions, or user behavior.
In this section we will look at the most common forms of datasources and datasets.
A dataset is a collection of data, usually presented in tabular form. Each column represents a particular variable, and each row corresponds to a given member (instance) of the dataset, as shown in the following figure:

A dataset represents a physical implementation of a datasource; the common features of a dataset are as follows:
- Dataset characteristics (such as multivariate or univariate)
- Number of instances
- Area (for example, life, business, and so on)
- Attribute characteristics (for example, real, categorical, or nominal)
- Number of attributes
- Associated tasks (such as classification or clustering)
- Missing values
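As a rough illustration, several of these features can be computed directly from a small tabular dataset. The following is a minimal sketch using only the Python standard library; the sample data, the `describe` and `is_real` helpers, and the rule that a column is "real" when every present value parses as a number are all assumptions made for this example, not part of any particular tool.

```python
import csv
import io

# Hypothetical sample dataset: one row per instance, one column per attribute.
# An empty field stands for a missing value.
RAW = """name,age,city
Ann,34,Lima
Bob,,Oslo
Cy,29,
"""


def is_real(value):
    """Return True if the text can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def describe(text):
    """Summarize a CSV dataset using the features listed above."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    attributes = reader.fieldnames or []

    kinds = {}
    missing = 0
    for name in attributes:
        # Keep only the values that are actually present in this column.
        present = [row[name] for row in rows if row[name] not in ("", None)]
        missing += len(rows) - len(present)
        # Simplifying assumption: numeric-looking columns are "real",
        # everything else is treated as "categorical".
        all_numeric = bool(present) and all(is_real(v) for v in present)
        kinds[name] = "real" if all_numeric else "categorical"

    return {
        "instances": len(rows),
        "attributes": len(attributes),
        "characteristics": "multivariate" if len(attributes) > 1 else "univariate",
        "attribute_kinds": kinds,
        "missing_values": missing,
    }


summary = describe(RAW)
for feature, value in summary.items():
    print(feature, "=", value)
```

Features such as the area and the associated tasks describe how the dataset is meant to be used, so they cannot be derived from the values alone and are normally recorded by whoever publishes the dataset.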
Open data
Open data is data that can be used...