Encoding Categorical Variables
Categorical variables are those whose values are selected from a group of categories or labels. For example, the Home owner variable, with the values owner and non-owner, is categorical, and so is the Marital status variable, with the values never married, married, divorced, and widowed. In some categorical variables, the labels have an intrinsic order; for example, in the Student's grade variable, the values A, B, C, and Fail are ordered, with A being the highest grade and Fail the lowest. These are called ordinal categorical variables. Variables whose categories have no intrinsic order, such as the City variable with the values London, Manchester, Bristol, and so on, are called nominal categorical variables.
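The difference between ordinal and nominal variables can be illustrated with a minimal plain-Python sketch. The sample values and the names GRADE_ORDER, grade_rank, and city_levels are hypothetical and chosen only for illustration; the grade order follows the Student's grade example above.

```python
# Hypothetical sample data; names and values are illustrative only.

# Ordinal: the labels have an intrinsic order, listed here from lowest to highest.
GRADE_ORDER = ["Fail", "C", "B", "A"]
grade_rank = {grade: rank for rank, grade in enumerate(GRADE_ORDER)}

grades = ["B", "Fail", "A", "C", "A"]

# Because the order is meaningful, sorting and taking the maximum make sense.
sorted_grades = sorted(grades, key=grade_rank.__getitem__)
best_grade = max(grades, key=grade_rank.__getitem__)

# Nominal: the labels have no intrinsic order, so only equality and
# membership are meaningful. Sorting here is alphabetical and serves
# only to produce a stable listing of the distinct categories.
cities = ["London", "Bristol", "Manchester", "London"]
city_levels = sorted(set(cities))

print(sorted_grades)
print(best_grade)
print(city_levels)
```

For the grades, the rank captures real information (A is higher than B). For the cities, any ordering is arbitrary, so a question such as "which city is greater" has no meaningful answer.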
The values of categorical variables are often encoded as strings. To train most machine learning models, we need to transform those strings into numbers. The act of replacing strings with numbers is called categorical...