Data analysis process methodologies
In the last decade, the data analysis field has grown rapidly, and considerable effort has gone into establishing standard methodologies for data analysis and for developing applications based on it. In this section, we will discuss several process methodologies: KDD, SEMMA, CRISP-DM, and the standard process. These methodologies share a few overlapping or similar steps, but their objectives differ.
Knowledge discovery from data (KDD)
KDD stands for Knowledge Discovery from Data. The term is often used interchangeably with data mining, the practice of discovering patterns in data and using them to derive knowledge. The primary objective of the KDD process is to find hidden patterns in the given data sources. The KDD process has seven main phases:
- Data cleaning: In this first phase, we handle noisy data, missing values, duplicates, and outliers in the given dataset.
- Data integration: Then, data migration...