ANALYZING MISSING DATA
This section contains subsections that describe types of missing data, common causes of missing data, and various ways to impute values for missing data. Keep in mind that outlier detection, fraud detection, and anomaly detection pertain to analyzing existing data.
By contrast, missing data presents a different issue, which raises the following question: what can you do about the missing values? Is it better to discard data points (e.g., rows in a CSV file) that contain missing values, or is it better to estimate reasonable replacement values? Also keep in mind that missing data can undermine a thorough analysis of a dataset, whereas erroneous data can increase bias and uncertainty.
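The two options in the question above (discarding rows versus estimating replacement values) can be sketched with pandas. This is a minimal illustration, not a recommendation: the toy DataFrame, its column names, and the choice of the column mean as the replacement value are all assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "income": [50000.0, 62000.0, np.nan, 58000.0],
})

# Count the missing values in each column before deciding what to do.
missing_counts = df.isna().sum()

# Option 1: discard every row that has at least one missing value.
dropped = df.dropna()

# Option 2: replace each missing value with the mean of its column
# (the mean is an assumption here; it is only one possible estimate).
imputed = df.fillna(df.mean())

print(missing_counts)
print(dropped)
print(imputed)
```

In this toy case, dropping rows removes half of the data, while mean imputation keeps every row but inserts values that were never observed. Which trade-off is acceptable depends on the dataset.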
At this point you’ve undoubtedly realized that a single solution does not exist for every dataset: you need to perform an analysis on a case-by-case basis, after you have learned some of the techniques that might help you effectively address...