Exploratory Data Analysis
The previous chapter covered the basic plotting principles using ggplot2, including the use of various geometries and theme layers. Cleaning and massaging the raw data (covered in Chapter 2 and Chapter 3) and visualizing the data (covered in Chapter 4) both belong to the first stage of a typical data science project workflow: exploratory data analysis (EDA). In this chapter, we will work through a few case studies, applying the coding techniques covered earlier in this book and analyzing the data through the lens of EDA.
By the end of this chapter, you will know how to uncover the structure of a dataset using numerical and graphical techniques, discover interesting relationships among variables, and spot unusual observations.
In this chapter, we will cover the following topics:
- EDA fundamentals
- EDA in practice