Summary
In this chapter, we learned how choosing the most appropriate visualization(s) depends on four key elements:
- The nature of the features in a dataset: categorical/discrete or numerical/continuous
- The size of the dataset: small/medium/large
- The density of the data points in the chosen feature space: whether too many or too few data points fall on certain feature values
- The context of the visualization: the source of the dataset and frequently used visualizations for the given application
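As a rough illustration of how the first three elements might combine, here is a minimal sketch of a helper that suggests candidate plots for a pair of features. The function name suggest_plots, the dense_threshold parameter, and its default value are all hypothetical choices made for this example rather than rules from the chapter, and the fourth element (context) is deliberately left to the reader's judgment.

```python
def suggest_plots(x_kind, y_kind, n_points, dense_threshold=5000):
    """Suggest candidate plot types for a pair of features.

    x_kind, y_kind: "categorical" or "numerical".
    n_points: number of data points to be plotted.
    dense_threshold: hypothetical cutoff (an assumption for this sketch)
        above which individual markers are assumed to overlap too much.
    """
    kinds = {x_kind, y_kind}
    if not kinds <= {"categorical", "numerical"}:
        raise ValueError("feature kinds must be 'categorical' or 'numerical'")

    if kinds == {"numerical"}:
        # Two numerical features: individual points for small data,
        # aggregated views once the points would overplot.
        if n_points > dense_threshold:
            return ["hexbin plot", "contour plot"]
        return ["scatter plot"]

    if kinds == {"categorical"}:
        # Two categorical features: show a value per combination of levels.
        return ["heatmap"]

    # One categorical and one numerical feature: compare distributions
    # of the numerical feature across categories.
    return ["box plot", "violin plot"]


print(suggest_plots("numerical", "numerical", n_points=200))
print(suggest_plots("numerical", "numerical", n_points=50_000))
print(suggest_plots("categorical", "numerical", n_points=1_000))
```

The returned lists are starting points only; the appropriate threshold depends on the figure size, marker size, and how the points are spread over the feature space.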
For the purpose of explaining the concepts clearly and defining certain general guidelines, we classified visualizations into two categories:
- Plots representing the global patterns of the chosen features (for example, histograms, scatter plots, hexbin plots, contour plots, line plots, and heatmaps)
- Plots representing the summary statistics of specific features (for example, box plots and violin plots)
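To make the distinction between the two categories concrete, the following sketch computes, on the same data, the bin counts that a histogram would draw and a five-number summary of the kind a box plot is built on. The data is synthetic (two Gaussian clusters generated with a fixed seed), and the helper names five_number_summary and histogram_counts are hypothetical, introduced only for this example.

```python
import random
import statistics

random.seed(0)

# Synthetic bimodal sample (an assumption for illustration):
# half the points cluster around 2, the other half around 8.
values = [random.gauss(2.0, 0.5) for _ in range(500)]
values += [random.gauss(8.0, 0.5) for _ in range(500)]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max), the numbers behind a basic box plot."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)


def histogram_counts(data, n_bins=10):
    """Count the points falling into n_bins equal-width bins, as a histogram does."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in data:
        # Clamp so that the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


summary = five_number_summary(values)
counts = histogram_counts(values)

print("five-number summary:", [round(s, 2) for s in summary])
print("histogram bin counts:", counts)
```

On data like this, the bin counts expose the two clusters and the nearly empty region between them, while the five numbers alone give no hint that the median sits in that sparse region. This is one reason to look at both kinds of plot before settling on one.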
We are not implying that a single best visualization must be determined right away for any given application; for most datasets, the best visualizations will likely emerge from testing different kinds of plots and carefully examining the insights derived from each of them. This chapter provided the background needed to interpret and use a range of visualization types, both the popular ones and the less common but still informative ones. In the next chapter, we will build on this foundation to introduce interactivity into our visualizations.