Summary
This chapter focused on defining datasets from existing datasets. It began by highlighting how SELECT queries produce two-dimensional relations and how these relations can be used as tables in a database by creating a subquery. To avoid duplicated coding and to simplify complex subqueries, you can build CTEs and views based on the subqueries.
Then, it introduced the concept of joining tables to combine data from multiple tables into one dataset. The basics of joins were explained, including inner joins and outer joins. Finally, this chapter discussed set operations, including UNION, UNION ALL, INTERSECT, and EXCEPT. With these topics, you will be able to build advanced research by combining data from multiple tables out of complex relationships.
In the next chapter, we will discuss how to aggregate data from raw tables into grouped and distilled datasets.