Exploring and understanding the dataset
As we learned in Chapter 4, Predicting Numerical Values with Linear Regression, we need to analyze the available data before diving into the ML implementation. We begin by building a clear understanding of the data that can be used for our business scenario.
Understanding the data
To start exploring the data, we need to do the following:
- Log in to Google Cloud Console and access the BigQuery user interface from the navigation menu.
- Create a new dataset in the project that we created in Chapter 2, Setting Up Your GCP and BigQuery Environment. For this use case, we'll create the 05_chicago_taxi dataset with the default options.
- Open the bigquery-public-data GCP project that hosts all the BigQuery public datasets and browse the items until you find the chicago_taxi_trips dataset. In this public dataset, we can see only one BigQuery table: taxi_trips. This table contains all the information...
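The console steps above can also be approximated in SQL. The following is a minimal sketch, not the book's own method: it builds one statement that creates the 05_chicago_taxi dataset and one that lists the columns of the taxi_trips table through the dataset's INFORMATION_SCHEMA.COLUMNS view. The project ID my-project and the helper names create_dataset_sql, list_columns_sql, and run are hypothetical and used only for illustration; the optional run helper assumes that the google-cloud-bigquery package is installed and that credentials are already configured.

```python
"""Sketch: SQL equivalents of the console steps (helper names are illustrative)."""

PROJECT_ID = "my-project"        # hypothetical: replace with your own project ID
DATASET_ID = "05_chicago_taxi"   # dataset name used in this chapter
PUBLIC_PROJECT = "bigquery-public-data"
PUBLIC_DATASET = "chicago_taxi_trips"
PUBLIC_TABLE = "taxi_trips"


def create_dataset_sql(project_id: str, dataset_id: str) -> str:
    """Build a DDL statement that creates the dataset if it does not exist."""
    # Backticks keep identifiers with hyphens or a leading digit valid.
    return f"CREATE SCHEMA IF NOT EXISTS `{project_id}.{dataset_id}`"


def list_columns_sql(project: str, dataset: str, table: str) -> str:
    """Build a query that lists the column names and data types of a table."""
    return (
        "SELECT column_name, data_type\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        f"WHERE table_name = '{table}'\n"
        "ORDER BY ordinal_position"
    )


def run(sql: str, project_id: str = PROJECT_ID):
    """Optionally execute a statement (assumes google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # imported lazily so the sketch runs without it

    client = bigquery.Client(project=project_id)
    return list(client.query(sql).result())


if __name__ == "__main__":
    # Print the statements so they can be pasted into the BigQuery query editor.
    print(create_dataset_sql(PROJECT_ID, DATASET_ID))
    print(list_columns_sql(PUBLIC_PROJECT, PUBLIC_DATASET, PUBLIC_TABLE))
```

Printing the statements rather than executing them keeps the sketch usable without any cloud setup; the output can be pasted into the BigQuery query editor, or passed to run once a real project ID is in place.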