Loading data from Amazon S3 using COPY
Amazon Redshift supports several data model structures, including dimensional, denormalized, and aggregate (rollup) structures, which makes it well suited to analytics workloads.
In this recipe, we will load two publicly available sample datasets into Amazon Redshift:
- A dimensional model using the Star Schema Benchmark (SSB) (https://www.cs.umb.edu/~poneil/StarSchemaB.PDF), a dataset modeled on a retail system
- A denormalized model using an Amazon.com customer product reviews dataset
To load the datasets, we will use the COPY command, which copies data from Amazon S3 into an Amazon Redshift data warehouse (serverless or provisioned cluster). COPY is the recommended way to load large volumes of data.
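As a rough illustration of the general shape of a COPY statement, here is a minimal sketch. The table name, bucket, prefix, IAM role ARN, and Region are all hypothetical placeholders and not the values used in this recipe. The sketch also assumes pipe-delimited text files, which may not match your data.

```sql
-- Hypothetical names throughout: replace the table, bucket, prefix,
-- account ID, role name, and Region with values from your environment.
COPY my_schema.my_table
FROM 's3://<your-bucket>/<your-prefix>/'               -- every object under this prefix is loaded
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'  -- role that grants read access to the bucket
DELIMITER '|'                                          -- assumes pipe-delimited text files
REGION '<bucket-region>';                              -- needed only when the bucket is in a different Region than the warehouse
```

Pointing FROM at a prefix rather than a single object lets Amazon Redshift read the files under that prefix in parallel, which is a large part of why COPY suits bulk loads.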
Getting ready
To complete this recipe, you will need:
- An Amazon Redshift data warehouse deployed in AWS region eu-west-1
- Amazon Redshift data warehouse admin user credentials
- Access to any...