You're reading from Data Engineering with AWS - Second Edition

Product type: Book
Published in: Oct 2023
Publisher: Packt
ISBN-13: 9781804614426
Edition: 2nd Edition
Author (1)
Gareth Eagar

Gareth Eagar has over 25 years of experience in the IT industry, starting in South Africa, working in the United Kingdom for a while, and now based in the USA. Having worked at AWS since 2017, Gareth has broad experience with a variety of AWS services, and deep expertise around building data platforms on AWS. While Gareth currently works as a Solutions Architect, he has also worked in AWS Professional Services, helping architect and implement data platforms for global customers. Gareth frequently speaks on data-related topics.

Visualizing Data with Amazon QuickSight

In Chapter 11, Ad Hoc Queries with Amazon Athena, we looked at how Amazon Athena enables data analysts to run ad hoc queries against data in the data lake using the power of SQL and Spark. While Athena is an extremely powerful tool for querying large datasets, the quickest way to understand a summary of a dataset is often to visualize the data in graphs and dashboards.

In this chapter, we will do a deeper dive into Amazon QuickSight, a business intelligence (BI) tool that enables the creation of rich visualizations that summarize data, with the ability to filter and drill down into datasets in numerous ways. QuickSight also enables the creation of formatted, multi-page reports, and brings advanced functionality, such as the ability to ask questions of data in natural language.

In smaller organizations, a data engineer may be tasked with setting up and configuring a BI tool that data consumers can use. Things may be different...

Technical requirements

At the end of this chapter, you will get hands-on by creating a QuickSight visual from scratch. To complete the steps in the hands-on section, you will need the appropriate user permissions to sign up for a QuickSight subscription.

If you have administrator permissions for your AWS account, these permissions should be sufficient to sign up for a QuickSight subscription. If not, you will need to work with your IAM security team to create a custom policy. See the AWS documentation titled IAM Policy Examples for Amazon QuickSight and refer to the All Access for Standard Edition example policy as a reference.

At the time of writing, Amazon QuickSight includes a free trial subscription for 30 days for new QuickSight subscriptions. If you do not intend to use QuickSight past these 30 days, ensure that your user is also granted the quicksight:Unsubscribe permission so that you can unsubscribe from QuickSight after completing the hands-on section.
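As a rough illustration of what a custom policy containing the quicksight:Unsubscribe permission might look like, the following Python sketch builds an IAM policy document as a dictionary and prints it as JSON. This is not the All Access for Standard Edition example policy from the AWS documentation, which includes additional actions; the function name and the decision to allow all resources are assumptions made for illustration, so check the AWS documentation and your IAM security team's requirements before using anything like it.

```python
import json


def build_quicksight_signup_policy() -> dict:
    """Return an illustrative IAM policy document.

    Assumption: this only covers subscribing to and unsubscribing from
    QuickSight. The example policy in the AWS documentation lists more
    actions than are shown here.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "quicksight:Subscribe",    # sign up for a subscription
                    "quicksight:Unsubscribe",  # cancel it after the hands-on
                ],
                "Resource": "*",  # assumption: not scoped to a resource
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so that it can be reviewed before use.
    print(json.dumps(build_quicksight_signup_policy(), indent=2))
```

The printed JSON can be handed to whoever manages IAM in your account as a starting point for discussion, rather than applied as-is.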

...

Representing data visually for maximum impact

Data lakes are designed to capture large amounts of raw data and enable the processing of that data to draw out new insights that provide business value. The insights that are gained from a data lake can be represented in many ways, such as reports that summarize sales data and top sales items, machine learning (ML) models that can predict future trends, and visualizations and dashboards that effectively summarize data. Each of these ways of representing data offers different benefits, depending on the business purpose:

  • If you’re a data analyst who needs to report sales figures, profit margins, inventory levels, and other data for each category of product a company produces, you would probably want access to detailed tabular data. You would want to use SQL to run detailed queries against the data and draw varied insights that you can provide to different departments within the organization.
  • If...

Understanding Amazon QuickSight’s core concepts

At its core, QuickSight lets us ingest data from a wide variety of sources, perform some filtering or other transformation tasks on the data, and then create either dashboards with multiple types of visuals that can be easily shared with others, or highly formatted multi-page PDF reports.

The QuickSight service is fully managed by AWS, and there are no upfront costs for using the service. Instead, the service uses a pricing model of a monthly cost per user and offers both Standard and Enterprise editions. To include specific functionality, such as QuickSight Q (for making natural language queries of data), a higher price per user is charged. There is also an option for capacity pricing, where you pay for the number of sessions per month, or per year, instead of per user.

QuickSight also includes a powerful in-memory storage and computation engine to enable the best performance for working with a variety of data sources. In...

Ingesting and preparing data from a variety of sources

Amazon QuickSight can use other AWS services as a source, as well as on-premises databases, imported files, and even some Software as a Service (SaaS) applications.

For example, you can easily connect to Oracle, Microsoft SQL Server, Postgres, and MySQL databases, either running as part of the Amazon RDS managed database service, or as instances running on Amazon EC2 or in your own data centers. You can also connect to data warehouse systems such as Amazon Redshift, Snowflake, and Teradata. Other AWS services are also supported as data sources, including Amazon S3, Amazon Athena, Amazon OpenSearch Service, Amazon Aurora, and AWS IoT Analytics.

In addition to these traditional data sources, QuickSight can also connect to various SaaS offerings, including ServiceNow, Jira, Adobe Analytics, Salesforce, GitHub, and Twitter.

Data stored in files, such as a Microsoft Excel Spreadsheet (XLSX files), JSON documents, and CSV...

Creating and sharing visuals with QuickSight analyses and dashboards

Once a dataset has been imported (and optionally transformed), you can create visualizations of this data using QuickSight analyses. This is the tool that is used by QuickSight authors to create new dashboards, with these dashboards containing one or more visualizations that can be shared with others in the business.

When you create a new analysis/dashboard, you choose one or more datasets to include in the analysis (up to a maximum of 50 datasets per dashboard). Each analysis consists of one or more sheets (or tabs, much like browser tabs) that display a group of visualizations. You can have up to 20 sheets (tabs) per dashboard, and each sheet can have up to 30 visualizations.
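To make these limits concrete, here is a minimal Python sketch that checks a planned analysis against the numbers quoted above (50 datasets, 20 sheets, and 30 visualizations per sheet). The `AnalysisPlan` class and `check_limits` function are hypothetical helpers invented for this illustration and are not part of any QuickSight API; the limits themselves may change over time, so treat the constants as a snapshot of the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Limits as quoted in the text; verify against current AWS documentation.
MAX_DATASETS = 50
MAX_SHEETS = 20
MAX_VISUALS_PER_SHEET = 30


@dataclass
class AnalysisPlan:
    """Hypothetical description of an analysis we intend to build."""
    datasets: list[str]
    # Maps a sheet name to the number of visuals planned for that sheet.
    visuals_per_sheet: dict[str, int] = field(default_factory=dict)


def check_limits(plan: AnalysisPlan) -> list[str]:
    """Return a list of human-readable problems (empty if the plan fits)."""
    problems = []
    if not plan.datasets:
        problems.append("an analysis needs at least one dataset")
    if len(plan.datasets) > MAX_DATASETS:
        problems.append(f"{len(plan.datasets)} datasets exceeds {MAX_DATASETS}")
    if len(plan.visuals_per_sheet) > MAX_SHEETS:
        problems.append(f"{len(plan.visuals_per_sheet)} sheets exceeds {MAX_SHEETS}")
    for sheet, count in plan.visuals_per_sheet.items():
        if count > MAX_VISUALS_PER_SHEET:
            problems.append(
                f"sheet '{sheet}' has {count} visuals, exceeds {MAX_VISUALS_PER_SHEET}"
            )
    return problems


if __name__ == "__main__":
    plan = AnalysisPlan(
        datasets=["sales", "inventory"],
        visuals_per_sheet={"Overview": 12, "Detail": 31},
    )
    for problem in check_limits(plan):
        print(problem)
```

A check like this is only useful at the planning stage; QuickSight itself enforces the limits when you author the analysis.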

Once you have created an analysis (consisting of multiple visuals, optionally across multiple sheets), you can choose to publish the analysis as a dashboard. When you’re publishing a dashboard, you can select various parameters...

Understanding QuickSight’s advanced features

The Enterprise edition of Amazon QuickSight includes advanced features that can help you draw out additional insights from your data, ask questions of your data using natural language, and enable you to widely share your data by embedding dashboards into applications. We will review some of these features next.

Amazon QuickSight ML Insights

QuickSight ML Insights uses the power of ML algorithms to automatically uncover insights and trends, forecast future data points, and identify anomalies in your data.

All of these ML Insights functionalities can easily be added to an analysis/dashboard without the author needing to have any ML experience or any real understanding of the underlying ML algorithms. However, for those who are interested in the underlying ML algorithms used by QuickSight, Amazon provides comprehensive documentation on this topic.

Review the Amazon QuickSight documentation titled Understanding the...

Hands-on – creating a simple QuickSight visualization

Earlier in this chapter, we discussed how data can be represented over a geographic area. We used the example of data containing information on the population of world cities, and how we could use that to easily visualize how large cities are geographically distributed. The example visual in Figure 12.2 showed cities with a population of over 5 million people, displayed on top of a map of the world.

For the hands-on section of this chapter, we are going to recreate that visual using Amazon QuickSight.
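Before building the visual in QuickSight, it may help to see the underlying filter expressed in code. The following Python sketch keeps only cities with a population of over 5 million and returns their coordinates, which is roughly the data that a map visual plots. The column names (`city`, `lat`, `lng`, `country`, `population`) are an assumption about the dataset's layout, and the sample rows are made-up placeholders rather than real city data.

```python
import csv
import io

# Made-up sample rows; the column names are an assumption about the
# layout of the downloaded dataset.
SAMPLE_CSV = """city,lat,lng,country,population
Alphaville,10.0,20.0,Exampleland,7500000
Betatown,-5.5,30.25,Exampleland,1200000
Gammaport,40.0,-75.0,Samplestan,5000001
Deltaburg,51.0,0.5,Samplestan,
"""


def large_cities(csv_text: str, min_population: int = 5_000_000) -> list[dict]:
    """Return cities whose population is strictly above min_population."""
    reader = csv.DictReader(io.StringIO(csv_text))
    result = []
    for row in reader:
        raw = row["population"].strip()
        if not raw:
            # Skip rows with no population value rather than guessing.
            continue
        population = int(float(raw))
        if population > min_population:
            result.append(
                {
                    "city": row["city"],
                    "lat": float(row["lat"]),
                    "lng": float(row["lng"]),
                    "population": population,
                }
            )
    return result


if __name__ == "__main__":
    for city in large_cities(SAMPLE_CSV):
        print(city["city"], city["lat"], city["lng"], city["population"])
```

In QuickSight, the same result is achieved without code by adding a filter on the population field and placing the latitude and longitude fields on a map visual.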

Setting up a new QuickSight account and loading a dataset

Before we start creating a new dashboard, we need to download a sample dataset of world city populations. We will use the basic dataset available from https://simplemaps.com/, which is freely distributed under the Creative Commons Attribution 4.0 license (https://creativecommons.org/licenses/by/4.0/):

  1. Use the following link to download the basic...

Summary

In this chapter, you learned more about the Amazon QuickSight service, a BI tool that is used to create and share rich visualizations of data.

We discussed the power of visually representing data, and then explored core Amazon QuickSight concepts. We looked at how various data sources can be used with QuickSight, how data can optionally be imported into the SPICE storage engine, and how you can perform some data preparation tasks using QuickSight.

We then did a deeper dive into the concepts of analyses (where new visuals are authored) and dashboards (published analyses that can be shared with data consumers). As part of this, we also examined some of the common types of visualizations available in QuickSight.

We then looked at some of the advanced features available in QuickSight. This included ML Insights (which uses ML to detect outliers in data and forecast future data trends), QuickSight Q (which enables the use of natural language queries to create visualizations...
