You're reading from  Data Wrangling on AWS

Product type: Book
Published in: Jul 2023
Publisher: Packt
ISBN-13: 9781801810906
Edition: 1st Edition
Authors (3):
Navnit Shukla

Navnit Shukla is an accomplished Senior Solution Architect with a specialization in AWS analytics. With an impressive career spanning 12 years, he has honed his expertise in databases and analytics, establishing himself as a trusted professional in the field. Currently based in Orange County, CA, Navnit's primary responsibility lies in assisting customers in building scalable, cost-effective, and secure data platforms on the AWS cloud.

Sankar M

Sankar Sundaram has been working in the IT industry since 2007, specializing in databases, data warehouses, and analytics. As a specialized Data Architect, he helps customers build and modernize data architectures, including secure, scalable, and performant data lake, database, and data warehouse solutions. Prior to joining AWS, he worked with multiple customers on implementing complex data architectures.

Sampat Palani

Sam Palani has more than 18 years of experience as a developer, data engineer, data scientist, startup cofounder, and IT leader. He holds a master's in Business Administration with a dual specialization in Information Technology. His professional career spans 5 countries and the financial services, management consulting, and technology industries. He is currently Sr Leader for Machine Learning and AI at Amazon Web Services, where he is responsible for multiple lines of the business, product strategy, and thought leadership. Sam is also a practicing data scientist, a writer with multiple publications, a speaker at key industry conferences, and an active open source contributor. Outside work, he loves hiking, photography, experimenting with food, and reading.

Working with QuickSight

In the previous chapter, you learned about the Amazon Athena service and how it can be used effectively for data discovery, data enrichment, and data quality pipelines. In this chapter, we will explore how the Amazon QuickSight service can help with the data discovery and data visualization phases of the data-wrangling pipeline.

This chapter covers the following topics:

  • Introducing Amazon QuickSight and its concepts
  • Data discovery using QuickSight
  • Data visualization using QuickSight

Introducing Amazon QuickSight and its concepts

Amazon QuickSight is a Business Intelligence (BI) service that helps you to generate interactive dashboards with visuals/charts from a wide variety of data sources, such as the cloud, on-premises, and third-party services. In addition, Amazon QuickSight provides advanced features such as embedded analytics and an AI-enabled search bar.

Why do we need a data visualization tool?

As mentioned previously, a picture is worth a thousand words. Users can understand patterns in data more easily from visuals and charts than from tabular data. Also, once data insights are identified, they can be shared with business and executive teams through dashboards to convey the message more effectively.

How does QuickSight work?

QuickSight loads data from supported data sources, and users can prepare the data within QuickSight to suit their visualization requirements. The data is retrieved locally from the Super-fast, Parallel, In-memory Calculation Engine...

Data discovery with QuickSight

Amazon QuickSight supports loading data from various data sources, and we can then create visuals in the Analyses tab to understand the data. Data discovery can also be done using Jupyter notebooks with custom visualization libraries, but that approach might require programming expertise and a complex setup before any data discovery work can begin. In contrast, business users can perform data discovery in QuickSight with visuals in the Analyses tab.

QuickSight-supported data sources and setup

QuickSight supports a wide variety of data sources. The complete list can be found at https://docs.aws.amazon.com/quicksight/latest/user/supported-data-sources.html.

The sources can be classified into the following broad categories:

  • Relational data sources: Covering cloud and on-premises data sources, including popular data engines such as MySQL, Postgres, SQL Server, Oracle, Snowflake, and Redshift. When connecting to on-premises data sources, you need...
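
As a rough illustration of what registering a data source involves, the sketch below builds the request for the QuickSight CreateDataSource API operation and targets Amazon Athena, the service covered in the previous chapter. It is a minimal sketch under stated assumptions, not the book's own example: the account ID, data source ID, display name, and the helper name athena_data_source_request are all hypothetical placeholders, and the workgroup defaults to "primary".

```python
from typing import Any, Dict


def athena_data_source_request(account_id: str, data_source_id: str,
                               name: str, workgroup: str = "primary") -> Dict[str, Any]:
    """Build keyword arguments for the QuickSight CreateDataSource operation.

    All identifiers passed in are placeholders chosen by the caller; this
    helper only assembles the request and does not contact AWS.
    """
    return {
        "AwsAccountId": account_id,
        "DataSourceId": data_source_id,
        "Name": name,
        # "ATHENA" is one of the data source types accepted by the API.
        "Type": "ATHENA",
        "DataSourceParameters": {"AthenaParameters": {"WorkGroup": workgroup}},
    }


# Hypothetical usage (requires boto3, AWS credentials, and QuickSight permissions):
#   import boto3
#   request = athena_data_source_request("111122223333", "athena-sales", "Athena sales data")
#   boto3.client("quicksight").create_data_source(**request)
```

Keeping the request construction separate from the API call makes the payload easy to inspect before anything is sent. Other source types would need a different Type value and a matching entry under DataSourceParameters.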

Data visualization with QuickSight

In this section, we will explore the high-level concepts, visuals, and features that are available within QuickSight.

Visualization and charts with QuickSight

The following steps are performed when publishing a dashboard in QuickSight.

Figure 8.24: QuickSight – high-level dashboard publishing workflow

  1. Create data source: This is the step where the connection to the data source is established. This can be a one-time activity and can be reused multiple times when new datasets are created from the data source.
  2. Create dataset: A dataset can be created from a new data source or existing datasets for different analysis requirements. The datasets can be modified, parsed, or enriched based on specific analysis requirements.
  3. Create analysis (create visuals): Create visuals from a specific dataset. Here, the dashboard...
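
The first two steps above can also be scripted. The following is a minimal sketch, assuming the boto3 QuickSight client and its create_data_source and create_data_set operations. The helper names, the "source-table" map key, and every ID, schema, table, and column name are hypothetical. Creating the analysis and its visuals is left out here, since that step is usually done interactively.

```python
from typing import Any, Dict, List


def build_dataset_request(account_id: str, dataset_id: str, name: str,
                          data_source_arn: str, schema: str, table: str,
                          columns: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build keyword arguments for CreateDataSet over one relational table."""
    return {
        "AwsAccountId": account_id,
        "DataSetId": dataset_id,
        "Name": name,
        # "SPICE" imports the data; "DIRECT_QUERY" would query the source live.
        "ImportMode": "SPICE",
        "PhysicalTableMap": {
            "source-table": {  # arbitrary key chosen for this sketch
                "RelationalTable": {
                    "DataSourceArn": data_source_arn,
                    "Schema": schema,
                    "Name": table,
                    "InputColumns": columns,
                }
            }
        },
    }


def create_source_then_dataset(client: Any, source_request: Dict[str, Any],
                               dataset_id: str, dataset_name: str,
                               schema: str, table: str,
                               columns: List[Dict[str, str]]) -> Dict[str, Any]:
    """Step 1: register the data source. Step 2: create a dataset on top of it."""
    source = client.create_data_source(**source_request)
    dataset_request = build_dataset_request(
        source_request["AwsAccountId"], dataset_id, dataset_name,
        source["Arn"],  # the ARN returned by step 1 links the dataset to the source
        schema, table, columns)
    return client.create_data_set(**dataset_request)


# Hypothetical usage (requires boto3, AWS credentials, and QuickSight permissions):
#   import boto3
#   client = boto3.client("quicksight")
#   create_source_then_dataset(
#       client,
#       {"AwsAccountId": "111122223333", "DataSourceId": "athena-sales",
#        "Name": "Athena sales data", "Type": "ATHENA",
#        "DataSourceParameters": {"AthenaParameters": {"WorkGroup": "primary"}}},
#       "sales-orders", "Sales orders", "sales_db", "orders",
#       [{"Name": "order_id", "Type": "STRING"}])
```

Because the data source is registered once and referenced by its ARN, the same source can back several datasets, which mirrors the reuse described in step 1.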

Summary

In this chapter, we explored the Amazon QuickSight service and how it helps in various phases of the data-wrangling pipeline. We learned how various AWS services fit into the data-wrangling pipeline and how they can perform certain data-wrangling activities efficiently.

In the next chapter, we will learn about practical use cases for a data-wrangling pipeline.
