You're reading from  Interactive Dashboards and Data Apps with Plotly and Dash

Product type: Book
Published in: May 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800568914
Edition: 1st Edition
Author: Elias Dabbas

Elias Dabbas is an online marketing and data science practitioner. He produces open-source software for building dashboards, data apps, as well as software for online marketing, with a focus on SEO, SEM, crawling, and text analysis.

Chapter 4: Data Manipulation and Preparation, Paving the Way to Plotly Express

We saw that preparing data can take much more mental effort and code than the process of creating charts. Or, to put it differently, if we invest a good amount of time in preparing our data and making certain decisions about how and what we intend to do with it, the process of visualization can be made much easier. So far, we have used a small part of our dataset and didn't make any changes to its shape or format. And when making our charts, we followed the approach of building them from scratch by creating a figure and then adding different layers and options for traces, titles, and so on.

In this chapter, we will go through a thorough familiarization with the dataset and reshape it to an intuitive and easy-to-use format. This will help us in using a new approach for creating visualizations, using Plotly Express. Instead of starting with an empty rectangle and building layers on top of it, we will...

Technical requirements

Technically, no new packages will be used in this chapter, but because Plotly Express is a major module of Plotly, we can consider it to be a new one. We will also be using pandas extensively for data preparation, reshaping, and general manipulation. This will mainly be done in JupyterLab. Our dataset will consist of the files in the data folder in the root of the repository.

The code files of this chapter can be found on GitHub at https://github.com/PacktPublishing/Interactive-Dashboards-and-Data-Apps-with-Plotly-and-Dash/tree/master/chapter_04.

Check out the following video to see the Code in Action at https://bit.ly/3suvKi4.

Let's start by exploring the different formats in which we can have data, and what we can do about it.

Understanding long format (tidy) data

We have a moderately complex dataset that we will be working with. It consists of four CSV files, containing information on almost all the countries and regions in the world. We have more than 60 metrics spanning more than 40 years, which means that there are quite a lot of options and combinations to choose from.

But before going through the process of preparing our dataset, I'd like to demonstrate our end goal with a simple example, so you have an idea of where we are heading. It will also hopefully show why we are investing time in making those changes.

Plotly Express example chart

Plotly Express ships with a few datasets for practicing and testing certain features whenever you want to do so. They fall under the data module of plotly.express, and calling them as functions returns the respective dataset. Let's take a look at the famous Gapminder dataset:

import plotly.express as px
gapminder = px.data.gapminder()
gapminder...

Understanding the role of data manipulation skills

In practical situations, we rarely have our data in the format that we want; we usually have different datasets that we want to merge, and often, we need to normalize and clean up the data. For these reasons, data manipulation and preparation will always play a big part in any data visualization process. So, we will be focusing on this in this chapter and throughout the book.

The plan for preparing our dataset is roughly the following:

  1. Explore the different files one by one.
  2. Check the available data and data types and explore how each can help us categorize and analyze the data.
  3. Reshape the data where required.
  4. Combine different DataFrames to add more ways to describe our data.
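
The steps above can be sketched on a tiny made-up dataset. The DataFrames, column names, and values below are hypothetical and only stand in for the book's actual files; the pandas calls (`melt`, `merge`) are one common way to reshape and combine data, not necessarily the exact ones used later in the chapter.

```python
import pandas as pd

# Hypothetical wide-format table: one row per country, one column per year
wide = pd.DataFrame({
    'country': ['A', 'B'],
    '2000': [1.0, 2.0],
    '2001': [1.5, 2.5],
})

# Steps 1 and 2: look at the shape and the data types of each column
print(wide.shape)
print(wide.dtypes)

# Step 3: reshape from wide to long (tidy) format, one observation per row
long = wide.melt(id_vars='country', var_name='year', value_name='value')
long['year'] = long['year'].astype(int)

# Step 4: combine with another (hypothetical) table that describes each country
regions = pd.DataFrame({
    'country': ['A', 'B'],
    'region': ['North', 'South'],
})
combined = long.merge(regions, on='country', how='left')
print(combined)
```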

Let's go through these steps right away.

Exploring the data files

We start by reading in the files in the data folder:

import os
import pandas as pd
pd.options.display.max_columns = None
os.listdir('data'...
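
Since the listing above is cut off, here is a minimal, self-contained sketch of the general pattern of listing a folder and reading each CSV file into a DataFrame. The temporary folder, file names, and contents are hypothetical stand-ins created only for this example; they are not the book's data files.

```python
import os
import tempfile

import pandas as pd

# Create a throwaway folder with two tiny CSV files (hypothetical stand-ins)
with tempfile.TemporaryDirectory() as data_dir:
    pd.DataFrame({'country': ['A', 'B'], 'value': [1, 2]}).to_csv(
        os.path.join(data_dir, 'first.csv'), index=False)
    pd.DataFrame({'country': ['A', 'B'], 'region': ['North', 'South']}).to_csv(
        os.path.join(data_dir, 'second.csv'), index=False)

    # Read every CSV file in the folder into a dictionary keyed by file name
    frames = {}
    for name in sorted(os.listdir(data_dir)):
        if name.endswith('.csv'):
            frames[name] = pd.read_csv(os.path.join(data_dir, name))

# Print the name and shape of each DataFrame as a first look at the data
for name, frame in frames.items():
    print(name, frame.shape)
```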

Learning Plotly Express

Plotly Express is a higher-level plotting system, built on top of Plotly. Not only does it handle certain defaults for us, such as labeling axes and legends, but it also enables us to express many attributes of our data through visual aesthetics (size, color, location, and so on). This can be done simply by declaring which attribute we want to express with which column of our data, given a few assumptions about the data structure. So, it mainly gives us the flexibility to approach the problem from the point of view of the data, as mentioned at the beginning of the chapter.

Let's first create a simple DataFrame:

df = pd.DataFrame({
    'numbers': [1, 2, 3, 4, 5, 6, 7, 8],
    'colors': ['blue', 'green', 'orange', 'yellow', 'black', 'gray', 'pink', 'white'],
    'floats': [1.1...

Summary

You now have enough information and have seen enough examples to create dashboards quickly. In Chapter 1, Overview of the Dash Ecosystem, we learned how apps are structured and how to build fully running apps, but without interactivity. In Chapter 2, Exploring the Structure of a Dash App, we explored how interactivity works through callback functions, and we added interactive features to our app. Chapter 3, Working with Plotly's Figure Objects, introduced how Plotly's charts are created, their components, and how to manipulate them to achieve the results you want. Finally, in this chapter, we introduced Plotly Express, a high-level interface to Plotly that is easy to use and, more importantly, follows an intuitive approach that is data-oriented rather than chart-oriented.

One of the most important and biggest parts of creating visualizations is the process of preparing data in certain formats, after which it becomes relatively straightforward to...
