
You're reading from  The Data Visualization Workshop

Product type: Book
Published in: Jul 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781800568846
Edition: 1st Edition
Authors (2):
Mario Döbler

Mario Döbler is a Ph.D. student focusing on deep learning at the University of Stuttgart. He previously interned in deep learning at the Bosch Center for Artificial Intelligence in Silicon Valley, where he used state-of-the-art algorithms to develop cutting-edge products. In his master's thesis, he dedicated himself to applying deep learning to medical data to drive medical applications.

Tim Großmann

Tim Großmann is a computer scientist with an interest in diverse topics, ranging from AI and IoT to security. He previously worked in big data engineering at the Bosch Center for Artificial Intelligence in Silicon Valley and on an Eclipse project for IoT device abstractions in Singapore. He's highly involved in several open-source projects and regularly speaks at tech meetups and conferences about his projects and experiences.

1. The Importance of Data Visualization and Data Exploration

Activity 1.01: Using NumPy to Compute the Mean, Median, Variance, and Standard Deviation of a Dataset

Solution:

  1. Import NumPy:
    import numpy as np
  2. Load the normal_distribution.csv dataset by using the genfromtxt function from NumPy:
    dataset = np.genfromtxt('../../Datasets/normal_distribution.csv', \
                            delimiter=',')
  3. First, print the first two rows of the dataset:
    dataset[0:2]

    The output of the preceding code is as follows:

    Figure 1.57: First two rows of the dataset

  4. Calculate the mean of the third row. Access the third row by using index 2, dataset[2]:
    np.mean(dataset[2])

    The output of the preceding code is as follows:

    100.20466135250001
  5. Index the last element of an ndarray in the same way a regular Python list can be...
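The remaining statistics follow the same pattern as the mean. As a minimal, self-contained sketch, the made-up array below stands in for normal_distribution.csv (the real file is not reproduced in this excerpt):

```python
import numpy as np

# Hypothetical stand-in for normal_distribution.csv: three rows of samples.
dataset = np.array([[ 99.0, 101.0, 100.0,  98.0, 102.0],
                    [105.0,  95.0, 100.0, 110.0,  90.0],
                    [101.0,  99.0, 100.0, 102.0,  98.0]])

row = dataset[2]        # third row, accessed with index 2
print(np.mean(row))     # arithmetic mean
print(np.median(row))   # middle value of the sorted row
print(np.var(row))      # population variance (ddof=0 by default)
print(np.std(row))      # population standard deviation
```

The same calls applied to the real dataset complete the activity's goal of computing the mean, median, variance, and standard deviation.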

2. All You Need to Know about Plots

Activity 2.01: Employee Skill Comparison

Solution:

  1. Bar charts and radar charts are great for comparing multiple variables for multiple groups.
  2. Suggested response: The bar chart is great for comparing the skill attributes of different employees, but it is not the best choice for getting an overall impression of a single employee, because the skills are not displayed directly next to one another.

    The radar chart is great for this scenario because you can both compare performance across employees and directly observe the individual performance for each skill attribute.

  3. Suggested response:

    For both the bar and radar charts, adding a title and labels would help to understand the plots better. Additionally, using different colors for the different employees in the radar chart would help to keep the different employees apart.
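A radar chart with the improvements suggested above (title, labels, one color per employee) can be sketched with Matplotlib's polar projection; the employees, skills, and scores below are made up for illustration:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

skills = ["Python", "SQL", "Statistics", "Visualization", "Communication"]
employees = {"Alice": [4, 3, 5, 4, 3], "Bob": [3, 5, 2, 3, 4]}  # made-up scores

# One angle per skill; repeat the first point to close each polygon.
angles = np.linspace(0, 2 * np.pi, len(skills), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, scores in employees.items():
    values = scores + scores[:1]
    ax.plot(angles, values, label=name)  # a distinct color per employee
    ax.fill(angles, values, alpha=0.1)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(skills)
ax.set_title("Employee Skill Comparison")  # title aids interpretation
ax.legend(loc="upper right")
```

Because each skill sits on its own axis, an employee's overall profile is visible at a glance, which is exactly the advantage over the bar chart noted in the suggested response.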

Activity 2.02: Road Accidents Occurring over Two Decades

Solution:

  1. Suggested...

3. A Deep Dive into Matplotlib

Activity 3.01: Visualizing Stock Trends by Using a Line Plot

Solution:

Visualize a stock trend by using a line plot:

  1. Create an Activity3.01.ipynb Jupyter notebook in the Chapter03/Activity3.01 folder to implement this activity.
  2. Import the necessary modules and enable plotting within the Jupyter notebook:
    # Import statements
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    %matplotlib inline
  3. Use pandas to read the datasets (GOOGL_data.csv, FB_data.csv, AAPL_data.csv, AMZN_data.csv, and MSFT_data.csv) located in the Datasets folder. The read_csv() function reads a .csv file into a DataFrame:
    # load datasets
    google = pd.read_csv('../../Datasets/GOOGL_data.csv')
    facebook = pd.read_csv('../../Datasets/FB_data.csv')
    apple = pd.read_csv('../../Datasets/AAPL_data.csv')
    amazon = pd.read_csv('../../Datasets/AMZN_data.csv')
    microsoft = pd.read_csv('../../Datasets/MSFT_data.csv...
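With the DataFrames loaded, the line plot itself follows the pattern below. This sketch substitutes randomly generated closing prices for the real CSV files, which are not reproduced in this excerpt:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

# Made-up closing prices standing in for the five stock CSV files.
dates = pd.date_range("2018-01-01", periods=100, freq="D")
rng = np.random.default_rng(0)
stocks = {name: pd.DataFrame({"date": dates,
                              "close": 100 + rng.normal(0, 1, 100).cumsum()})
          for name in ["GOOGL", "FB", "AAPL", "AMZN", "MSFT"]}

# One line per stock, with labels so the legend identifies each trend.
fig, ax = plt.subplots(figsize=(10, 5))
for name, df in stocks.items():
    ax.plot(df["date"], df["close"], label=name)
ax.set_xlabel("Date")
ax.set_ylabel("Closing price (USD)")
ax.legend()
```

Plotting all five series on shared axes makes the relative trends directly comparable, which is the point of using a single line plot here.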

4. Simplifying Visualizations Using Seaborn

Activity 4.01: Using Heatmaps to Find Patterns in Flight Passengers' Data

Solution:

Find the patterns in the flight passengers' data with the help of a heatmap:

  1. Create an Activity4.01.ipynb Jupyter notebook in the Chapter04/Activity4.01 folder to implement this activity.
  2. Import the necessary modules and enable plotting within a Jupyter notebook:
    %matplotlib inline
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    sns.set()
  3. Use pandas to read the flight_details.csv dataset located in the Datasets folder. The given dataset contains the monthly figures for flight passengers for the years 1949 to 1960:
    data = pd.read_csv("../../Datasets/flight_details.csv")
  4. Now, we can use the pivot() function to transform the data into a format that is suitable for heatmaps:
    data = data.pivot(index="Months", columns="Years", \
                      values="Passengers")
    data = data.reindex...
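The pivot step can be sketched on a tiny made-up long-format table shaped like flight_details.csv. Note that the keyword form of pivot() works across pandas versions, whereas the positional form was removed in pandas 2.0:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
import seaborn as sns

# Tiny made-up long-format table: one row per (year, month) pair.
data = pd.DataFrame({
    "Years": [1949, 1949, 1950, 1950],
    "Months": ["Jan", "Feb", "Jan", "Feb"],
    "Passengers": [112, 118, 115, 126],
})

# Pivot to wide format: months as rows, years as columns, counts as cells.
pivoted = data.pivot(index="Months", columns="Years", values="Passengers")
ax = sns.heatmap(pivoted, annot=True, fmt="d")
```

The wide format is what the heatmap needs: each cell's color encodes one month/year passenger count, so seasonal patterns appear as horizontal bands.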

5. Plotting Geospatial Data

Activity 5.01: Plotting Geospatial Data on a Map

Solution:

Let's plot the geospatial data on a map and find the densely populated areas of Europe, that is, cities with a population of more than 100,000:

  1. Create an Activity5.01.ipynb Jupyter notebook in the Chapter05/Activity5.01 folder to implement this activity and then import the necessary dependencies:
    import numpy as np
    import pandas as pd
    import geoplotlib
  2. Load the world_cities_pop.csv dataset from the Datasets folder using pandas:
    # load the dataset (make sure it has been downloaded)
    dataset = pd.read_csv('../../Datasets/world_cities_pop.csv', \
                          dtype={'Region': str})

    Note

    If we import our dataset without defining the dtype attribute of the Region column as a String type, we will get a warning telling us that...
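Before handing the data to geoplotlib, the activity filters for densely populated cities. A sketch of that step on a few made-up rows follows; the renaming to lat/lon assumes geoplotlib's usual expectation of columns with those names, and the rows themselves are invented stand-ins for world_cities_pop.csv:

```python
import pandas as pd

# Made-up rows shaped like world_cities_pop.csv.
dataset = pd.DataFrame({
    "Country": ["de", "de", "fr"],
    "City": ["stuttgart", "kleinstadt", "paris"],
    "Population": [630000.0, 8000.0, 2148000.0],
    "Latitude": [48.78, 49.00, 48.86],
    "Longitude": [9.18, 9.10, 2.35],
})

# Keep only cities with a population above 100,000, then rename the
# coordinate columns to the names geoplotlib typically expects.
dense = dataset[dataset["Population"] > 100000]
dense = dense.rename(columns={"Latitude": "lat", "Longitude": "lon"})
print(dense["City"].tolist())
```

The filtered frame can then be passed to geoplotlib's plotting functions to draw one dot per remaining city.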

6. Making Things Interactive with Bokeh

Activity 6.01: Plotting Mean Car Prices of Manufacturers

Solution:

  1. Create an Activity6.01.ipynb Jupyter notebook in the Chapter06/Activity6.01 folder.
  2. Import the necessary libraries:
    import pandas as pd
    from bokeh.io import output_notebook
    output_notebook()
  3. Load the automobiles.csv dataset from the Datasets folder:
    dataset = pd.read_csv('../../Datasets/automobiles.csv')
  4. Use the head method to print the first five rows of the dataset:
    dataset.head()

    The following figure shows the output of the preceding code:

Figure 6.36: Loading the top five rows of the automobile dataset

Plotting each car with its price

  1. Use the plotting interface of Bokeh to do some basic visualization first. Let's plot each car with its price. Import figure and show from the bokeh.plotting interface:
    from bokeh.plotting import figure, show
  2. First, use the index as our x-axis since we just want to plot each car with its price...
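The quantity the activity ultimately plots, the mean price per manufacturer, comes from a pandas groupby. A sketch on a made-up subset of automobiles.csv (the column names `make` and `price` are assumptions about the real file):

```python
import pandas as pd

# Made-up subset shaped like automobiles.csv: one row per car.
dataset = pd.DataFrame({
    "make": ["audi", "audi", "bmw", "bmw", "volvo"],
    "price": [13950, 17450, 16430, 20970, 12940],
})

# Mean price per manufacturer -- the series the final Bokeh plot displays.
mean_prices = dataset.groupby("make")["price"].mean()
print(mean_prices)
```

Feeding `mean_prices.index` and `mean_prices.values` to a Bokeh figure then yields one glyph per manufacturer instead of one per car.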

7. Combining What We Have Learned

Activity 7.01: Implementing Matplotlib and Seaborn on the New York City Database

Solution:

  1. Create an Activity7.01.ipynb Jupyter Notebook in the Chapter07/Activity7.01 folder to implement this activity. Import all the necessary libraries:
    # Import statements
    import pandas as pd
    import numpy as np
    import seaborn as sns
    import matplotlib
    import matplotlib.pyplot as plt
    import squarify
    sns.set()
  2. Use pandas to read both CSV files located in the Datasets folder:
    p_ny = pd.read_csv('../../Datasets/acs2017/pny.csv')
    h_ny = pd.read_csv('../../Datasets/acs2017/hny.csv')
  3. Use the given PUMA (public use microdata area code based on the 2010 census definition, which are areas with populations of 100,000 or more) ranges to further divide the dataset into NYC districts (Bronx, Manhattan, Staten Island, Brooklyn, and Queens):
    # PUMA ranges
    bronx = [3701, 3710]
    manhatten = [3801, 3810]
    staten_island = [3901, 3903]
    brooklyn =...
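The PUMA-to-district mapping can be sketched as follows. The Bronx, Manhattan, and Staten Island ranges come from the solution above; the Brooklyn and Queens ranges, the `district_of` helper, and the sample PUMA codes are assumptions, since the excerpt is truncated at that point:

```python
import pandas as pd

# PUMA ranges per district (first three from the solution text; the
# Brooklyn and Queens ranges are assumed for illustration).
puma_ranges = {
    "Bronx": (3701, 3710),
    "Manhattan": (3801, 3810),
    "Staten Island": (3901, 3903),
    "Brooklyn": (4001, 4018),   # assumed -- truncated in the excerpt
    "Queens": (4101, 4114),     # assumed -- truncated in the excerpt
}

def district_of(puma):
    """Map a PUMA code to its NYC district, or None if no range matches."""
    for district, (low, high) in puma_ranges.items():
        if low <= puma <= high:
            return district
    return None

# Made-up sample rows standing in for the ACS person-level data.
p_ny = pd.DataFrame({"PUMA": [3702, 3805, 3902, 4010, 4105]})
p_ny["district"] = p_ny["PUMA"].map(district_of)
print(p_ny)
```

Tagging each row with its district this way lets the later Matplotlib and Seaborn plots group and compare the five boroughs directly.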
