You're reading from Hands-On Data Visualization with Bokeh

Product type: Book
Published in: Jun 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789135404
Edition: 1st Edition
Author: Kevin Jolly

Kevin Jolly is a formally educated data scientist with a master's degree in data science from the prestigious King's College London. Kevin works as a statistical analyst with a digital healthcare start-up, Connido Limited, in London, where he is primarily involved in leading the data science projects that the company undertakes. He has built machine learning pipelines for small and big data, with a focus on scaling such pipelines into production for the products that the company has built. Kevin is also the author of a book titled Hands-On Data Visualization with Bokeh, published by Packt. He is the editor-in-chief of Linear, a weekly online publication on data science software and products.

The Bokeh Workflow – A Case Study

When building a Bokeh visualization from scratch, it is good practice not to start with Bokeh itself. Instead, first perform a little exploratory analysis on your data, so that you can work out which Bokeh application would deliver the most value to your users.

Such an approach, of first exploring your dataset, helps you formulate the ideal visualization that you might want to present to your audience.

In this chapter, you will learn the exact workflow that you need to follow, from when you get the data to the final visualization that you want to present.

Bokeh, like most data visualization tools, is best used in a workflow that follows a logical sequence of steps, which will allow you to deliver impactful insights to your audience. This workflow can be summarized...

Technical requirements

Asking the right question

Asking the right question is by far the most important step when it comes to data visualization. What is the answer that you are seeking?

Some of the most common questions that you need to ask yourself before deciding to visualize data are:

  • Do I want to observe how well two features are correlated?
  • Do I suspect potential outliers in my data that I cannot see unless I visualize my data?
  • Do I want to see whether my data shows a particular trend over a period of time?
  • Do I want to observe the distribution of individual features/columns in my data?
  • Do I want to see whether there are clusters/groups within my data that I can potentially extract value from?
  • Do I believe that a visualization can tell my audience a story about the data?

If the answer to any one of these questions is a yes, then you know that you need to visualize your data. The second question...
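Several of the questions listed above can be given a quick numeric first pass before any plotting. The following is a minimal sketch, assuming a small made-up DataFrame; the column names `date`, `high`, and `volume`, the values, and the 1.5 × IQR outlier rule are illustrative choices, not something the chapter prescribes.

```python
import pandas as pd

# Hypothetical stand-in data; names and values are made up for illustration.
df = pd.DataFrame({
    "date": pd.date_range("2018-01-01", periods=6, freq="D"),
    "high": [10.0, 10.5, 11.0, 10.8, 11.5, 30.0],
    "volume": [100, 120, 130, 125, 140, 900],
})

# How well are two features correlated? (Pearson correlation)
corr = df["high"].corr(df["volume"])

# Are there potential outliers? Flag values more than 1.5 * IQR
# outside the quartiles (one common rule of thumb, assumed here).
q1, q3 = df["high"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["high"] < q1 - 1.5 * iqr) | (df["high"] > q3 + 1.5 * iqr)]

# Is there a trend over time? Smooth the series with a rolling mean.
trend = df.set_index("date")["high"].rolling(window=3).mean()

# What does the distribution of a single column look like?
summary = df["high"].describe()

print(f"correlation: {corr:.3f}")
print(outliers)
print(trend)
print(summary)
```

If any of these checks turns up something interesting, such as a strong correlation or a flagged outlier, that is a hint about which kind of plot is worth building.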

The exploratory data analysis

Since we have worked extensively with the S&P 500 stock data from Kaggle, we are going to be using that dataset in order to create our application. The dataset can be found here: https://www.kaggle.com/camnugent/sandp500/data.

The first step is to read the data into a Jupyter Notebook and understand what the data looks like. This can be done using the code shown here:

# Import packages
import pandas as pd

# Read the data into the notebook
df = pd.read_csv('all_stocks_5yr.csv')

# Extract information about the data
df.info()

This renders the output shown in this screenshot:

This output shows the number of rows in the dataset, the number of columns, the data type of each column, and how many non-null values each column holds, which reveals any missing values.

The next step is to understand the kind of information contained in all the columns of your dataset. We can do this by using the...
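One common way to inspect the contents of each column is sketched below. This is not necessarily the approach the chapter goes on to use. To keep the example self-contained, it reads a tiny in-memory CSV instead of `all_stocks_5yr.csv`; the column names (`date`, `open`, `high`, `low`, `close`, `volume`, `Name`) are an assumption about the file's layout, and every value is made up.

```python
import io

import pandas as pd

# Hypothetical stand-in for all_stocks_5yr.csv. The column names are an
# assumption about the Kaggle file; the tickers and values are made up.
csv_text = """date,open,high,low,close,volume,Name
2018-01-02,10.0,10.5,9.8,10.2,1000,AAA
2018-01-03,10.2,,10.0,10.4,1200,AAA
2018-01-02,50.0,51.0,49.5,50.5,3000,BBB
"""

# Parse the date column as datetimes rather than plain strings
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Look at the first few rows to see what each column holds
print(df.head())

# Summary statistics for the numeric columns
print(df.describe())

# Count missing values per column
missing = df.isnull().sum()
print(missing)

# Count how many distinct tickers the data covers
n_tickers = df["Name"].nunique()
print(n_tickers)
```

On the real file, the same calls would run against the DataFrame returned by `pd.read_csv('all_stocks_5yr.csv')`.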

Creating an insightful visualization

Now that we have a fundamental idea of what our data contains, we can proceed to making the visualization. The first step is to ensure we have the foundation of the visualization ready.

Creating the base plot

The foundation consists of the base plot that you want to visualize. In our case, we want to see how the volume of stocks traded over a period of time correlates with the high prices. In order to build this application, we use the code shown here:

# Import the required packages
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
import pandas as pd

# Read the data into the notebook
df = pd.read_csv('all_stocks_5yr.csv')

#Convert...
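As a rough illustration of what a base plot of high price against volume could look like, here is a minimal, self-contained sketch. It does not reproduce the chapter's full listing: the stand-in DataFrame, the ticker `AAA`, the column names, and the choice of a scatter glyph are all assumptions made for this example.

```python
import pandas as pd
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical stand-in rows; the real application would read
# all_stocks_5yr.csv instead. Names and values are made up.
df = pd.DataFrame({
    "date": ["2018-01-02", "2018-01-03", "2018-01-02"],
    "high": [10.5, 10.9, 51.0],
    "volume": [1000, 1200, 3000],
    "Name": ["AAA", "AAA", "BBB"],
})
df["date"] = pd.to_datetime(df["date"])

# Keep a single (hypothetical) ticker so the plot stays readable
subset = df[df["Name"] == "AAA"]

# Wrap the columns to plot in a ColumnDataSource
source = ColumnDataSource(data={"x": subset["high"], "y": subset["volume"]})

# Create the base figure and add a scatter glyph
plot = figure(
    title="High price vs. volume",
    x_axis_label="High price",
    y_axis_label="Volume",
)
plot.scatter(x="x", y="y", source=source)

# Register the plot with the current document so a Bokeh server can show it
curdoc().add_root(plot)
```

Saved as a script, a sketch like this can be run with `bokeh serve --show` followed by the script's file name.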

Presenting your results

Choosing the right visualization is not limited to picking the right type of plot, such as a scatter plot or a bar chart. It extends to picking the right colors, shapes, markers, and features.

Some of the questions that you will want to ask yourself when choosing the right visualization are as follows:

  • Do I want to transmit a positive message to my readers? If yes, the colors green and blue are a great choice
  • Do I want to transmit an alarming/negative message, indicating some form of danger/decline to my readers? If yes, the color red works best
  • Do I want to show how two different segments/categories differ from each other? If yes, using contrasting colors such as red and blue works well

The tone of the insight and message that you want to convey is critical when it comes to creating the ideal visualization.
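The color guidance above can be encoded directly in the data. The sketch below assumes a made-up DataFrame with `open` and `close` columns and marks each row green when the price held or rose and red when it fell. The column names and the rule itself are illustrative assumptions.

```python
import pandas as pd

# Hypothetical prices; names and values are made up for illustration.
df = pd.DataFrame({
    "open": [10.0, 12.0, 11.0],
    "close": [11.0, 11.5, 11.0],
})

# Green signals a gain (or no change); red signals a decline
df["color"] = [
    "green" if close >= open_ else "red"
    for open_, close in zip(df["open"], df["close"])
]

print(df)
```

A column like this can be placed in a ColumnDataSource alongside the plotted values and referenced by name through a glyph's `color` argument, so that each point carries the tone you want to convey.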

Summary

In this chapter, you learned how to build, from scratch, a real-time Bokeh visualization that can be used to analyze the performance of stocks. You learned how to perform initial exploratory data analysis in order to determine the kind of visualization that you wanted to create. You then created the visualization and improved its performance using WebGL.

Finally, you learned the four steps that form an integral part of the Bokeh workflow. You learned how asking the right kinds of questions is pivotal in any data visualization project, followed by the exploratory data analysis. You also learned how presenting your results is not limited to the type of plot you use, but also the tone of the message that you want to convey to the audience.

This concludes the book! I hope the book has given you an informative, hands-on introduction to the world of Bokeh! I hope you will continue...
