Apache Superset Quick Start Guide

Learn
  • Get to grips with the fundamentals of data exploration using Superset
  • Set up a working instance of Superset on cloud services like Google Compute Engine
  • Integrate Superset with SQL databases
  • Build dashboards with Superset
  • Calculate statistics in Superset for numerical, categorical, or text data
  • Understand visualization techniques, filtering, and grouping by aggregation
  • Manage user roles and permissions in Superset
  • Work with SQL Lab
About

Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases like Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards for your organization that run in modern web browsers.

First, we look at the fundamentals of Superset and then get it up and running. You'll go through the requisite installation, configuration, and deployment steps. Then, we discuss the different columnar data types, analytics, and visualizations available. You'll also see the security tools an administrator can use to keep your data safe.

You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help when you upload your own entity-relationship dataset and want to analyze it in new ways. You will also see how to analyze geographical regions by working with location data.

Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.

Features
  • Work with Apache Superset's rich set of data visualizations
  • Create interactive dashboards and tell stories with data
  • Easily explore data
Page Count: 188
Course Length: 5 hours 38 minutes
ISBN: 9781788992244
Date of Publication: 19 Dec 2018

Authors

Shashank Shekhar

Shashank Shekhar is a data analyst and open source enthusiast. He has contributed to Superset and PyMC3 (the Python Bayesian machine learning library), and maintains several public GitHub repositories for his own machine learning and data analysis projects. He heads up the data science team at HyperTrack, where he designs and implements machine learning algorithms to obtain insights from movement data. Previously, he worked at Amino on claims data. He has worked as a data scientist in Silicon Valley for five years. His background is in systems engineering and optimization theory, and he brings that perspective to his thinking about data science, biology, culture, and history.