You're reading from  Apache Superset Quick Start Guide

Product type: Book
Published in: Dec 2018
Reading level: Intermediate
ISBN-13: 9781788992244
Edition: 1st Edition
Author: Shashank Shekhar

Shashank Shekhar is a data analyst and open source enthusiast. He has contributed to Superset and pymc3 (the Python Bayesian machine learning library), and maintains several public repositories on machine learning and data analysis projects of his own on GitHub. He heads up the data science team at HyperTrack, where he designs and implements machine learning algorithms to obtain insights from movement data. Previously, he worked at Amino on claims data. He has worked as a data scientist in Silicon Valley for 5 years. His background is in systems engineering and optimization theory, and he carries that perspective when thinking about data science, biology, culture, and history.

Dataset

In this chapter, we will work with trading data on commodities. The Federal Reserve Bank of St. Louis, United States, compiles data on commodities and publishes the datasets at http://fred.stlouisfed.org. There you can obtain time series data on the import values and import volumes of commodities traded by the United States. We will download data on bananas, olive oil, sugar, uranium, cotton, oranges, wheat, aluminium, iron, and corn.

Inside the chapter directory of the GitHub repository, you will find the generate_dataset.ipynb Jupyter Notebook. Run the notebook to download and transform the data and to generate the two CSV files we will upload. If you prefer to skip this step, the two CSV files, fsb_st_louis_commodities.csv and usda_oranges_and_bananas_data.csv, are already in the repository, ready for upload.
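
To give a rough idea of what the transform step might involve, here is a minimal sketch that merges several per-commodity time series into one long-format CSV. It is not the notebook's actual code. The column names (date, value, commodity), the use of "." to mark a missing observation, the helper names combine_series and write_csv, and the sample numbers are all assumptions made up for illustration.

```python
import csv
import io


def combine_series(series_by_commodity):
    """Merge per-commodity CSV text into a list of long-format rows.

    Each input is assumed to be CSV text with hypothetical columns
    "date" and "value"; the real downloads may be shaped differently.
    """
    rows = []
    for commodity, csv_text in series_by_commodity.items():
        for record in csv.DictReader(io.StringIO(csv_text)):
            raw = record["value"].strip()
            # Assumption: missing observations are marked "." or left empty.
            if raw in ("", "."):
                continue
            rows.append(
                {"date": record["date"], "commodity": commodity, "value": float(raw)}
            )
    # Sort by date, then commodity, so the output order is stable.
    rows.sort(key=lambda row: (row["date"], row["commodity"]))
    return rows


def write_csv(rows, handle):
    """Write the combined rows to any writable text handle."""
    writer = csv.DictWriter(handle, fieldnames=["date", "commodity", "value"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data, only to exercise the functions.
sample = {
    "bananas": "date,value\n2018-01-01,1.5\n2018-02-01,.\n",
    "wheat": "date,value\n2018-01-01,200.0\n",
}
combined = combine_series(sample)
buffer = io.StringIO()
write_csv(combined, buffer)
print(buffer.getvalue())
```

A long format with one row per date and commodity is chosen here because a single table with a date column and a category column is convenient to upload and then filter or group by commodity; the files in the repository may well be organized differently.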

The FSB data on commodity prices in fsb_st_louis_commodities...
