You're reading from Learning IPython for Interactive Computing and Data Visualization, Second Edition

Product type: Book
Published in: Oct 2015
Reading level: Beginner
Publisher: Packt Publishing
ISBN-13: 9781783986989
Edition: 2nd Edition
Author (1)
Cyrille Rossant

Cyrille Rossant, PhD, is a neuroscience researcher and software engineer at University College London. He is a graduate of École Normale Supérieure, Paris, where he studied mathematics and computer science. He has also worked at Princeton University and Collège de France. While working on data science and software engineering projects, he gained experience in numerical computing, parallel computing, and high-performance data visualization. He is the author of Learning IPython for Interactive Computing and Data Visualization, Second Edition, Packt Publishing.

Exploring a dataset in the Notebook


Here, we will explore a dataset of taxi trips made in New York City in 2013. Maintained by the New York City Taxi and Limousine Commission, this 50 GB dataset records the date, time, pickup and dropoff coordinates, fare, and other information for 170 million taxi trips.

To keep analysis times reasonable, we will work with a subset containing 0.5% of all trips (about 850,000 rides). Compressed, this subset takes up a little less than 100 MB. You are free to download and analyze the full dataset (or a larger subset), as explained below.
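
To make the idea of a 0.5% subset concrete, here is a minimal sketch of how such a sample could be drawn from a large CSV file without loading it into memory. It is an illustration only, not the procedure actually used to build the subset for this chapter. The function name `sample_rows`, the column names, and the synthetic data are all assumptions made for the example.

```python
import csv
import io
import random


def sample_rows(lines, fraction, seed=0):
    """Keep the header plus a random `fraction` of the data rows.

    Rows are read one at a time, so the whole file never has to fit
    in memory. Each row is kept independently with probability
    `fraction`, so the size of the result is only approximately
    fraction * number_of_rows.
    """
    rng = random.Random(seed)
    reader = csv.reader(lines)
    header = next(reader)
    kept = [row for row in reader if rng.random() < fraction]
    return header, kept


# Sanity check of the figures quoted in the text:
# 0.5% of 170 million trips.
expected_trips = round(170_000_000 * 0.005)
print(expected_trips)

# Synthetic stand-in for the real file (hypothetical columns, made-up values).
fake_csv = io.StringIO(
    "trip_id,fare\n" + "".join(f"{i},{5 + i % 20}\n" for i in range(10_000))
)
header, subset = sample_rows(fake_csv, fraction=0.005)
print(header, len(subset))
```

With a real file, you would pass an open file object instead of the `io.StringIO` buffer. Fixing the seed makes the sample reproducible from one run to the next.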

Provenance of the data

You will find the data subset we will be using in this chapter at https://github.com/ipython-books/minibook-2nd-data.

If you are interested in the original dataset containing all trips, you can refer to https://github.com/ipython-books/minibook-2nd-code/tree/master/chapter2/cleaning. This page contains the code to download the original dataset...
