Learning Jupyter 5 - Second Edition

Product type: Book
Published: Aug 2018
ISBN-13: 9781789137408
Pages: 282
Edition: 2nd

Table of Contents (18) Chapters

Title Page
Packt Upsell
Contributors
Preface
1. Introduction to Jupyter
2. Jupyter Python Scripting
3. Jupyter R Scripting
4. Jupyter Julia Scripting
5. Jupyter Java Coding
6. Jupyter JavaScript Coding
7. Jupyter Scala
8. Jupyter and Big Data
9. Interactive Widgets
10. Sharing and Converting Jupyter Notebooks
11. Multiuser Jupyter Notebooks
12. What's Next?
Other Books You May Enjoy
Index

Spark evaluating history data


In this example, we combine the techniques from the previous sections to examine some historical data and derive a few useful statistics from it.

The historical data we are using is the guest list for The Daily Show, the television show hosted by Jon Stewart. A typical record from the data looks as follows:

1999,actor,1/11/99,Acting,Michael J. Fox 

This contains the year, the occupation of the guest, the date of appearance, a logical grouping of the occupations, and the name of the guest.
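To make the record layout concrete, here is a minimal sketch that splits the sample record above into its five fields using Python's standard `csv` module. The `GuestRecord` tuple, its field names, and the `parse_record` helper are illustrative assumptions and are not part of the book's script.

```python
import csv
from collections import namedtuple

# Illustrative field names, in the order the text describes them
# (assumption: these names do not come from the book's script).
GuestRecord = namedtuple(
    "GuestRecord", ["year", "occupation", "show_date", "group", "guest"]
)

def parse_record(line):
    """Split one comma-separated line into its five fields."""
    # strip() drops the trailing whitespace seen in the sample record
    fields = next(csv.reader([line.strip()]))
    return GuestRecord(*fields)

record = parse_record("1999,actor,1/11/99,Acting,Michael J. Fox ")
print(record.year, record.occupation, record.guest)
```

All fields come back as strings, so the year would need an explicit `int()` conversion before any numeric comparison.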

For our analysis, we will be looking at the number of appearances per year, the occupation that appears most frequently, and the personality who appears most frequently.
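The three tallies can be sketched in plain Python before bringing Spark into the picture. This is only an illustration of the counting logic, using `collections.Counter` rather than Spark. The `summarize` function is a hypothetical helper, and the sample rows are placeholders, not records from the real data set.

```python
from collections import Counter

def summarize(rows):
    """Count appearances per year and find the most frequent occupation and guest."""
    years = Counter()
    occupations = Counter()
    guests = Counter()
    for year, occupation, _show_date, _group, guest in rows:
        years[year] += 1
        occupations[occupation] += 1
        guests[guest] += 1
    return years, occupations.most_common(1), guests.most_common(1)

# Placeholder rows (NOT taken from the real guest list), used only to
# exercise the function.
sample_rows = [
    ("1999", "actor", "1/11/99", "Acting", "Guest A"),
    ("1999", "comedian", "1/12/99", "Comedy", "Guest B"),
    ("2000", "actor", "1/4/00", "Acting", "Guest A"),
]

years, top_occupation, top_guest = summarize(sample_rows)
print(dict(years))       # appearances per year
print(top_occupation)    # most frequent occupation with its count
print(top_guest)         # most frequent guest with their count
```

A Spark version would express the same idea as a distributed count over the records; the sketch above only shows what is being counted.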

We will be using this script:

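# Spark Daily Show Guests
import pyspark
import csv
import operator
import itertools
import collections

# Create a SparkContext only if one does not already exist in this session
if 'sc' not in globals():
    sc = pyspark.SparkContext()

# Containers for the per-year, per-occupation, and per-guest results
years = {}
occupations = {}
guests = {}

# The file header contains the column descriptors:
# YEAR, GoogleKnowledge_Occupation, Show, Group, Raw_Guest_List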

with open('daily_show_guests...