In this chapter, we got familiar with obtaining a SparkContext. We saw examples of using Hadoop MapReduce. We used SQL against Spark data, combined data frames, and operated on the resulting set. We also imported JSON data and manipulated it with Spark. Lastly, we used a pivot to gather information about a data frame.
In the next chapter, we will look at using R programming within Jupyter.