Setting up the Jupyter Notebook
The following steps are required before getting started with the exercises:
- Import all the required modules and packages in the Jupyter notebook:

      import findspark
      findspark.init()
      import pyspark
      import random
- Now, use the following command to set up SparkContext:

      from pyspark import SparkContext
      sc = SparkContext()
- Similarly, use the following command to set up SQLContext in the Jupyter notebook:

      from pyspark.sql import SQLContext
      sqlc = SQLContext(sc)

  Note: Make sure the PySpark CSV reader package from Databricks (https://databricks.com/) is installed and ready before executing the next command. If it is not, download it by launching PySpark with the following command (note the two hyphens in `--packages`):

      pyspark --packages com.databricks:spark-csv_2.10:1.4.0
- Read the Iris dataset from the CSV file into a Spark DataFrame:

      df = sqlc.read.format('com.databricks.spark.csv').options(header='true', inferschema='true').load('/Users/iris.csv')

  To view the result of the preceding command, display the first five rows:

      df.show(5)

  Figure 5...
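The `header` and `inferschema` options in the read step ask the CSV reader to take column names from the first row and to guess each column's type from the data. As a rough, Spark-free illustration of that idea, the sketch below does the same with the standard library only. It is not how the spark-csv package implements inference, and the sample rows, column names, and the helper names `infer_type` and `read_csv_with_schema` are assumptions made up for this example rather than taken from `iris.csv`.

```python
import csv
import io

# Hypothetical sample in the shape of an Iris-style CSV; the column names
# and values are assumptions for illustration, not the contents of iris.csv.
SAMPLE_CSV = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
7.0,3.2,versicolor
"""


def infer_type(values):
    """Return the narrowest of int, float, str that parses every value."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
            return caster
        except ValueError:
            continue
    return str


def read_csv_with_schema(text):
    """Treat the first row as the header and infer one type per column."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))  # transpose rows into columns
    schema = {name: infer_type(col) for name, col in zip(header, columns)}
    records = [
        {name: schema[name](value) for name, value in zip(header, row)}
        for row in data
    ]
    return schema, records


schema, records = read_csv_with_schema(SAMPLE_CSV)
print({name: t.__name__ for name, t in schema.items()})
print(records[0])
```

Without header handling, the first row would be read as data, and without type inference every column would stay a string, which is why both options are switched on when loading the dataset.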