You're reading from  Fast Data Processing with Spark 2 - Third Edition

Product type: Book
Published in: Oct 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781785889271
Edition: 3rd Edition
Author (1)
Holden Karau

Holden Karau is a software development engineer and is active in open source. She has worked on a variety of search, classification, and distributed systems problems at IBM, Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics degree in computer science. Other than software, she enjoys playing with fire and hula hoops, and welding.

Shared Java and Scala APIs


Once you have created a SparkSession object, it serves as your main entry point. In the next chapter, you will learn how to use the SparkSession object to load and save data. You can also use SparkSession.sparkContext to launch more Spark jobs and to add or remove dependencies. Some of the non-data-driven methods available on the SparkSession.sparkContext object are shown here:

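| Method | Use |
| --- | --- |
| addJar(path) | This method adds the JAR file for all future jobs that run through the SparkContext object. |
| addFile(path) | This method downloads the file to all the nodes on the cluster. |
| listFiles/listJars | These methods show the list of all the currently added files/JARs. |
| stop() | This method shuts down SparkContext. |
| clearFiles() | This method removes the files so that new nodes will not download them. It has been deprecated since Spark 1.0 and may not be available in Spark 2.x; check the API documentation for your version. |
| clearJars() | This method removes the JARs from being required for future jobs. It has been deprecated since Spark 1.0 and may not be available in Spark 2.x; check the API documentation for your version. |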
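
The methods in the table can be tried out with a small Scala program. The following is a minimal sketch, not the book's own listing: it assumes Spark 2.x on the classpath and a local master, and the object name `SharedApiSketch`, the application name, and the temporary `lookup` file are all hypothetical names chosen for illustration. The `addJar` call is left commented out because it needs a real JAR path.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical object name; a sketch of the SparkContext helper methods.
object SharedApiSketch {

  // Registers a temporary file with the SparkContext and returns
  // whatever listFiles() reports before the session is stopped.
  def run(): Seq[String] = {
    // Local-mode session for illustration only (master and app name are assumptions).
    val spark = SparkSession.builder()
      .appName("shared-api-sketch")
      .master("local[*]")
      .getOrCreate()

    // The underlying SparkContext is reached through the session.
    val sc = spark.sparkContext

    try {
      // Create a small file on the fly so the example does not depend on external data.
      val lookup = Files.createTempFile("lookup", ".txt")
      Files.write(lookup, "key,value\n".getBytes("UTF-8"))

      // Make the file available to every node that runs tasks for this application.
      sc.addFile(lookup.toString)

      // sc.addJar("/path/to/extra-library.jar") // hypothetical path; uncomment with a real JAR

      // Inspect what has been registered so far.
      val files = sc.listFiles()
      val jars = sc.listJars()
      println(s"Added files: ${files.mkString(", ")}")
      println(s"Added JARs: ${jars.mkString(", ")}")
      files
    } finally {
      // Stopping the session also shuts down the underlying SparkContext.
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = run()
}
```

The exact form of the entries returned by `listFiles` (plain paths or URIs) depends on the Spark version and deployment mode, so the sketch only prints them rather than relying on a particular format.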
