Learning Apache Spark 2

You're reading from  Learning Apache Spark 2

Product type Book
Published in Mar 2017
Publisher Packt
ISBN-13 9781785885136
Pages 356 pages
Edition 1st Edition

Table of Contents (18) Chapters

Learning Apache Spark 2
Credits
About the Author
About the Reviewers
www.packtpub.com
Customer Feedback
Preface
1. Architecture and Installation
2. Transformations and Actions with Spark RDDs
3. ETL with Spark
4. Spark SQL
5. Spark Streaming
6. Machine Learning with Spark
7. GraphX
8. Operating in Clustered Mode
9. Building a Recommendation System
10. Customer Churn Prediction
11. There's More with Spark

Operations on RDD


Two major operation types can be performed on an RDD. They are called:

  • Transformations
  • Actions

Transformations

Transformations are operations that create a new dataset from an existing one; because RDDs are immutable, the original is never modified. They transform data from one form to another, which can amplify the data, reduce it, or give it a totally different shape altogether. These operations do not return a value to the driver program, so Spark can evaluate them lazily, which is one of its main benefits.

An example of a transformation is a map function, which passes over each element of the RDD and returns a totally new RDD holding the results of applying the function to the original dataset.
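
As a rough analogy in plain Python (this is not Spark's API, only an illustration of the idea), the three outcomes mentioned above can be sketched with the built-in `map` and `filter`, which are themselves lazy, and `itertools.chain`. The sample `lines` list and all variable names are assumptions made for this example.

```python
from itertools import chain

# Hypothetical input data; in Spark this would be the contents of an RDD.
lines = ["spark is fast", "rdds are immutable"]

# Different shape: each line is mapped to its length (one output per input).
lengths = map(len, lines)

# Amplification: each line expands into several words.
words = chain.from_iterable(line.split() for line in lines)

# Reduction: only the lines that mention "spark" are kept.
spark_lines = filter(lambda line: "spark" in line, lines)

# Up to this point nothing has been computed; building lists forces evaluation.
lengths_result = list(lengths)
words_result = list(words)
spark_lines_result = list(spark_lines)

# The input list is untouched, mirroring the immutability of an RDD.
```

In each case the original `lines` list is left as it was and a new sequence is produced, which is the property the text attributes to transformations.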

Actions

Actions are operations that return a value to the driver program. As previously discussed, all transformations in Spark are lazy, which means that Spark remembers all the transformations carried out on an RDD and applies them in the most efficient fashion...
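
The interplay between lazy transformations and actions can be illustrated with a toy model. The `LazyDataset` class below is hypothetical and is not part of Spark; it is a minimal sketch, assuming that a dataset only records its transformations and that work happens when an action such as `collect` or `count` is called. It does not attempt the optimization Spark performs.

```python
class LazyDataset:
    """Hypothetical stand-in for an RDD; not Spark's API."""

    def __init__(self, source, steps=()):
        self._source = source        # original data, never modified
        self._steps = tuple(steps)   # recorded transformations

    def map(self, fn):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: record the step and return a new dataset.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def _run(self):
        # Apply every recorded step to each element, one element at a time.
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                yield item

    def collect(self):
        # Action: runs the recorded steps and returns the results.
        return list(self._run())

    def count(self):
        # Action: runs the recorded steps and returns how many elements remain.
        return sum(1 for _ in self._run())


calls = []  # records every invocation of double(), to make laziness visible


def double(x):
    calls.append(x)
    return x * 2


numbers = LazyDataset([1, 2, 3, 4])
doubled = numbers.map(double).filter(lambda x: x > 4)

calls_before_action = len(calls)  # no work has been done yet
result = doubled.collect()        # the action triggers the computation
calls_after_action = len(calls)
```

Here `double` is not invoked while the transformations are being chained; it runs only once `collect` is called, and the value comes back to the caller in the same way an action returns a value to the driver program.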
