Apache Spark Quick Start Guide


Product type: Book
Published: Jan 2019
Publisher: Packt
ISBN-13: 9781789349108
Pages: 154
Edition: 1st Edition
Authors (2): Shrey Mehrotra, Akash Grade

Types of RDDs

RDDs fall into several categories. Some examples include the following:

Hadoop RDD      Shuffled RDD    Pair RDD
Mapped RDD      Union RDD       JSON RDD
Filtered RDD    Double RDD      Vertex RDD

Covering all of them is outside the scope of this chapter, so we will discuss only one of the important types of RDD: pair RDDs.

Pair RDDs

A pair RDD is a special type of RDD that processes data in the form of key-value pairs. Pair RDDs are very useful because they enable basic functionality such as joins and aggregations, and Spark provides optimized special operations on them. If we recall the examples where we calculated the number of INFO and ERROR messages in...
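
To make the key-value idea concrete, here is a minimal plain-Python sketch of the kind of per-key aggregation a pair RDD supports. It does not use Spark itself; it only mimics the semantics of an operation such as reduceByKey so that it runs without a cluster. The sample log lines and the helper names to_pair and reduce_by_key are hypothetical and do not come from the book.

```python
# Hypothetical sample log lines (illustrative only, not from the book).
log_lines = [
    "INFO starting job",
    "ERROR disk full",
    "INFO job finished",
    "WARN slow task",
    "ERROR task failed",
]


def to_pair(line):
    """Turn a log line into a (key, value) pair: (log level, 1)."""
    level = line.split(" ", 1)[0]
    return (level, 1)


def reduce_by_key(pairs, func):
    """Plain-Python stand-in for reduceByKey semantics:
    merge all values that share a key using func."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged


# Build the key-value pairs, then aggregate the values per key.
pairs = [to_pair(line) for line in log_lines]
level_counts = reduce_by_key(pairs, lambda a, b: a + b)
print(level_counts)

# Assuming a SparkContext named `sc` is available, the PySpark version
# would look roughly like this:
#   sc.parallelize(log_lines).map(to_pair).reduceByKey(lambda a, b: a + b).collect()
```

The point of the sketch is only that once each record is a (key, value) pair, counting or joining by key becomes a single operation; in Spark, that work is distributed across partitions rather than done in one local dictionary.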
