Apache Spark Graph Processing


Product type Book
Published in Sep 2015
ISBN-13 9781784391805
Pages 148 pages
Edition 1st Edition

Joining graph datasets


In addition to the previous mapping and filtering operations, GraphX also provides APIs for joining RDDs with graphs. This is useful when we want to add extra information to the vertex attributes of a graph, or when we want to merge the vertex attributes of two related graphs. Both tasks can be accomplished using the following join operators.

joinVertices

The following is the method signature of the first operator, joinVertices:

def joinVertices[U](table: RDD[(VertexId, U)])(map: (VertexId, VD, U) => VD): Graph[VD, ED]

It is invoked on a Graph[VD, ED] object and takes two inputs, which are passed as curried parameter lists. The first is a vertex RDD, table, of type RDD[(VertexId, U)], whose entries are joined with the graph's vertices by vertex ID. The second is a user-defined map function that combines the original attribute of each vertex with the matching attribute from table to produce a new attribute. The return type of this new attribute must be the same as the...
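As an illustration of the call shape described above, here is a minimal sketch that runs Spark in local mode. The toy graph, the titles RDD, and the names JoinVerticesSketch and titledNames are hypothetical and introduced only for this example; they are not taken from the book. Note that in GraphX a vertex with no matching entry in the table keeps its original attribute.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; all names and data below are assumptions.
object JoinVerticesSketch {

  // Builds a toy graph, joins extra per-vertex data into it, and
  // returns the resulting vertex attributes sorted by vertex ID.
  def titledNames(sc: SparkContext): List[(VertexId, String)] = {
    // Vertex attribute (VD) is a name; edge attribute (ED) is a label.
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
    val edges: RDD[Edge[String]] =
      sc.parallelize(Seq(Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
    val graph: Graph[String, String] = Graph(vertices, edges)

    // Extra information keyed by vertex ID (U = String here).
    // Vertex 3 deliberately has no entry.
    val titles: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "Dr."), (2L, "Prof.")))

    // First parameter list: the table to join.
    // Second parameter list: the map function (VertexId, VD, U) => VD.
    val titled: Graph[String, String] =
      graph.joinVertices(titles) { (id, name, title) => s"$title $name" }

    titled.vertices.collect().sortBy(_._1).toList
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("JoinVerticesSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try titledNames(sc).foreach(println)
    finally sc.stop()
  }
}
```

In this sketch, vertices 1 and 2 receive a title prefix, while vertex 3 is left as "carol" because titles has no entry for it. The map function returns a String, which matches the graph's vertex attribute type, so the result is still a Graph[String, String].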
