You're reading from  Fast Data Processing with Spark 2 - Third Edition

Product type: Book
Published in: Oct 2016
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781785889271
Edition: 3rd Edition
Author (1)
Holden Karau

Holden Karau is a software development engineer who is active in open source. She has worked on a variety of search, classification, and distributed systems problems at IBM, Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics degree in computer science. Other than software, she enjoys playing with fire and hula hoops, and welding.

Algorithms


Now we dive into the most interesting part of GraphX: its built-in algorithms and the graph-parallel computation APIs that let us implement more of them. The following table gives a bird's-eye view of the algorithms:

Type: GraphX method/example

Graph-Parallel Computation: The methods are aggregateMessages() and the Pregel() function. Refer to https://issues.apache.org/jira/browse/SPARK-5062 for examples.

PageRank: The method is pageRank(). Example uses include finding the influential papers in a citation network and the influencers in a retweet network. You can specifically check out the following variants:

staticPageRank(): This runs a fixed number of iterations (numIter), whereas pageRank() iterates until the ranks converge to within a tolerance (tol); compare the two parameters (tol versus numIter)

personalizedPageRank(): This is a variation of PageRank that gives a rank relative to a specified "source" vertex in the graph, as in People You May Know

ShortestPaths and SVD++: The methods are ShortestPaths() and SVD++. Note that SVD++ takes an RDD of edges.

LabelPropagation (LPA...
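
To make the first few rows of the table concrete, here is a minimal sketch of how aggregateMessages(), the PageRank variants, and ShortestPaths might be called. The toy graph, the variable names, and the parameter values (10 iterations, a tolerance of 0.001, and the chosen source and landmark vertices) are assumptions made purely for illustration; they do not come from the book. SVD++ is left out to keep the sketch short.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.graphx.lib.ShortestPaths

// Reuse the running context (for example, in spark-shell) or start a local one.
val conf = new SparkConf().setAppName("graphx-algorithms-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

// Hypothetical toy graph: 1 -> 2, 2 -> 3, 3 -> 1, 4 -> 1.
// The edge attribute (1) and the default vertex attribute (0) are placeholders.
val edges = sc.parallelize(Seq(
  Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1), Edge(4L, 1L, 1)))
val graph = Graph.fromEdges(edges, 0)

// Graph-parallel computation: every edge sends the message 1 to its
// destination, and messages arriving at the same vertex are summed.
// The result is the in-degree of each vertex that receives a message.
val inDegrees = graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)

// Static PageRank: run exactly 10 iterations (numIter).
val staticRanks = graph.staticPageRank(10).vertices

// Dynamic PageRank: iterate until ranks change by less than 0.001 (tol).
val dynamicRanks = graph.pageRank(0.001).vertices

// Personalized PageRank: ranks relative to source vertex 1.
val personalRanks = graph.personalizedPageRank(1L, 0.001).vertices

// ShortestPaths: hop counts from every vertex to the landmark vertex 3.
// Each vertex ends up with a Map(landmark -> hops).
val hops = ShortestPaths.run(graph, Seq(3L)).vertices

inDegrees.collect().foreach(println)
staticRanks.collect().foreach(println)
dynamicRanks.collect().foreach(println)
personalRanks.collect().foreach(println)
hops.collect().foreach(println)
```

The sketch is written in script style so that it can be pasted into spark-shell using :paste; to compile it as a standalone application, the statements would need to be wrapped in an object with a main method. Comparing staticRanks with dynamicRanks is a simple way to see the effect of numIter versus tol.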