You're reading from Learning Spark SQL

Product type: Book
Published in: Sep 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781785888359
Edition: 1st Edition

Introducing large-scale graph applications


Analysis of graphs based on large datasets is becoming increasingly important in areas such as social networks, communication networks, citation networks, web graphs, transport networks, and product co-purchasing networks. Typically, graphs are created from source data in a tabular or relational format, and then applications, such as search and graph algorithms, are run on them to derive key insights.

GraphFrames provides a declarative API that can be used for both interactive queries and standalone programs on large-scale graphs. Because GraphFrames is implemented on top of Spark SQL, it enables parallel processing and optimization across the computation.

The main programming abstraction in the GraphFrames API is a GraphFrame. Conceptually, it consists of two DataFrames representing the vertices and edges of the graph. The vertices and edges may have multiple attributes, which can also be used in queries. For example, in a social network, the...
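The two-table idea can be tried without a Spark cluster. The following is a minimal sketch, not the GraphFrames API: it uses Python's built-in sqlite3 module as a stand-in to show a graph stored as a vertex table and an edge table, queried by attributes on both. The sample people, ages, and relationships are invented for illustration, and the helper names build_graph and followers_older_than are hypothetical. The column names id, src, and dst follow the convention GraphFrames uses for its vertex and edge DataFrames.

```python
import sqlite3

# Hypothetical sample data: names, ages, and relationships are invented.
VERTICES = [("a", "Alice", 34), ("b", "Bob", 36), ("c", "Carol", 29)]
EDGES = [("a", "b", "friend"), ("b", "c", "follow"), ("c", "a", "follow")]


def build_graph():
    """Create an in-memory database holding one vertex table and one edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vertices (id TEXT PRIMARY KEY, name TEXT, age INTEGER)")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT, relationship TEXT)")
    conn.executemany("INSERT INTO vertices VALUES (?, ?, ?)", VERTICES)
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", EDGES)
    return conn


def followers_older_than(conn, min_age):
    """Return (follower, followed) name pairs where the follower is older than min_age."""
    rows = conn.execute(
        """
        SELECT s.name, d.name
        FROM edges AS e
        JOIN vertices AS s ON e.src = s.id
        JOIN vertices AS d ON e.dst = d.id
        WHERE e.relationship = 'follow' AND s.age > ?
        ORDER BY s.name
        """,
        (min_age,),
    )
    return rows.fetchall()


conn = build_graph()
result = followers_older_than(conn, 30)
print(result)
```

The query filters on an edge attribute (relationship) and a vertex attribute (age) at the same time, which is the kind of attribute-aware query the paragraph above describes. In GraphFrames the same shape would be expressed over two Spark DataFrames rather than SQLite tables.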
