Hands-On Graph Analytics with Neo4j

Section 1: Graph Modeling with Neo4j

Graph Databases
 Graph definition and examples
 Graph theory
 A bit of history: the Seven Bridges of Königsberg problem
 Graph definition
 Visualization
 Examples of graphs
 Networks
 Road networks
 Computer networks
 Social networks
 Your data is also a graph
 Moving from SQL to graph databases
 Database models
 SQL and joins
 It's all about relationships
 Neo4j – the nodes, relationships, and properties model
 Building blocks
 Nodes
 Relationships
 Properties
 SQL to Neo4j translator
 Neo4j use cases
 Understanding graph properties
 Directed versus undirected
 Weighted versus unweighted
 Cyclic versus acyclic
 Dense versus sparse
 Graph traversal
 Connected versus disconnected
 Considerations for graph modeling in Neo4j
 Relationship orientation
 Node or property?
 Summary
 Further reading

The Cypher Query Language
 Technical requirements
 Creating nodes and relationships
 Managing databases with Neo4j Desktop
 Creating a node
 Selecting nodes
 Filtering
 Returning properties
 Creating a relationship
 Selecting relationships
 The MERGE keyword
 Updating and deleting nodes and relationships
 Updating objects
 Updating an existing property or creating a new one
 Updating all properties of the node
 Updating node labels
 Deleting a node property
 Deleting objects
 Pattern matching and data retrieval
 Pattern matching
 Test data
 Graph traversal
 Orientation
 The number of hops
 Variable-length patterns
 Optional matches
 Using aggregation functions
 Count, sum, and average
 Creating a list of objects
 Unnesting objects
 Importing data from CSV or JSON
 Data import from Cypher
 File location
 Local file: the import folder
 Changing the default configuration to import a file from another directory
 CSV files
 CSV files without headers
 CSV files with headers
 Eager operations
 Data import from the command line
 APOC utilities for imports
 CSV files
 JSON files
 Importing data from a web API
 Setting parameters
 Calling the GitHub web API
 Summary of import methods
 Measuring performance and tuning your query for speed
 Cypher query planner
 Neo4j indexing
 Back to LOAD CSV
 The friend-of-friend example
 Summary
 Questions
 Further reading

Empowering Your Business with Pure Cypher
 Technical requirements
 Knowledge graphs
 Attempting a definition of knowledge graphs
 Building a knowledge graph from structured data
 Building a knowledge graph from unstructured data using NLP
 NLP
 Neo4j tools for NLP
 GraphAware NLP library
 Importing test data from the GitHub API
 Enriching the graph with NLP
 Adding context to a knowledge graph from Wikidata
 Introducing RDF and SPARQL
 Querying Wikidata
 Importing Wikidata into Neo4j
 Enhancing a knowledge graph from semantic graphs
 Graph-based search
 Search methods
 Manually building Cypher queries
 Automating the English to Cypher translation
 Using NLP
 Using translation-like models
 Recommendation engine
 Product similarity recommendations
 Products in the same category
 Products frequently bought together
 Recommendation ordering
 Social recommendations
 Products bought by a friend of mine
 Summary
 Questions
 Further reading

Section 2: Graph Algorithms

The Graph Data Science Library and Path Finding
 Technical requirements
 Introducing the Graph Data Science plugin
 Extending Neo4j with custom functions and procedures
 The difference between procedures and functions
 Functions
 Procedures
 Writing a custom function in Neo4j
 GDS library content
 Defining the projected graph
 Native projections
 Cypher projections
 Streaming or writing results back to the graph
 Understanding the importance of shortest path algorithms through their applications
 Routing within a network
 GPS
 The shortest path within a social network
 Other applications
 Video games
 Science
 Dijkstra's shortest paths algorithm
 Understanding the algorithm
 Running Dijkstra's algorithm on a simple graph
 Example implementation
 Graph representation
 Algorithm
 Displaying the full path from A to E
 Using the shortest path algorithm within Neo4j
 Path visualization
 Understanding relationship direction
 Finding the shortest path with the A* algorithm and its heuristics
 Algorithm principles
 Defining the heuristics for A*
 Using A* within the Neo4j GDS plugin
 Discovering the other path-related algorithms in the GDS plugin
 K-shortest path
 Single Source Shortest Path (SSSP)
 All-pairs shortest path
 Optimizing processes using graphs
 The traveling-salesman problem
 Spanning trees
 Prim's algorithm
 Finding the minimum spanning tree in a Neo4j graph
 Summary
 Questions
 Further reading

Spatial Data
 Technical requirements
 Representing spatial attributes
 Understanding geographic coordinate systems
 Using the Neo4j built-in spatial types
 Creating points
 Querying by distance
 Creating a geometry layer in Neo4j with neo4j-spatial
 Introducing the neo4j-spatial library
 A note on spatial indexes
 Creating a spatial layer of points
 Defining the spatial layer
 Adding points to a spatial layer
 Defining the type of spatial data
 Creating layers with polygon geometries
 Getting the data
 Creating the layer
 Performing spatial queries
 Finding the distance between two spatial objects
 Finding objects contained within other objects
 Finding the shortest path based on distance
 Importing the data
 Preparing the data
 Importing data
 Creating a spatial layer
 Running the shortest path algorithm
 Visualizing spatial data with Neo4j
 neomap – a Neo4j Desktop application for spatial data
 Visualizing nodes with simple layers
 Visualizing paths with advanced layer
 Using the JavaScript Neo4j driver to visualize shortest paths
 Neo4j JS driver
 Leaflet and GeoJSON
 Summary
 Questions
 Further reading

Node Importance
 Technical requirements
 Defining importance
 Popularity and information spread
 Critical or bridging nodes
 Computing degree centrality
 Formula
 Computing degree centrality in Neo4j
 Computing the outgoing degree using GDS
 Computing the incoming degree using GDS
 Using a named projected graph
 Using an anonymous projected graph
 Understanding the PageRank algorithm
 Building the formula
 The damping factor
 Normalization
 Running the algorithm on an example graph
 Implementing the PageRank algorithm using Python
 Using GDS to assess PageRank centrality in Neo4j
 Comparing degree centrality and the PageRank results
 Variants
 ArticleRank
 Personalized PageRank
 Eigenvector centrality
 The adjacency matrix
 PageRank with matrix notation
 Eigenvector centrality
 Computing eigenvector centrality in GDS
 Path-based centrality metrics
 Closeness centrality
 Normalization
 Computing closeness from the shortest path algorithms
 The closeness centrality algorithm
 Closeness centrality in multiple-component graphs
 Betweenness centrality
 Comparing centrality metrics
 Applying centrality to fraud detection
 Detecting fraud using Neo4j
 Using centrality to assess fraud
 Creating a projected graph with Cypher projection
 Other applications of centrality algorithms
 Summary
 Exercises
 Further reading

Community Detection and Similarity Measures
 Technical requirements
 Introducing community detection and its applications
 Identifying clusters of nodes
 Applications of the community detection method
 Recommendation engines and targeted marketing
 Clusters of products
 Clusters of users
 Fraud detection
 Predicting properties or links
 A brief overview of community detection techniques
 Detecting graph components and visualizing communities
 Weakly connected components
 Strongly connected components
 Writing the GDS results in the graph
 Visualizing a graph with neovis.js
 Using NEuler, the Graph Data Science Playground
 Usage for community detection visualization
 Running the Label Propagation algorithm
 Defining Label Propagation
 Weighted nodes and relationships
 Semi-supervised learning
 Implementing Label Propagation in Python
 Using the Label Propagation algorithm from the GDS
 Using seeds
 Writing results to the graph
 Understanding the Louvain algorithm
 Defining modularity
 All nodes are in their own community
 All nodes are in the same community
 Optimal partition
 Steps to reproduce the Louvain algorithm
 The Louvain algorithm in the GDS
 Syntax
 The aggregation method in relationship projection
 Intermediate steps
 A comparison between Label Propagation and Louvain on the Zachary's karate club graph
 Going beyond Louvain for overlapping community detection
 A caveat of the Louvain algorithm
 Resolution limit
 Alternatives to Louvain
 Overlapping community detection
 Dynamic networks
 Measuring the similarity between nodes
 Set-based similarities
 Overlapping
 Definition
 Quantifying user similarity in the GitHub graph
 Jaccard similarity
 Vector-based similarities
 Euclidean distance
 Cosine similarity
 Summary
 Questions
 Further reading

Section 3: Machine Learning on Graphs

Using Graph-based Features in Machine Learning
 Technical requirements
 Building a data science project
 Problem definition – asking the right question
 Supervised versus unsupervised learning
 Regression versus classification
 Introducing the problem for this chapter
 Getting and cleaning data
 Data characterization
 Quantifying the dataset size
 Labels
 Columns
 Data visualization
 Data cleaning
 Outliers detection
 Missing data
 Correlation between variables
 Data enrichment
 Feature engineering
 Building the model
 Train/test split and cross-validation
 Creating the train and test samples with scikit-learn
 Training a model
 Evaluating model performances
 The steps toward graph machine learning
 Building a (knowledge) graph
 Creating relationships from existing data
 Creating relationships from relational data
 Creating relationships from Neo4j
 Using an external data source
 Importing the data into Neo4j
 Graph characterization
 The number of nodes and edges
 The number of components
 Extracting graph-based features
 Using graph-based features with pandas and scikit-learn
 Extracting graph-based features from Neo4j Browser
 Creating the projected graph
 Running one or several algorithms
 Dropping the projected graph
 Extracting the data
 Automating graph-based feature creation with the Neo4j Python driver
 Discovering the Neo4j Python driver
 Basic usage
 Transactions
 Automating graph-based feature creation with Python
 Creating the projected graph
 Calling the GDS procedures
 Writing results back to the graph
 Dropping the projected graph
 Exporting the data from Neo4j to pandas
 Training a scikit-learn model
 Introducing community features
 Using both community and centrality features
 Summary
 Questions
 Further reading

Predicting Relationships
 Technical requirements
 Why use link prediction?
 Dynamic graphs
 Applications
 Recovering missing data
 Fighting crime
 Research
 Making recommendations
 Social links (Facebook friends, LinkedIn contacts...)
 Product recommendations
 Making recommendations using a link prediction algorithm
 Creating link prediction metrics with Neo4j
 Community-based metrics
 Path-related metrics
 Distance between nodes
 The Katz index
 Using local neighborhood information
 Common neighbors
 Adamic-Adar
 Total neighbors
 Preferential attachment
 Other metrics
 Building a link prediction model using an ROC curve
 Importing the data into Neo4j
 Splitting the graph and computing the score for each edge
 Measuring binary classification model performance
 Understanding ROC curves
 Extracting features and labels
 Drawing the ROC curve
 Creating the DataFrame
 Plotting the ROC curve
 Determining the optimal cutoff and computing performances
 Building a more complex model using scikit-learn
 Saving link prediction results into Neo4j
 Predicting relationships in bipartite graphs
 Summary
 Questions
 Further reading

Graph Embedding – from Graphs to Matrices
 Technical requirements
 Why do we need embedding?
 Why is embedding needed?
 One-hot encoding
 Creating features for words – the manual way
 Embedding specifications
 The graph embedding landscape
 Adjacency-based embedding
 The adjacency matrix and graph Laplacian
 Eigenvectors embedding
 Locally linear embedding
 Similarity-based embedding
 High-Order Proximity preserved Embedding (HOPE)
 Computing node embedding with Python
 Creating a networkx graph
 The Neo4j test graph
 Extracting the edge list data from Neo4j
 Creating a networkx graph matrix from pandas
 Fitting a node embedding algorithm
 Extracting embeddings from artificial neural networks
 Artificial neural networks in a nutshell
 A reminder about neural network principles
 Neurons, layers, and forward propagation
 Different types of neural networks
 Skip-graph model
 Fake task
 Input
 Word representation before embedding
 Target
 Hidden layer
 Output layer
 DeepWalk node embedding
 Generating node context through random walks
 Generating random walks from the GDS
 DeepWalk embedding with karateclub
 Node2vec, a DeepWalk alternative
 Node2vec from the GDS (≥ 1.3)
 Getting the embedding results from Python
 Graph neural networks
 Extending the principles of CNNs and RNNs to build GNNs
 Message propagation and aggregation
 Taking into account node properties
 Applications of GNNs
 Image analysis
 Video analysis
 Zero-shot learning
 Text analysis
 And there's more...
 Using GNNs in practice
 GNNs from the GDS – GraphSAGE
 Going further with graph algorithms
 State-of-the-art graph algorithms
 Summary
 Questions
 Further reading

Section 4: Neo4j for Production

Using Neo4j in Your Web Application
 Technical requirements
 Creating a full-stack web application using Python and Graph Object Mappers
 Toying with neomodel
 Defining the properties of structured nodes
 StructuredNode versus SemiStructuredNode
 Adding properties
 Creating nodes
 Querying nodes
 Filtering nodes
 Integrating relationship knowledge
 Simple relationship
 Relationship with properties
 Building a web application backed by Neo4j using Flask and neomodel
 Creating toy data
 Login page
 Creating the Flask application
 Adapting the model
 The login form
 The login template
 The login view
 Reading data – listing owned repositories
 Altering the graph – adding a contribution
 Understanding GraphQL APIs by example – GitHub API v4
 Endpoints
 Returned attributes
 Query parameters
 Mutations
 Developing a React application using GRANDstack
 GRANDstack – GraphQL, React, Apollo, and Neo4j Database
 Creating the API
 Writing the GraphQL schema
 Defining types
 Starting the application
 Testing with the GraphQL playground
 Calling the API from Python
 Using variables
 Mutations
 Building the user interface
 Creating a simple component
 Getting data from the GraphQL API
 Writing a simple component
 Adding navigation
 Mutation
 Refreshing data after the mutation
 Summary
 Questions
 Further reading

Neo4j at Scale
 Technical requirements
 Measuring GDS performance
 Estimating memory usage with the estimate procedures
 Estimating projected graph memory usage
 Fictive graph
 Graph defined by native or Cypher projection
 Estimating algorithm memory usage
 The stats running mode
 Measuring time performances for some of the algorithms
 Configuring Neo4j 4.0 for big data
 The landscape prior to Neo4j 4.0
 Memory settings
 Neo4j in the cloud
 Sharding with Neo4j 4.0
 Defining shards
 Creating the databases
 Querying a sharded graph
 The USE statement
 Querying all databases
 Summary

Other Books You May Enjoy
About this book
Neo4j is a graph database that includes plugins to run complex graph algorithms.
The book starts with an introduction to the basics of graph analytics, the Cypher query language, and graph architecture components, and helps you understand why enterprises have started to adopt graph analytics. You’ll find out how to implement Neo4j algorithms and techniques and explore various graph analytics methods to reveal complex relationships in your data. You’ll be able to apply graph analytics to domains such as fraud detection, graph-based search, recommendation systems, social networking, and data management. You’ll also learn how to store data in graph databases and extract valuable insights from it. As you become well-versed in the techniques, you’ll discover graph machine learning and use it to address challenges ranging from simple to complex with Neo4j. You will also understand how to use graph data in a machine learning model to make predictions. Finally, you’ll get to grips with structuring a web application for production using Neo4j.
By the end of this book, you’ll not only be able to harness the power of graphs to handle a broad range of problem areas, but you’ll also have learned how to use Neo4j efficiently to identify complex relationships in your data.
 Publication date: August 2020
 Publisher: Packt
 Pages: 510
 ISBN: 9781839212611