Reader small image

You're reading from  Graph Data Processing with Cypher

Product typeBook
Published inDec 2022
PublisherPackt
ISBN-139781804611074
Edition1st Edition
Concepts
Right arrow
Author (1)
Ravindranatha Anthapu
Ravindranatha Anthapu
author image
Ravindranatha Anthapu

Ravindranatha Anthapu has more than 25 years of experience in working with W3C standards or building cutting-edge technologies like integrating speech with mobile applications in the 2000s. He is a technology enthusiast who worked from operating system device drivers to writing compilers for C language to modern web technologies, transitioning seamlessly and bringing experience from each of these domains and technologies to deliver successful solutions today. As a Principal Consultant at Neo4j today, Ravindranatha works with large enterprise customers to make sure they are able to leverage graph technologies effectively across various domains.
Read more about Ravindranatha Anthapu

Right arrow

Query Tuning

Till now, we have looked at various aspects of building Cypher queries. In this chapter, we will take a look at which options are available to profile and tune queries.

We have two options available to tune queries. The first one is the EXPLAIN clause, which takes the Cypher query and provides an estimated amount of work the database might do. It does not execute the query. The other one is the PROFILE clause. This will execute the query and gives the exact amount of work the database is doing. We will review both of these options in detail to understand how to leverage them to tune queries.

We will be taking a look at these aspects:

  • Working with EXPLAIN
  • Working with PROFILE
  • Reviewing plan operators
  • Using index hints

First, let’s take a look at how to use EXPLAIN to tune queries.

Working with EXPLAIN

The EXPLAIN clause returns a query execution plan as the database sees it. It does not actually execute the query and does not return data. It does not make any changes to the database itself. It can be used to easily identify missing indexes or query plans not using existing indexes, and so on. Also, it tells us how a query is traversing relationships to do its work.

Let’s look at a basic example of EXPLAIN usage:

EXPLAIN MATCH (d {code:'313820'})
RETURN d

This query is trying to find a node with a code property matching the provided value. Notice that we did not provide a label in the query. This is intentional to showcase how a query plan can be used to identify issues.

We can see from the following screenshot that the first step we are doing is AllNodesScan. What this means is that we are looking at the whole database to find the node we want and that we either have a mistake in the query or we are missing an index:

...

Working with PROFILE

The PROFILE clause executes as per the plan created by the database and provides the exact cost of the query. It will keep track of how many rows pass through the operators and the amount of work the database is doing that is measured as database hits.

Let’s look at a basic example and compare the EXPLAIN and PROFILE plans:

PROFILE MATCH (d {code:'313820'})
RETURN d

In the query, we can see that the only difference is we are using PROFILE instead of EXPLAIN:

Figure 9.6 – Basic query without EXPLAIN and PROFILE label comparison

We can see that the plan remained exactly the same, but we are seeing the db hits, which are the measurement of how much work the database is doing. You can think of 1 db hit as a unit of work the database is doing. More db hits mean a plan step is taking more work than the database is doing to complete that step in the query. Also, more db hits mean that it’s taking more time...

Reviewing plan operators

We will take a look at a few important operators that we will encounter most often while tuning queries.

These operators impact query performance the most and are listed as follows:

  • AllNodesScan: This means the database is looking at all the nodes. Most often than not, this means the user forgot to provide a label. There are scenarios where this might be the intention. Most users when they are beginning with Cypher add the label name but forget to add ":" before the label. An example looks like this:
    MATCH (Drug {code: '1234'} RETURN *

We might have thought we provided the label name, but since we don’t have ":" before the label, it will be treated as a variable, and we will see all nodes are being looked at.

  • CartesianProduct: This is a scenario when we have multiple MATCH statements without the WITH clause between them. A sample of such a scenario looks like this:
    MATCH (p:Patient)
    MATCH (d:Drug)
    RETURN...

Using index hints

Index hints are used to specify which indexes the planner should try to use. This can change how the query gets executed. There might be a trade-off compared to the default planning when using index hints.

We can provide index hints to the planner with a USING clause. Let’s take a sample query that uses multiple entities and review a basic query profile and index hint driver query profile:

MATCH path=
(p:Patient {id:'7361ce15-cf67-ae76-88e6-bcbdca19ce0b'})
-[:HAS_ENCOUNTER]->()
-[:HAS_DRUG]->(d:Drug {code:'1190795'})
RETURN path LIMIT 1

We will take the preceding query and get a basic query profile first:

PROFILE MATCH path=
(p:Patient {id:'7361ce15-cf67-ae76-88e6-bcbdca19ce0b'})
-[:HAS_ENCOUNTER]->()
-[:HAS_DRUG]->(d:Drug {code:'1190795'})
RETURN path LIMIT 1

The following screenshot represents a basic query profile:

Figure 9.11 – Basic query profile without hints...

Summary

In this chapter, we looked at using EXPLAIN and PROFILE to understand how queries will be executed by the database by looking at a plan and actual work done by the database in terms of db hits. We looked at some query plans for some basic queries and a complex query to understand the difference between EXPLAIN and PROFILE plans.

We looked at using index hints and how these affect the query execution time, along with the amount of work the database is doing to execute the query. We also reviewed a few important plan operators we need to look out for.

In the next chapter, we will take a look at Awesome Procedures on Cypher (APOC) functions and procedures and how they help us with querying.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Graph Data Processing with Cypher
Published in: Dec 2022Publisher: PacktISBN-13: 9781804611074
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Ravindranatha Anthapu

Ravindranatha Anthapu has more than 25 years of experience in working with W3C standards or building cutting-edge technologies like integrating speech with mobile applications in the 2000s. He is a technology enthusiast who worked from operating system device drivers to writing compilers for C language to modern web technologies, transitioning seamlessly and bringing experience from each of these domains and technologies to deliver successful solutions today. As a Principal Consultant at Neo4j today, Ravindranatha works with large enterprise customers to make sure they are able to leverage graph technologies effectively across various domains.
Read more about Ravindranatha Anthapu