
You're reading from MongoDB Fundamentals

Product type: Book
Published in: Dec 2020
Publisher: Packt
ISBN-13: 9781839210648
Edition: 1st Edition

Authors (4):
Amit Phaltankar

Amit Phaltankar is a software developer and a blogger experienced in building lightweight and efficient software components. He specializes in writing web-based applications and handling large-scale data sets using traditional SQL, NoSQL, and big data technologies. He is experienced in many technology stacks and loves learning and adapting to new technology trends. Amit is passionate about improving his skill set and loves guiding and mentoring his peers and contributing to blogs. He is also an author of MongoDB Fundamentals.

Juned Ahsan

Juned Ahsan is a software professional with more than 14 years of experience. He has built software products and services for companies and clients such as Cisco, Nuamedia, IBM, Nokia, Telstra, Optus, Pizzahut, AT&T, Hughes, Altran, and others. Juned has vast experience in building software products and architecting platforms of different sizes from scratch. He loves to help and mentor others and is a top 1% contributor on StackOverflow. He is passionate about cognitive CX, cloud computing, artificial intelligence, and NoSQL databases.

Michael Harrison

Michael Harrison started his career at the Australian telecommunications leader Telstra. He worked across their networks, big data, and automation teams. He is now a lead software developer and the founding member of Southbank Software, a Melbourne-based startup that builds tools for the next generation of database technologies.

Liviu Nedov

Liviu Nedov is a senior consultant with more than 20 years of experience in database technologies. He has provided professional and consulting services to customers in Australia and Europe. Throughout his career, he has designed and implemented large enterprise projects for customers such as Wotif Group, Xstrata Copper/Glencore, the University of Newcastle, and Energy Queensland. He is currently working at Data Intensity, the largest multi-cloud service provider for applications, databases, and business intelligence. In recent years, he has been actively involved in MongoDB NoSQL database projects, database migrations, and cloud DBaaS (Database as a Service) projects.


9. Performance

Overview

This chapter introduces you to the concepts of query optimization and performance improvement in MongoDB. You will first explore the internal workings of query execution and identify the factors that can affect query performance, before moving on to database indexes and how indexes can reduce query execution time. You will also learn how to create, list, and delete indexes, and study the various types of indexes and their benefits. In the final sections, you will be introduced to various query optimization techniques that can help you use indexes effectively. By the end of this chapter, you will be able to analyze queries and use indexes and optimization techniques to improve query performance.

Introduction

In the previous chapters, we learned about the MongoDB query language and various query operators. We learned how to write queries to retrieve data, as well as the various commands used to add, delete, and update data. We made sure that the queries returned the desired output; however, we did not pay much attention to their execution time or efficiency. In this chapter, we will focus on how to analyze a query's performance and, if needed, optimize it further.

Real-world applications are made up of multiple components, such as a user interface, processing components, and databases. The responsiveness of an application depends on the efficiency of each of these components. The database component performs different operations, such as saving, reading, and updating data. The amount of data a database table or collection stores, or the amount of data being pushed into or retrieved from a database...

Query Analysis

In order to write efficient queries, it is important to analyze them, find any possible performance issues, and fix them. This technique is called performance optimization. There are many factors that can negatively affect the performance of a query, such as incorrect scaling, incorrectly structured collections, and inadequate resources such as RAM and CPU. However, the biggest and most common factor is the difference between the number of records scanned and the number of records returned during the query execution. The greater the difference is, the slower the query will be. Thankfully, in MongoDB, this factor is the easiest to address and is done using indexes.
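
One way to observe the gap between the number of documents scanned and the number returned is through the explain output's execution statistics. The following is a minimal sketch (the exact output fields can vary by MongoDB version); in the executionStats section, compare totalDocsExamined with nReturned, where a large gap indicates many documents scanned per document returned:

db.movies.explain("executionStats").find(
    {"year" : 2015}
)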

Creating and using indexes on a collection narrows down the number of records being scanned and improves the query performance noticeably. Before we delve further into indexes, though, we first need to cover the details of query execution.

Say you want to find a list of the movies released in the...

Introduction to Indexes

Databases can maintain and use indexes to make searches more efficient. In MongoDB, indexes are created on a field or a combination of fields. The database maintains a special registry of the indexed fields and some of their data. The registry is easily searchable, as it maintains a logical link between the values of an indexed field and the respective documents in the collection. During a search operation, the database first locates the value in the registry and then identifies the matching documents in the collection accordingly. The values in the registry are always kept sorted in ascending or descending order, which helps during range searches and when sorting results.

To better understand how the index registry helps during searches, imagine you are searching for a theater by its ID, as follows:

db.theaters.find(
    {"theaterId" : 1009}
)

When the query is executed on the sample_mflix database, it returns a...

Creating and Listing Indexes

Indexes can be created by executing a createIndex() command on a collection, as follows:

db.collection.createIndex(
    keys,
    options
)

The first argument to the command is a document of key-value pairs, where each pair consists of a field name and a sort order, and the optional second argument is a document of options that controls how the index is created.
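
For instance, here is a minimal sketch of how the keys and options documents are passed (the field and the custom index name are chosen purely for illustration); getIndexes() can then be used to list the indexes that exist on a collection:

db.movies.createIndex(
    { "title" : 1 },          // keys: ascending index on the title field
    { "name" : "idx_title" }  // options: a custom index name (assumed for illustration)
)

// List all indexes defined on the collection
db.movies.getIndexes()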

In a previous section, you wrote the following query to find all the movies released in 2015, sort them in descending order of the number of awards won, and print the title and number of wins:

db.movies.find(
    { 
        "year" : 2015
    },
    {
        "title" : 1, 
        "awards.wins" : 1
    }
).sort(
    {"awards.wins" : -1}
)

As the query uses a...

Query Analysis after Indexes

In the Query Analysis section, you analyzed the performance of a query that did not have suitable indexes to support its query condition. Because of this, the query scanned all 23539 documents in the collection to return 484 matching documents. Now that you have added an index on the year field, let's see how the query execution stats have changed.

The following query prints the execution statistics for the same query:

db.movies.explain("executionStats").find(
    { 
        "year" : 2015
    },
    {
        "title" : 1, 
        "awards.wins" : 1
    }
).sort(
    {"awards.wins" : -1}
)

The output for this is slightly different than the previous one, as shown in the following...

Hiding and Dropping Indexes

Dropping an index means removing the indexed field's values from the index registry. As a result, any searches on that field will be performed in a linear fashion, provided there are no other indexes present on the field.
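
Before dropping an index permanently, it can sometimes be useful to hide it from the query planner first and observe the impact. The following is a hedged sketch, assuming MongoDB 4.4 or later, where the hideIndex() and unhideIndex() shell helpers are available; the index on the title field is used purely for illustration:

// The planner stops using the index, but it is still maintained on writes
db.movies.hideIndex({ "title" : 1 })

// Make the index visible to the planner again
db.movies.unhideIndex({ "title" : 1 })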

It is important to note that MongoDB does not allow updating an existing index. Thus, to fix an incorrectly created index, we need to drop it and recreate it correctly.

An index is deleted using the dropIndex function. It takes a single parameter, which can either be the index name or the index specification document, as follows:

db.collection.dropIndex(indexNameOrSpecification)
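
For example, to drop an index by name (a sketch that assumes the index was created without a custom name, in which case MongoDB derives the default name title_1 from the key pattern):

db.movies.dropIndex("title_1")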

The index specification document is the definition that was used to create the index (as in the following snippet, for example):

db.movies.createIndex(
    {title: 1}
)

Consider the following snippet:

db.movies.dropIndex(
     {title: 1}
)

This command drops the index on the title field of the movies...

Types of Indexes

We have seen how indexes help with query performance and how we can create, drop, and list indexes in the collection. MongoDB supports different types of indexes, such as single key, multikey, and compound indexes. Each of these indexes has different advantages that you will need to know before deciding which type is suitable for your collection. Let's start with a brief overview of default indexes.

Default Indexes

As seen in the previous chapters, each document in a collection has a primary key (namely, the _id field), which is indexed by default. MongoDB uses this index to maintain the uniqueness of the _id field, and it is available on all collections.
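
As a quick check (an illustrative sketch; the exact fields in the output can vary by server version), listing the indexes on any collection shows this default index, which is named _id_:

db.movies.getIndexes()
// Returns, among any other indexes, an entry similar to:
// { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" }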

Single-Key Indexes

An index created using a single field from a collection is called a single-key index. You used a single-key index earlier in this chapter. The syntax is as follows:

db.collection.createIndex({ field1: type}, {options})
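
For example, the index on the year field that you created earlier in this chapter is a single-key index (shown here as an illustrative sketch of its simplest form, with the options document omitted):

db.movies.createIndex({ "year" : 1 })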

Compound Indexes

Single-key indexes are preferable...
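
As a hedged sketch of the general idea, a compound index combines more than one field in a single index; for the earlier query that filters on year and sorts by awards.wins in descending order, such an index might look like this:

db.movies.createIndex(
    { "year" : 1, "awards.wins" : -1 }  // equality on year first, then the sort field
)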

Properties of Indexes

In this section, we will cover the different properties of indexes in MongoDB. An index property can influence how an index is used and can also enforce some behavior on the collection. Index properties are passed as options to the createIndex function. We will be looking at unique indexes, TTL (time to live) indexes, sparse indexes, and finally, partial indexes.

Unique Indexes

A unique index property restricts the duplication of the index key. This is useful if you want to maintain the uniqueness of a field in a collection. Unique fields help identify documents precisely and without ambiguity. For example, in a license collection, a unique field such as license_number can help identify each document individually. This property enforces behavior on the collection such that duplicate entries are rejected. Unique indexes can be created on a single field or on a combination of fields. The following is the syntax to create a unique index on a single...
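
A minimal sketch, using the license collection and license_number field mentioned above, looks like this; the unique property is passed in the options document:

db.license.createIndex(
    { "license_number" : 1 },
    { "unique" : true }       // reject documents that duplicate an existing license_number
)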

Other Query Optimization Techniques

So far, we have seen the internal workings of queries and how indexes help limit the number of documents to be scanned. We have also explored various types of indexes and their properties and learned how we can use the correct index and correct index properties in specific use cases. Creating the right index can improve query performance, but there are a few more techniques that are required to fine-tune the query performance. We will cover those techniques in this section.

Fetch Only What You Need

The performance of a query is also affected by the amount of data it returns. The database server and client communicate over a network, so if a query produces a large amount of data, it will take longer to transfer it. Moreover, to be transferred over the network, the data needs to be transformed and serialized by the server and deserialized by the receiving client. This means that the database client will have to wait longer to get the...
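
A practical way to reduce the amount of data returned is to use a projection so that only the required fields are sent back. The following is an illustrative sketch based on the earlier movies query; the projection keeps only title and awards.wins and explicitly excludes _id:

db.movies.find(
    { "year" : 2015 },
    { "title" : 1, "awards.wins" : 1, "_id" : 0 }
)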

Summary

In this chapter, you practiced improving query performance. You first explored the internal workings of query execution and the query execution stages. You then learned how to analyze a query's performance and identify any existing problems based on the execution statistics. Next, you reviewed the concept of indexes; how they solve performance issues for a query; various ways to create, list, and delete indexes; different types of indexes; and their properties. In the final sections of this chapter, you studied query optimization techniques and got a brief look at the overheads associated with indexes. In the next chapter, you will learn about the concept of replication and how it is implemented in MongoDB.

