Reader small image

You're reading from  Practical MongoDB Aggregations

Product typeBook
Published inSep 2023
PublisherPackt
ISBN-139781835080641
Edition1st Edition
Tools
Right arrow
Author (1)
Paul Done
Paul Done
author image
Paul Done

Paul Done is a Field CTO at MongoDB Inc., having been a Solutions Architect for the past decade at MongoDB. He has previously held roles in various software disciplines, including engineering, consulting, and pre-sales, at companies like Oracle, Novell, and BEA Systems. Paul specializes in databases and middleware, focusing on resiliency, scalability, transactions, event processing, and applying evolvable data model approaches. He spent most of the early 2000s building Java EE (J2EE) transactional systems on WebLogic, integrated with relational databases like Oracle RAC and messaging systems like MQ Series.
Read more about Paul Done

Right arrow

Harnessing the Power of Expressions

In this chapter, you will learn about the different types of aggregation expressions, how to combine them, and how they can help you enhance your aggregation pipelines. Using nested expressions can be highly effective for solving complex problems, particularly those involving arrays. Since nesting introduces added complexity, this chapter devotes significant attention to guiding you through the intricacies of crafting composite expressions for array processing.

To summarize, you will learn the following key concepts in this chapter:

  • Types of aggregation expressions
  • How to chain expressions together
  • The power array operators
  • Conditional comparisons
  • Techniques for looping through and processing array elements

Let's begin by exploring the various types of aggregation expressions.

Aggregation expressions explained

Aggregation expressions provide syntax and a library of commands to allow you to perform sophisticated data operations within many of the stages you include in your aggregation pipelines. You can use expressions within the pipeline to perform tasks such as the following:

  • Compute values (e.g., calculate the average value of an array of numbers)
  • Convert an input field's value (e.g., a string) into an output field's value (e.g., a date)
  • Extract the specific reoccurring field's value from an array of sub-documents into a new list of values
  • Transform the shape of an input object into an entirely differently structured output object

In many cases, you can nest expressions within other expressions, enabling a high degree of sophistication in your pipelines, albeit sometimes at the cost of making your pipelines appear complex.

You can think of an aggregation expression as being one of three possible types:

    ...

What do expressions produce?

An expression can be an operator (e.g., {$concat: ...}), a variable (e.g., "$$ROOT"), or a field path (e.g., "$address"). In all these cases, an expression is just something that dynamically populates and returns a new element, which can be one of the following types:

  • Number (including integer, long, float, double, and decimal128)
  • String (UTF-8)
  • Boolean
  • DateTime (UTC)
  • Array
  • Object

However, a specific expression can restrict you to returning just one or a few of these types. For example, the {$concat: ...} operator, which combines multiple strings, can only produce a string data type (or null). The "$$ROOT" variable can only return an object that refers to the root document currently being processed in the pipeline stage.

A field path (e.g., "$address") is different and can return an element of any data type, depending on what the field refers to in the current input document....

Can all stages use expressions?

There are many types of stages in the aggregation framework that don't allow expressions to be embedded. Here are some examples of some of the most popular of these stages:

  • $match
  • $limit
  • $skip
  • $sort
  • $count
  • $lookup
  • $out

Some of these stages may be a surprise to you if you've never really thought about it before. You might consider $match to be the most surprising item in this list. The content of a $match stage is just a set of query conditions with the same syntax as MongoDB Query Language rather than an aggregation expression. There is a good reason for this. The aggregation engine reuses the MongoDB Query Language query engine to perform a regular query against the collection, enabling the query engine to use all its usual optimizations. The query conditions are taken as-is from the $match stage at the top of the pipeline. Therefore, the $match filter must use the same syntax as MongoDB Query Language...

Advanced use of expressions for array processing

One of the most compelling aspects of MongoDB is the ability to embed arrays within documents. Unlike relational databases, this characteristic allows each entity's entire data structure to exist in one place as a document.

The aggregation framework provides a rich set of aggregation operator expressions for analyzing and manipulating arrays. When optimizing for performance, array expressions are critical because they prevent the unwinding and regrouping of documents when you only need to process each document's array in isolation. For most situations when you need to manipulate an array, there is usually a single array operator expression that you can utilize for your requirements.

Occasionally, you won't be able to use a single out-of-the-box array operator expression to solve an array processing challenge. Consequently, you must assemble a composite of nested lower-level expressions to handle the challenging array...

Summary

In this chapter, you started your journey with basic aggregation expressions. You explored the different types of expressions and how to combine them using nesting to solve complex data transformations. Then you moved on to bootstrapping this knowledge to undertake typically complicated tasks related to mutating arrays and extracting detail from the contents of arrays. There was a particular focus on techniques for looping through array elements efficiently without necessarily having to resort to unwinding and regrouping documents, where you only need to process each document's array in isolation.

The next chapter will enable you to understand the impact of sharding on aggregation pipelines and how to ensure your pipelines run efficiently when your database is sharded.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Practical MongoDB Aggregations
Published in: Sep 2023Publisher: PacktISBN-13: 9781835080641
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Paul Done

Paul Done is a Field CTO at MongoDB Inc., having been a Solutions Architect for the past decade at MongoDB. He has previously held roles in various software disciplines, including engineering, consulting, and pre-sales, at companies like Oracle, Novell, and BEA Systems. Paul specializes in databases and middleware, focusing on resiliency, scalability, transactions, event processing, and applying evolvable data model approaches. He spent most of the early 2000s building Java EE (J2EE) transactional systems on WebLogic, integrated with relational databases like Oracle RAC and messaging systems like MQ Series.
Read more about Paul Done