You're reading from Practical MongoDB Aggregations

Product type: Book
Published in: Mar 2024
Publisher: Packt
ISBN-13: 9781835884362
Edition: 1st Edition
Author: Paul Done

Paul Done is a Field CTO at MongoDB Inc., having been a Solutions Architect for the past decade at MongoDB. He has previously held roles in various software disciplines, including engineering, consulting, and pre-sales, at companies like Oracle, Novell, and BEA Systems. Paul specializes in databases and middleware, focusing on resiliency, scalability, transactions, event processing, and applying evolvable data model approaches. He spent most of the early 2000s building Java EE (J2EE) transactional systems on WebLogic, integrated with relational databases like Oracle RAC and messaging systems like MQ Series.

Foundational Examples: Filtering, Grouping, and Unwinding

This chapter provides examples of common data manipulation patterns that appear in many aggregation pipelines and are relatively straightforward to understand and adapt. Once you have a baseline knowledge of these foundational examples, you will be well positioned to tackle the more advanced examples later in this book.

This chapter will cover the following:

  • Finding the most recent subset of data
  • Grouping and summarizing data
  • Unwinding arrays and grouping them differently
  • Capturing a list of unique values

Filtered top subset

First, you will look at an example that demonstrates how to query a sorted subset of data. As with all subsequent examples, this first example provides the commands you need to populate the dataset in your own MongoDB database and then apply the aggregation pipeline to produce the results shown.

Scenario

You need to query a collection of people to find the three youngest individuals who have a job in engineering, sorted by the youngest person first.
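
One way to express this scenario as a pipeline is sketched below. It is not necessarily the book's exact pipeline: the field names `vocation`, `dateofbirth`, and `address`, and the value `"ENGINEER"`, are assumptions inferred from the record description, because the sample data is not shown in full here. The `typeof db` guard only lets the snippet load outside mongosh as well.

```javascript
// Sketch only: field names and the "ENGINEER" value are assumed, not taken from the sample data.
var pipeline = [
  // Keep only people whose job is in engineering
  {"$match": {"vocation": "ENGINEER"}},

  // Youngest first: the latest date of birth sorts to the top
  {"$sort": {"dateofbirth": -1}},

  // Keep only the first three people
  {"$limit": 3},

  // Drop fields that the result does not need
  {"$unset": ["_id", "vocation", "address"]},
];

// Run the aggregation only where a mongosh `db` handle exists
if (typeof db !== "undefined") {
  printjson(db.persons.aggregate(pipeline).toArray());
}
```

Placing `$match` first means the sort and limit only have to process the engineers, not the whole collection.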

Note

This example is the only one in the book that you can also achieve entirely using the MongoDB Query Language. It therefore serves as a helpful comparison between the MongoDB Query Language and aggregation pipelines.
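
For comparison, the same scenario could plausibly be written as a MongoDB Query Language `find()` call, as in the minimal sketch below. The field names (`vocation`, `dateofbirth`, `address`) and the value `"ENGINEER"` are assumptions, since the sample data is truncated here; the variable names are purely illustrative.

```javascript
// Sketch only: field names and the "ENGINEER" value are assumed.
var filter = {"vocation": "ENGINEER"};                       // who to include
var projection = {"_id": 0, "vocation": 0, "address": 0};    // fields to leave out
var youngestFirst = {"dateofbirth": -1};                     // latest birth date first

// Run the query only where a mongosh `db` handle exists
if (typeof db !== "undefined") {
  printjson(
    db.persons.find(filter, projection).sort(youngestFirst).limit(3).toArray()
  );
}
```

The filter, sort, limit, and projection here each correspond to one stage that an aggregation pipeline would declare explicitly and in order.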

Populating the sample data

To start with, drop any old version of the database (if it exists) and then populate a new persons collection with six person documents. Each person record will contain the person's ID, first name, last name, date of birth, vocation, and address:

db = db.getSiblingDB...

Group and total

This next section provides an example of the most commonly used pattern for grouping and summarizing data from a collection.

Scenario

You need to generate a report to show what each shop customer purchased in 2020. You will group the individual order records by customer, capturing each customer's first purchase date, the number of orders they made, the total value of all their orders, and a list of their order items sorted by date.
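
A possible pipeline for this report is sketched below. The `orderdate` field matches the index created in the setup commands; `customer_id` and `value` are assumed field names, as are the output field names, because the inserted records are not shown in full here.

```javascript
// Sketch only: `customer_id`, `value`, and the output field names are assumed.
var pipeline = [
  // Keep only orders placed in 2020
  {"$match": {
    "orderdate": {
      "$gte": new Date("2020-01-01T00:00:00Z"),
      "$lt": new Date("2021-01-01T00:00:00Z"),
    },
  }},

  // Earliest order first, so $first below picks the first purchase
  {"$sort": {"orderdate": 1}},

  // One output record per customer
  {"$group": {
    "_id": "$customer_id",
    "first_purchase_date": {"$first": "$orderdate"},
    "total_value": {"$sum": "$value"},
    "total_orders": {"$sum": 1},
    "orders": {"$push": {"orderdate": "$orderdate", "value": "$value"}},
  }},

  // Order the customers by their first purchase date
  {"$sort": {"first_purchase_date": 1}},

  // Expose the grouping key under a friendlier name
  {"$set": {"customer_id": "$_id"}},
  {"$unset": ["_id"]},
];

// Run the aggregation only where a mongosh `db` handle exists
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(pipeline).toArray());
}
```

The half-open date range (`$gte` the start of 2020, `$lt` the start of 2021) avoids having to guess the last representable instant of the year.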

Populating the sample data

To start with, drop any old version of the database (if it exists) and then populate a new orders collection with nine order documents spanning 2019 to 2021, for three different customers. Each order record will contain a customer ID, the date of the order, and the dollar total for the order:

db = db.getSiblingDB("book-group-and-total");
db.dropDatabase();
// Create index for an orders collection
db.orders.createIndex({"orderdate": -1});
// Insert records into the orders...

Unpack arrays and group differently

You applied filters and groups to whole documents in the previous two examples. In this example, you will work with an array field contained in each document, unraveling each array's contents to enable you to subsequently group the resulting raw data in a way that helps you produce a critical business summary report.

Scenario

You want to generate a retail report to list the total value and quantity of expensive products sold (valued over 15 dollars). The source data is a list of shop orders, where each order contains the set of products purchased as part of the order.
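
The unwind-then-regroup approach for this report might look like the sketch below. The array field `products` and its sub-fields `prod_id`, `name`, and `price` are assumed names based on the record description, and the threshold is written as a plain number; the actual sample data may store prices in a decimal type.

```javascript
// Sketch only: `products`, `prod_id`, `name`, and `price` are assumed field names.
var pipeline = [
  // Emit one record per product in each order's array
  {"$unwind": {"path": "$products"}},

  // Keep only products priced above 15 dollars
  {"$match": {"products.price": {"$gt": 15}}},

  // Regroup by product instead of by order
  {"$group": {
    "_id": "$products.prod_id",
    "product": {"$first": "$products.name"},
    "total_value": {"$sum": "$products.price"},
    "quantity": {"$sum": 1},
  }},

  // Expose the grouping key under a friendlier name
  {"$set": {"product_id": "$_id"}},
  {"$unset": ["_id"]},
];

// Run the aggregation only where a mongosh `db` handle exists
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(pipeline).toArray());
}
```

After `$unwind`, each record carries a single product in `products`, which is why the later stages can address `products.price` as an ordinary field.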

Populating the sample data

Drop any old version of the database (if it exists) and then populate a new orders collection where each document contains an array of products purchased. Each order document contains an order ID plus a list of products purchased as part of the order, including each product's ID, name, and price:

db = db.getSiblingDB("book-unpack...

Distinct list of values

A common requirement in an application's user interface is a drop-down picklist that lets the user select one of the possible values for a form's input field. Here, you will learn how to produce a list of unique values ready for use in a drop-down widget.

Scenario

You want to query a collection of people where each document contains data on one or more languages spoken by the person. The query result should be an alphabetically sorted list of unique languages that a developer can subsequently use to populate a list of values in a user interface's drop-down widget.

This example is the equivalent of a SELECT DISTINCT statement in SQL.
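
A minimal sketch of such a distinct-values pipeline follows. It assumes the spoken languages are held in a field named `language`; the real field name is not visible in the truncated sample data.

```javascript
// Sketch only: the `language` field name is assumed.
var pipeline = [
  // Emit one record per language spoken by each person
  {"$unwind": {"path": "$language"}},

  // Grouping on the language collapses duplicates into one record each
  {"$group": {"_id": "$language"}},

  // Alphabetical order
  {"$sort": {"_id": 1}},

  // Rename the grouping key to something a UI developer would expect
  {"$set": {"language": "$_id"}},
  {"$unset": ["_id"]},
];

// Run the aggregation only where a mongosh `db` handle exists
if (typeof db !== "undefined") {
  printjson(db.persons.aggregate(pipeline).toArray());
}
```

Grouping with only an `_id` key and no accumulators is what yields the distinct behavior: every unique value becomes exactly one output record.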

Populating the sample data

Drop any old version of the database (if it exists) and then populate a new persons collection. Each person document includes the person's first name, last name, vocation, and spoken languages:

db = db.getSiblingDB("book-distinct...

Summary

You started this chapter by looking at a simple example for extracting a sorted top subset of data from a collection. Then, you moved on to applying strategies for grouping and summarizing data, followed by how to unpack arrays of values to group differently. Finally, you finished with an aggregation to generate a list of unique values from a collection.

In the next chapter, you'll see how to use various approaches to join data from two different collections.
