Reader small image

You're reading from  Mastering MongoDB 7.0 - Fourth Edition

Product typeBook
Published inFeb 2024
PublisherPackt
ISBN-139781835883501
Edition4th Edition
Concepts
Right arrow
Authors (7):
Marko Aleksendrić
Marko Aleksendrić
author image
Marko Aleksendrić

Marko Aleksendrić is an analyst, an ex-scientist, and a freelance self-taught web developer with over 20 years of experience. Marko has authored the book Modern Web Development with the FARM Stack, published by Packt Publishing. With a keen interest in backend and frontend development, he has been an avid MongoDB user for the last 15 years for various web and data analytics-related projects, with Python and JavaScript as his main tools.
Read more about Marko Aleksendrić

Arek Borucki
Arek Borucki
author image
Arek Borucki

Arek Borucki, a recognized MongoDB Champion and certified database administrator, has been working with MongoDB technology since 2016. As principal SRE database engineer, he works closely with technologies such as MongoDB, Elasticsearch, PostgreSQL, Kafka, Kubernetes, Terraform, AWS, and GCP. His extensive experience includes working with renowned companies such as Amadeus, Deutsche Bank, IBM, Nokia, and Beamery. Arek is also a Certified Kubernetes Administrator and developer, an active speaker at international conferences, and a co-author of questions for the MongoDB Associate DBA Exam.
Read more about Arek Borucki

Leandro Domingues
Leandro Domingues
author image
Leandro Domingues

Leandro Domingues is a MongoDB Community Champion and a Microsoft Data Platform MVP alumnus. Specializing in NoSQL databases, focusing on MongoDB, he has authored several articles and is also a speaker and organizer of events and conferences. In addition to teaching MongoDB, he was a professor at one of the largest universities in Brazil. Leandro is passionate about MongoDB and is a mentor and an inspiration to many developers and administrators. His efforts make MongoDB a more comprehensible tool for everyone.
Read more about Leandro Domingues

Malak Abu Hammad
Malak Abu Hammad
author image
Malak Abu Hammad

Malak Abu Hammad is a seasoned software engineering manager at Chain Reaction, with a decade of expertise in MongoDB. She has carved a niche for herself not only in MongoDB but also in essential web app technologies. Along with conducting various online and offline workshops, Malak is a MongoDB Champion and a founding member of the MongoDB Arabic Community. Her vision for MongoDB is a future with an emphasis on Arabic localization, aimed at bridging the gap between technology and regional dialects.
Read more about Malak Abu Hammad

Elie Hannouch
Elie Hannouch
author image
Elie Hannouch

Elie Hannouch is a senior software engineer and digital transformation expert. A driving force in the tech industry, he has a proven track record of delivering robust, scalable, and impactful solutions. As a start-up founder, Elie combines his extensive engineering background with strategic innovation to redefine how enterprises operate in today's digital age. Apart from being a MongoDB Champion, Elie leads the MongoDB, Google, and CNCF communities in Lebanon and works toward empowering aspiring tech professionals by demystifying complex concepts and inspiring a new generation of tech enthusiasts.
Read more about Elie Hannouch

Rajesh Nair
Rajesh Nair
author image
Rajesh Nair

Rajesh Nair is a software professional from Kerala, India, with over 12 years of experience working in various MNCs. He started his career as a database administrator for multiple RDBMS technologies, including Progress OpenEdge and MySQL. Rajesh also managed huge datasets for critical applications running on MongoDB as a MongoDB administrator for several years. He has worked on technologies such as MongoDB, AWS, Java, Kafka, MySQL, Progress OpenEdge, shell scripting, and Linux administration. Rajesh is currently based out of Amsterdam, Netherlands, working as a senior software engineer.
Read more about Rajesh Nair

Rachelle Palmer
Rachelle Palmer
author image
Rachelle Palmer

Rachelle Palmer is the Product Leader for Developer Database Experience and Developer Education at MongoDB, overseeing the driver client libraries, documentation, framework integrations, and MongoDB University. She has built sample applications for MongoDB in Java, PHP, Rust, Python, Node.js, and Ruby. Rachelle joined MongoDB in 2013 and was previously the director of the technical services engineering team, creating and managing the team that provided support and CloudOps to MongoDB Atlas.
Read more about Rachelle Palmer

View More author details
Right arrow

Introduction to MongoDB

MongoDB, the most popular document database, is a NoSQL, non-relational key-value store, a JSON database, and more. It is a robust, feature-rich developer data platform with various built-in features that you need for modern applications, such as machine learning and AI capabilities, streaming, functions, triggers, serverless, device sync, and full-text search.

Though MongoDB is non-relational, it can easily handle relational data. It offers courses, tutorials, and documentation at learn.mongodb.com on how to best model and handle that data, even while using the document format.

Who uses MongoDB

10 years ago, the use of MongoDB was somewhat niche. It was a young database with compelling features for developers like you. Now, in 2023, MongoDB is used by a myriad of varied industries, and its use cases span across all kinds of situations and types of data stored. Some of the largest banks, automakers, government agencies, and gaming companies in the world use MongoDB for their production applications. The most famous users of MongoDB are Coinbase, Epic Games, Morgan Stanley, Adobe, Tesla, Canva, Ulta Beauty, Cathay Pacific, Dongwha, and Vodafone.

The MongoDB Atlas platform has millions of users, all of whom trust that their data will be managed safely and effectively in the cloud. This popularity has taken MongoDB to great heights, not just in terms of its growth and value as a company but also in terms of its raw developer mindshare. As of 2023, one in four developers uses MongoDB extensively in production. This ratio is much larger in other developer communities such as Go and JavaScript (approximately 40%).

Why developers love MongoDB

Along with its versatality, and robust features, MongoDB is the preferred choice for several reasons, such as:

  • Flexibility and schema-less: Unlike traditional relational databases, MongoDB allows you to store and retrieve data without strict schemas or predefined structures. This flexibility is particularly useful when data evolves over time or when you are dealing with unstructured or semi-structured data.
  • Scalability and performance: MongoDB is highly scalable and performs exceptionally well, making it suitable for both large-scale applications and personal projects. MongoDB Atlas provides a free-forever tier for side projects.
  • Rich query language: MongoDB offers a powerful query language and indexing capabilities, simplifying common operations such as findOne and updateOne.
  • Developer-friendly data format: Data in MongoDB closely resembles objects in popular programming languages, reducing data mapping complexities and expediting development.
  • Simplicity and quick start: MongoDB's simplicity and hassle-free setup makes it easy to adapt. No complex sales processes or licensing hassles are involved.

What attracts most developers is the simplicity of working with MongoDB on a daily basis, in particular, the seamless experience of creating, updating, and interacting with data. For example, consider a Python developer attempting to insert a document, query that document, and receive a set of results, using the following code:

from pymongo import MongoClient
# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
db = client['mydatabase']  # Specify the database name
collection = db['mycollection']  # Specify the collection name
# Create a document to be inserted
document = {
    'name': 'Jane Doe',
    'age': 30,
    'email': 'janedoe@example.com'
}
# Insert the document into the collection
result = collection.insert_one(document)
# Check if the insertion was successful
if result.acknowledged:
    print('Document inserted successfully.')
    print('Inserted document ID:', result.inserted_id)
else:
    print('Failed to insert document.')

Note that the developer creates a dictionary representing the document to be inserted. In this case, it contains the name, age, and email details. The developer doesn't need to create an ID for the document, because MongoDB automatically creates a unique identifier on each document.

To retrieve this document, you can filter the query by using any of the document's field individually, or in combination. Let's see that in action:

from pymongo import MongoClient
# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
db = client['mydatabase']  # Specify the database name
collection = db['mycollection']  # Specify the collection name
# Retrieve documents based on specific conditions
query = {
    'age': {'$gte': 29},  # Retrieve documents where age is greater than or equal to 29
}
documents = collection.find(query)
# Iterate over the retrieved documents
for document in documents:
    print(document)

Pretty simple! The preceding example demonstrates how you can use a MongoDB query operator such as $gte (greater than or equal to) to filter your query. But the real magic happens when the document is returned. When MongoDB returns a document, it will be represented as a Python dictionary. Each field in the document is a key-value pair within the dictionary, similar to the following example:

{
    '_id': ObjectId('60f5c4c4543b5a2c7c4c73a2'),
    'name': 'Jane Doe',
    'age': 30,
    'email': 'janedoe@example.com'
}

MongoDB has a suite of language libraries and drivers that act as a translation layer between the client and server, intercepting each operation and translating it into MongoDB's query language. With this, you can interact with the data using your native programming language in a purely idiomatic way.

Alongside the other offerings of the MongoDB Atlas developer data platform, it truly abstracts away the difficulties of working with a database, and instead allows you to interact with data purely via your code and IDE. This is infinitely preferable while using a separate database shell, database UI, and other database-specific tools. Since MongoDB Atlas offers a completely managed MongoDB database, you can set up and register via a command-line interface.

At its heart, the mission of MongoDB is to be a powerful database for developers, and its features are tuned to the programming language communities and framework integrations, rather than to database administration tools. This will become more apparent in the subsequent chapters, where you'll learn more about MongoDB Atlas, Atlas Vector Search, full-text search, and features such as aggregation—all through the lens of a developer.

By the end of this book, you'll learn how effective these tools can be and make database management simpler!

Efficiency of the inherent complexity of MongoDB databases

The most interesting part of the modern database is understanding its architecture and why it's built that way. Fundamentally, MongoDB is a distributed system. The database server itself was originally built with the anticipation that most users would run it with a default configuration—replica set, sometimes also referred to as a cluster. When you explore this architecture in-depth, you'll notice the true complexities.

By default, replica set of MongoDB is a three-node configuration. All three nodes are data-bearing, which means that there is a complete copy of the database available on each node. Each database is hosted on a separate instance or host, which can be in the same availability zone, data center, or region. This default configuration is to ensure both redundancy and high availability. Chapter 2, The MongoDB Architecture will discuss replica sets in more detail.

If one of the instances becomes unresponsive or unavailable, a healthy node is promoted to become the primary node. This failover between members occurs automatically, and there's no impact on operations for the users of the database. This process considers many different factors, including node availability, data freshness, and responsiveness. This election process and protocol, while simple to understand at a high-level, is very nuanced. But since the operations continue without interruption, you hardly know or understand these details.

How is this possible?

Behind the scenes, write operations to MongoDB are propagated from the primary node to the secondary nodes via a process called replication. The best way to explain replication is with the example of a single write to the database. An inbound write from the client application (your app) will be first directed to the primary node. That primary node will apply the write to its copy of the database. Then, the write is recorded in the operations log (oplog), which is tailed by secondary nodes.

Replication in MongoDB is based on the RAFT consensus protocol. One particular example of how this implementation varies is leader elections. In the traditional RAFT protocol, leader and primary node election occurs through a combination of randomized election timeouts and message exchanges. In MongoDB, there are settings for node priority. This priority is considered along with data freshness and response time when electing a primary node.

It is often true that the write operation is not written simultaneously to all nodes—there is a lag heavily influenced by factors such as network latency, the distance between nodes, hardware configuration, and workload. If one of the mongod nodes falls behind, it will catch up or resync itself when it is able to do so using the oplog to determine the gaps in its operations. The MongoDB system monitors the replication lag between nodes to track this metric and assess whether the delay between primary and secondary nodes is acceptable, and if not, takes necessary action. This process is unique among databases as well.

This default configuration of MongoDB is a replica set with three members, where replication of data between nodes and failover between nodes are all handled automatically. This configuration is both durable and highly available, which makes it easy to use. For developers who require larger, global deployments, MongoDB has a sharded cluster model. The first thing to understand is that a sharded cluster consists of replica sets. It is a way of further dividing your data into effectively replicated partitions.

Figure 1.1: Replicated partitions set with primary and secondary nodes

If you require a global deployment with multiple terabytes of data, get started with Chapter 2, The MongoDB Architecture. It will cover how to split data, how to migrate data between regions or shards, how to marry data from multiple regions for analytics, and the performance of sharded cluster architectures.

Summary

MongoDB is a simple yet powerful database. It abstracts away many of the more complicated implementation details so that you can focus on building applications. It's easy to get started with and offers a powerful, idiomatic developer experience that allows you to interact with the database, exclusively in the programming language of your choice, via your IDE.

The rest of the book details how powerful and flexible MongoDB is, the new features in MongoDB 7.0, and how you can use it to your advantage. Besides being a great database for web applications, transactional data, flexible schemas, and high-performance workloads, it is also a great database for learning through hands-on experience and building proofs of concept.

In the next chapter, you'll see how replication and sharding can help increase reliability and availability for your applications.

You have been reading a chapter from
Mastering MongoDB 7.0 - Fourth Edition
Published in: Feb 2024Publisher: PacktISBN-13: 9781835883501
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (7)

author image
Marko Aleksendrić

Marko Aleksendrić is an analyst, an ex-scientist, and a freelance self-taught web developer with over 20 years of experience. Marko has authored the book Modern Web Development with the FARM Stack, published by Packt Publishing. With a keen interest in backend and frontend development, he has been an avid MongoDB user for the last 15 years for various web and data analytics-related projects, with Python and JavaScript as his main tools.
Read more about Marko Aleksendrić

author image
Arek Borucki

Arek Borucki, a recognized MongoDB Champion and certified database administrator, has been working with MongoDB technology since 2016. As principal SRE database engineer, he works closely with technologies such as MongoDB, Elasticsearch, PostgreSQL, Kafka, Kubernetes, Terraform, AWS, and GCP. His extensive experience includes working with renowned companies such as Amadeus, Deutsche Bank, IBM, Nokia, and Beamery. Arek is also a Certified Kubernetes Administrator and developer, an active speaker at international conferences, and a co-author of questions for the MongoDB Associate DBA Exam.
Read more about Arek Borucki

author image
Leandro Domingues

Leandro Domingues is a MongoDB Community Champion and a Microsoft Data Platform MVP alumnus. Specializing in NoSQL databases, focusing on MongoDB, he has authored several articles and is also a speaker and organizer of events and conferences. In addition to teaching MongoDB, he was a professor at one of the largest universities in Brazil. Leandro is passionate about MongoDB and is a mentor and an inspiration to many developers and administrators. His efforts make MongoDB a more comprehensible tool for everyone.
Read more about Leandro Domingues

author image
Malak Abu Hammad

Malak Abu Hammad is a seasoned software engineering manager at Chain Reaction, with a decade of expertise in MongoDB. She has carved a niche for herself not only in MongoDB but also in essential web app technologies. Along with conducting various online and offline workshops, Malak is a MongoDB Champion and a founding member of the MongoDB Arabic Community. Her vision for MongoDB is a future with an emphasis on Arabic localization, aimed at bridging the gap between technology and regional dialects.
Read more about Malak Abu Hammad

author image
Elie Hannouch

Elie Hannouch is a senior software engineer and digital transformation expert. A driving force in the tech industry, he has a proven track record of delivering robust, scalable, and impactful solutions. As a start-up founder, Elie combines his extensive engineering background with strategic innovation to redefine how enterprises operate in today's digital age. Apart from being a MongoDB Champion, Elie leads the MongoDB, Google, and CNCF communities in Lebanon and works toward empowering aspiring tech professionals by demystifying complex concepts and inspiring a new generation of tech enthusiasts.
Read more about Elie Hannouch

author image
Rajesh Nair

Rajesh Nair is a software professional from Kerala, India, with over 12 years of experience working in various MNCs. He started his career as a database administrator for multiple RDBMS technologies, including Progress OpenEdge and MySQL. Rajesh also managed huge datasets for critical applications running on MongoDB as a MongoDB administrator for several years. He has worked on technologies such as MongoDB, AWS, Java, Kafka, MySQL, Progress OpenEdge, shell scripting, and Linux administration. Rajesh is currently based out of Amsterdam, Netherlands, working as a senior software engineer.
Read more about Rajesh Nair

author image
Rachelle Palmer

Rachelle Palmer is the Product Leader for Developer Database Experience and Developer Education at MongoDB, overseeing the driver client libraries, documentation, framework integrations, and MongoDB University. She has built sample applications for MongoDB in Java, PHP, Rust, Python, Node.js, and Ruby. Rachelle joined MongoDB in 2013 and was previously the director of the technical services engineering team, creating and managing the team that provided support and CloudOps to MongoDB Atlas.
Read more about Rachelle Palmer