Mastering MongoDB 7.0 - Fourth Edition

Product type: Book
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781835883501
Pages: 398
Edition: 4th Edition
Authors (7):
Marko Aleksendrić
Arek Borucki
Leandro Domingues
Malak Abu Hammad
Elie Hannouch
Rajesh Nair
Rachelle Palmer

Table of Contents (20 Chapters)

Preface
Chapter 1: Introduction to MongoDB
Chapter 2: The MongoDB Architecture
Chapter 3: Developer Tools
Chapter 4: Connecting to MongoDB
Chapter 5: CRUD Operations and Basic Queries
Chapter 6: Schema Design and Data Modeling
Chapter 7: Advanced Querying in MongoDB
Chapter 8: Aggregation
Chapter 9: Multi-Document ACID Transactions
Chapter 10: Index Optimization
Chapter 11: MongoDB Atlas: Powering the Future of Developer Data Platforms
Chapter 12: Monitoring and Backup in MongoDB
Chapter 13: Introduction to Atlas Search
Chapter 14: Integrating Applications with MongoDB
Chapter 15: Security
Chapter 16: Auditing
Chapter 17: Encryption
Index
Other Books You May Enjoy

Efficiency of the inherent complexity of MongoDB databases

The most interesting part of a modern database is its architecture and the reasons it is built that way. Fundamentally, MongoDB is a distributed system. The database server was originally built with the anticipation that most users would run it in its default configuration: a replica set, sometimes also referred to as a cluster. When you explore this architecture in depth, you'll see its true complexity.

By default, a MongoDB replica set is a three-node configuration. All three nodes are data-bearing, which means that a complete copy of the database is available on each node. Each node is hosted on a separate instance or host, and the nodes can be in the same availability zone, data center, or region. This default configuration ensures both redundancy and high availability. Chapter 2, The MongoDB Architecture, discusses replica sets in more detail.
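
As a rough sketch, the three-node default can be pictured as a configuration document of the kind passed to `rs.initiate()` in the MongoDB shell. The set name `rs0` and the hostnames below are made up for illustration, and the `fault_tolerance` helper is a hypothetical function (not part of MongoDB) that only shows why three data-bearing nodes can survive the loss of one.

```python
# Illustrative only: the set name and hostnames are assumptions.

def build_replica_set_config(set_name, hosts):
    """Return a replica set configuration document with one member per host."""
    return {
        "_id": set_name,
        "members": [
            {"_id": index, "host": host} for index, host in enumerate(hosts)
        ],
    }


def fault_tolerance(member_count):
    """Number of voting members that can be lost while a majority remains."""
    majority = member_count // 2 + 1
    return member_count - majority


config = build_replica_set_config(
    "rs0",
    [
        "node1.example.net:27017",
        "node2.example.net:27017",
        "node3.example.net:27017",
    ],
)
```

With three members, a majority is two, so `fault_tolerance(3)` evaluates to one: the set stays available when a single node is lost.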

If the primary node becomes unresponsive or unavailable, a healthy secondary node is promoted to become the new primary. This failover between members occurs automatically, and apart from a brief pause while the election completes, there is little to no impact on operations for users of the database. The election considers many different factors, including node availability, data freshness, and responsiveness. The election protocol, while simple to understand at a high level, is very nuanced. But because operations continue largely without interruption, you rarely need to know or understand these details.

How is this possible?

Behind the scenes, write operations to MongoDB are propagated from the primary node to the secondary nodes via a process called replication. The best way to explain replication is to follow a single write to the database. An inbound write from the client application (your app) is first directed to the primary node. The primary node applies the write to its copy of the database and records it in the operations log (oplog). The secondary nodes tail the oplog and apply the same operation to their own copies.
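
The path of a single write can be sketched with a toy in-memory model. This is not how mongod is implemented: the `Node` and `Primary` classes and the `tail_oplog` function are hypothetical names, and the oplog here is a plain Python list. The sketch only shows the order of events, with the primary applying and logging the write first and a secondary replaying whatever it has not yet seen.

```python
class Node:
    """A data-bearing member holding its own copy of the database."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # this node's copy of the database
        self.applied = 0    # number of oplog entries already applied


class Primary(Node):
    """The member that accepts writes and records them in its oplog."""

    def __init__(self, name):
        super().__init__(name)
        self.oplog = []

    def write(self, key, value):
        # Apply the write locally, then record it for the secondaries.
        self.data[key] = value
        self.oplog.append((key, value))
        self.applied = len(self.oplog)


def tail_oplog(secondary, primary):
    """Apply every oplog entry the secondary has not seen; return the count."""
    new_entries = primary.oplog[secondary.applied:]
    for key, value in new_entries:
        secondary.data[key] = value
    secondary.applied += len(new_entries)
    return len(new_entries)
```

After two calls to `primary.write(...)`, a single `tail_oplog(secondary, primary)` call brings the secondary's `data` in line with the primary's, and a second call finds nothing new to apply.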

Replication in MongoDB is based on the Raft consensus protocol, although the implementation varies from it in places. One example is leader election. In the traditional Raft protocol, a leader (the equivalent of MongoDB's primary node) is elected through a combination of randomized election timeouts and message exchanges. In MongoDB, there are also settings for node priority. This priority is considered along with data freshness and response time when electing a primary node.
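
To make the role of these factors concrete, here is a deliberately simplified sketch of choosing a new primary. It is not MongoDB's election algorithm, which involves voting among members: the `Member` fields and the ranking order (freshest data first, then priority, then lowest response time) are assumptions chosen for illustration. The one rule taken from MongoDB itself is that a member with priority 0 cannot become primary.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    priority: float      # configured priority; 0 means never primary
    last_applied: int    # position of the last applied oplog entry
    ping_ms: float       # recent response time in milliseconds
    reachable: bool = True


def pick_primary(members):
    """Return the preferred candidate, or None if no member is eligible."""
    candidates = [m for m in members if m.reachable and m.priority > 0]
    if not candidates:
        return None
    # Assumed ranking: freshest data, then priority, then fastest response.
    return max(
        candidates,
        key=lambda m: (m.last_applied, m.priority, -m.ping_ms),
    )
```

In this model, an unreachable member is never chosen, and a member with a higher priority wins only against members whose data is equally fresh.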

A write operation is usually not applied to all nodes simultaneously. There is a lag, heavily influenced by factors such as network latency, the distance between nodes, hardware configuration, and workload. If one of the mongod nodes falls behind, it catches up or resyncs itself when it is able to do so, using the oplog to determine the gaps in its operations. MongoDB monitors the replication lag between nodes to assess whether the delay between the primary and secondary nodes is acceptable and, if it is not, takes the necessary action. This process is also unique among databases.
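
Replication lag can be thought of as the difference between the time of the primary's latest write and the time of the latest write a secondary has applied. The sketch below is a minimal illustration under that assumption: the function names, the member names, and the ten-second threshold are all hypothetical, and real deployments read these values from the server rather than from a hand-built dictionary.

```python
from datetime import datetime, timedelta


def replication_lag(primary_last_write, secondary_last_applied):
    """Return how far a secondary is behind the primary (never negative)."""
    return max(timedelta(0), primary_last_write - secondary_last_applied)


def lagging_members(primary_last_write, secondaries, threshold):
    """Return the sorted names of secondaries whose lag exceeds the threshold."""
    return sorted(
        name
        for name, last_applied in secondaries.items()
        if replication_lag(primary_last_write, last_applied) > threshold
    )


# Hypothetical snapshot of a three-node replica set.
primary_time = datetime(2024, 2, 1, 12, 0, 30)
secondaries = {
    "node2": datetime(2024, 2, 1, 12, 0, 29),   # 1 second behind
    "node3": datetime(2024, 2, 1, 12, 0, 10),   # 20 seconds behind
}
behind = lagging_members(primary_time, secondaries, timedelta(seconds=10))
```

Here only `node3` exceeds the assumed threshold, so it is the member that would need attention.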

The default configuration of MongoDB is a replica set with three members, in which replication of data and failover between nodes are handled automatically. This configuration is both durable and highly available, which makes it easy to use. For developers who require larger, global deployments, MongoDB has a sharded cluster model. The first thing to understand is that a sharded cluster consists of replica sets. Sharding is a way of further dividing your data into partitions, each of which is itself replicated.

Figure 1.1: Replicated partitions set with primary and secondary nodes
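
The idea of replicated partitions can be sketched as a mapping from shard names to the members of each shard's replica set, plus a routing function that sends a document to one shard based on its shard key. The shard names, hostnames, and the SHA-256 based routing below are assumptions for illustration only, not MongoDB's actual hashing or chunk placement.

```python
import hashlib

# Hypothetical cluster: each shard is itself a three-member replica set.
SHARDS = {
    "shard0": ["s0-a.example.net:27018", "s0-b.example.net:27018", "s0-c.example.net:27018"],
    "shard1": ["s1-a.example.net:27018", "s1-b.example.net:27018", "s1-c.example.net:27018"],
    "shard2": ["s2-a.example.net:27018", "s2-b.example.net:27018", "s2-c.example.net:27018"],
}


def shard_for(shard_key_value, shard_names):
    """Pick a shard deterministically from a hash of the shard key value."""
    digest = hashlib.sha256(str(shard_key_value).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shard_names)
    return shard_names[bucket]


shard_names = sorted(SHARDS)
target = shard_for("user-42", shard_names)
replicas = SHARDS[target]   # the write is then replicated within this set
```

The same shard key value always routes to the same shard, and within that shard the data is copied to every member of its replica set, exactly as in the single replica set case.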

If you require a global deployment with multiple terabytes of data, get started with Chapter 2, The MongoDB Architecture. It covers how to split data, how to migrate data between regions or shards, how to combine data from multiple regions for analytics, and how sharded cluster architectures perform.
