You're reading from  Mastering MongoDB 7.0 - Fourth Edition

Product type: Book
Published in: Jan 2024
Publisher: Packt
ISBN-13: 9781835460474
Edition: 4th Edition
Authors (7):
Marko Aleksendrić

Marko Aleksendrić is an analyst, an ex-scientist, and a freelance self-taught web developer with over 20 years of experience. Marko has authored the book Modern Web Development with the FARM Stack, published by Packt Publishing. With a keen interest in backend and frontend development, he has been an avid MongoDB user for the last 15 years for various web and data analytics-related projects, with Python and JavaScript as his main tools.

Arek Borucki

Arek Borucki, a recognized MongoDB Champion and certified database administrator, has been working with MongoDB technology since 2016. As principal SRE database engineer, he works closely with technologies such as MongoDB, Elasticsearch, PostgreSQL, Kafka, Kubernetes, Terraform, AWS, and GCP. His extensive experience includes working with renowned companies such as Amadeus, Deutsche Bank, IBM, Nokia, and Beamery. Arek is also a Certified Kubernetes Administrator and developer, an active speaker at international conferences, and a co-author of questions for the MongoDB Associate DBA Exam.

Leandro Domingues

Leandro Domingues is a MongoDB Community Champion and a Microsoft Data Platform MVP alumnus. Specializing in NoSQL databases, focusing on MongoDB, he has authored several articles and is also a speaker and organizer of events and conferences. In addition to teaching MongoDB, he was a professor at one of the largest universities in Brazil. Leandro is passionate about MongoDB and is a mentor and an inspiration to many developers and administrators. His efforts make MongoDB a more comprehensible tool for everyone.

Malak Abu Hammad

Malak Abu Hammad is a seasoned software engineering manager at Chain Reaction, with a decade of expertise in MongoDB. She has carved a niche for herself not only in MongoDB but also in essential web app technologies. Along with conducting various online and offline workshops, Malak is a MongoDB Champion and a founding member of the MongoDB Arabic Community. Her vision for MongoDB is a future with an emphasis on Arabic localization, aimed at bridging the gap between technology and regional dialects.

Elie Hannouch

Elie Hannouch is a senior software engineer and digital transformation expert. A driving force in the tech industry, he has a proven track record of delivering robust, scalable, and impactful solutions. As a start-up founder, Elie combines his extensive engineering background with strategic innovation to redefine how enterprises operate in today's digital age. Apart from being a MongoDB Champion, Elie leads the MongoDB, Google, and CNCF communities in Lebanon and works toward empowering aspiring tech professionals by demystifying complex concepts and inspiring a new generation of tech enthusiasts.

Rajesh Nair

Rajesh Nair is a software professional from Kerala, India, with over 12 years of experience working in various MNCs. He started his career as a database administrator for multiple RDBMS technologies, including Progress OpenEdge and MySQL. Rajesh also managed huge datasets for critical applications running on MongoDB as a MongoDB administrator for several years. He has worked on technologies such as MongoDB, AWS, Java, Kafka, MySQL, Progress OpenEdge, shell scripting, and Linux administration. Rajesh is currently based out of Amsterdam, Netherlands, working as a senior software engineer.

Rachelle Palmer

Rachelle Palmer is the Product Leader for Developer Database Experience and Developer Education at MongoDB, overseeing the driver client libraries, documentation, framework integrations, and MongoDB University. She has built sample applications for MongoDB in Java, PHP, Rust, Python, Node.js, and Ruby. Rachelle joined MongoDB in 2013 and was previously the director of the technical services engineering team, creating and managing the team that provided support and CloudOps to MongoDB Atlas.


The MongoDB Architecture

MongoDB is a developer data platform built on several core architectural foundations, designed to meet the demands of modern transactional, operational, and analytical applications. This chapter examines the MongoDB architecture with a special emphasis on two key elements: replication and sharding.

Replication is a crucial component in MongoDB's distributed architecture, ensuring data accessibility and resilience to faults. It enables you to spread identical datasets across various database servers, safeguarding against the failure of a single server.

Additionally, you will learn about sharding, a horizontal scaling strategy for spreading data across several machines. As applications grow in popularity, and the volume of data they produce increases, scaling across machines becomes essential to ensure sufficient read and write throughput.

This chapter will cover the following topics...

Replication vs sharding

People often confuse replication with sharding. Both are techniques for distributing data across database servers, but they serve distinct purposes and are used for different reasons. Replication duplicates data and stores it in multiple locations to ensure redundancy and reliability, playing a vital role in data protection and accessibility.

On the other hand, sharding involves dividing a larger database into smaller, more manageable parts, called shards. Each shard stores a portion of the total dataset on a separate database server instance. However, it's important to note that each shard must also implement replication to maintain data integrity and availability.

The goal of combining sharding with replication is to ensure data durability and high availability. When a shard's server instance fails and there's only a single copy of data on that shard, it can result in unavailability of data until the server is restored...

Replication

A replica set in MongoDB is a group of mongod processes that maintain the same dataset. Replica sets provide redundancy and high availability, and serve as the foundation for all production deployments. By keeping multiple copies of the data on different database servers, replication provides a degree of fault tolerance against the failure of a single database server.

Figure 2.1: A replica set

The primary node handles all write operations, and logs all dataset changes in its operations log (oplog). A MongoDB replica set can only have one primary node.

The secondary nodes replicate the primary's oplog and apply the operations to their own datasets, so that they mirror the primary's dataset. If the primary becomes inaccessible, an eligible secondary will initiate an election to become the new primary.
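The flow described above (writes go to the primary, the primary records them in the oplog, secondaries apply the oplog, and a secondary takes over on failure) can be sketched as a toy in-memory model. This is a simplified illustration, not MongoDB's implementation: the `ToyReplicaSet` and `Node` classes and the node names are hypothetical, and the "election" here simply promotes the most up-to-date secondary, whereas real elections involve voting among members.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A toy replica set member: a dataset plus how much of the oplog it has applied."""
    name: str
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of oplog entries applied so far


class ToyReplicaSet:
    """Hypothetical, heavily simplified model of a replica set."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.primary = self.nodes[0]  # exactly one primary at a time
        self.oplog = []               # ordered (key, value) write operations

    def write(self, key, value):
        """Writes are handled by the primary, which records them in the oplog."""
        self.oplog.append((key, value))
        self.primary.data[key] = value
        self.primary.applied = len(self.oplog)

    def replicate(self):
        """Each secondary applies the oplog entries it has not applied yet."""
        for node in self.nodes:
            if node is self.primary:
                continue
            for key, value in self.oplog[node.applied:]:
                node.data[key] = value
            node.applied = len(self.oplog)

    def fail_primary(self):
        """Drop the primary and promote the most up-to-date secondary (toy election)."""
        self.nodes.remove(self.primary)
        self.primary = max(self.nodes, key=lambda n: n.applied)
        # In this toy model, entries the new primary never received are discarded.
        self.oplog = self.oplog[: self.primary.applied]
        return self.primary


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write("user:1", {"name": "Ada"})
rs.replicate()
new_primary = rs.fail_primary()
print(new_primary.name, new_primary.data)
```

Because the write was replicated before the failure, the promoted node already holds the same data as the old primary, which is the property that lets clients keep working after a failover.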

With data stored across multiple servers, replication increases the reliability of the system....
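To make the reliability claim concrete, here is a small arithmetic sketch. It assumes the usual rule that electing a primary requires a majority of the voting members, so the number of members that can fail while a primary can still be elected is the total minus that majority. The function names are illustrative only.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be unavailable while a majority remains."""
    return voting_members - majority(voting_members)


for n in (3, 4, 5, 7):
    print(f"members={n} majority={majority(n)} fault_tolerance={fault_tolerance(n)}")
```

Note that going from three to four members raises the required majority without raising fault tolerance, which is one reason odd member counts are a common choice.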

Sharding

MongoDB supports horizontal scaling through sharding. Sharding distributes data across multiple machines and plays an essential role in managing and organizing large-scale data. It divides a larger database into smaller, more manageable components known as shards. Each shard is stored on a separate database server instance, which spreads the load across servers. Sharding also allows for the creation of distributed databases to support geographically distributed applications, enabling policies that enforce data residency within specific regions.
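The idea of spreading documents across shards can be illustrated with a minimal hash-based routing sketch. This is a deliberate simplification: the shard names and the `route` function are hypothetical, and the modulo placement used here is not how MongoDB assigns data (MongoDB routes by ranges of shard key values tracked in cluster metadata). It only shows how a key can deterministically map each document to one of several servers.

```python
import hashlib
from collections import Counter

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def route(shard_key_value: str, shards=SHARDS) -> str:
    """Map a shard key value to a shard by hashing it (toy modulo placement)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


# Route 9,000 made-up customer IDs and count how many land on each shard.
placement = Counter(route(f"customer-{i}") for i in range(9000))
for shard, count in sorted(placement.items()):
    print(shard, count)
```

With a well-distributed key, each shard receives roughly a third of the documents, so no single server has to store or serve the whole dataset.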

Why do you need sharding?

Consider a scenario where your data is growing quickly and your database is approaching its maximum capacity. This situation can cause a range of problems. Typically, the most pressing one is performance degradation. As your database grows, the time required to query and retrieve data can increase significantly...

New sharded cluster features in MongoDB 7.0

MongoDB 7.0 continues to simplify the management and understanding of sharded clusters for both operations and developer use cases. This version provides additional insights that assist in making the best decisions for both initial and future shard key selections. Furthermore, starting in MongoDB 7.0, developers can experience a consistent interface when using these commands on sharded or unsharded clusters, while still retaining the option to optimize for performance when necessary. Let's have a look at the new sharding features introduced in MongoDB 7.0.

Shard key advisor commands

Choosing a shard key is complex because of intricate data access patterns and trade-offs. The new features in MongoDB 7.0 aim to ease this task:

  • analyzeShardKey provides the ability to evaluate a candidate shard key against existing data. Shard key analysis in MongoDB 7.0 offers metrics to evaluate a shard key's suitability, including its...
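As a rough sketch of how analyzeShardKey might be invoked from Python, the snippet below builds the command document and shows, in comments, how it could be sent with PyMongo. The namespace `shop.orders`, the candidate key `customerId`, the helper name, and the connection string are all made up for illustration. The `keyCharacteristics` and `readWriteDistribution` options are included on the assumption that they match the command reference for your server version, so check the official documentation for the exact options and prerequisites before relying on this.

```python
def build_analyze_shard_key_command(
    namespace: str,
    candidate_key: dict,
    key_characteristics: bool = True,
    read_write_distribution: bool = True,
) -> dict:
    """Build an analyzeShardKey command document (hypothetical helper).

    The command name must be the first key, so it is inserted first.
    """
    return {
        "analyzeShardKey": namespace,           # "<database>.<collection>"
        "key": candidate_key,                   # candidate shard key to evaluate
        "keyCharacteristics": key_characteristics,
        "readWriteDistribution": read_write_distribution,
    }


command = build_analyze_shard_key_command("shop.orders", {"customerId": 1})
print(command)

# With a reachable MongoDB 7.0+ deployment and PyMongo installed, the command
# could be sent to the admin database like this (assumed connection string):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   result = client.admin.command(command)
#   print(result)
```

Keeping the command construction in a plain function makes it easy to review or log the exact document before running it against a real cluster.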

Summary

Replication in MongoDB is a process that synchronizes data across multiple servers, providing redundancy and increased data availability. This is achieved through replica sets, which are groups of MongoDB servers that maintain the same dataset. Within a replica set, one node acts as the primary node, receiving all write operations, while the secondary nodes replicate the primary's operations to their datasets. This structure provides a robust system for failover and recovery. If a primary node fails, an election among the secondaries determines a new primary, allowing for continuous client operations.

Sharding in MongoDB is a method for splitting and distributing data across multiple servers or shards. Each shard is an independent replica set, and collectively, the shards make up a single logical database—the sharded cluster. This approach is used to support deployments with very large datasets and high-throughput operations, effectively addressing scalability issues...
