
You're reading from Mastering DynamoDB

Product type Book

Published in Aug 2014

Publisher Packt

ISBN-13 9781783551958

Pages 236 pages

Edition 1st Edition

Languages

Concepts

Databases

Author (1):

Tanmay Deshpande

Table of Contents (18) Chapters

Mastering DynamoDB

Credits

Foreword

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Preface

Getting Started

Data Models

How DynamoDB Works

Best Practices

Advanced Topics

Integrating DynamoDB with Other AWS Components

DynamoDB – Use Cases

Useful Libraries and Tools

Developing Mobile Apps Using DynamoDB

Index

Chapter 3. How DynamoDB Works

In the previous chapter, we looked at DynamoDB's features and learned how to perform various operations using a variety of APIs. We also saw several application-oriented examples and which DynamoDB features suit which conditions. Now it's time to understand its internals. In this chapter, we are going to talk about why DynamoDB was developed, what architecture underneath makes it so robust and scalable, and how DynamoDB handles failures. So, let's get started.

In Chapter 1, Getting Started, we discussed DynamoDB's history; DynamoDB was built to address the scaling needs of Amazon's worldwide e-commerce platform, and also provide high availability, reliability, and performance. Amazon's platform is highly decoupled, consisting of thousands of services running on storage machines. Amazon needs reliable storage systems that can store and retrieve data even in conditions like disk failures, network failures, or even during natural...

Service-oriented architecture

Amazon's platform is fully service oriented, which means the various components in Amazon's ecosystem are exposed as services for other services to consume. Each service has to maintain its SLA in order to complete the response accurately and on time. An SLA (Service Level Agreement) is a commitment by a service to respond within a stated time for a stated number of requests per second. In Amazon's service-oriented architecture, it is very important for services to honor this agreement, as Amazon's request response engine builds the response dynamically by combining the results from many services. Even to answer a single request on Amazon's e-commerce website, hundreds of services come together to form the response.

In the following diagram, the request comes from the web via Amazon's e-commerce platform, which is then forwarded to various services to get the appropriate data from DynamoDB. Data transfer happens in DynamoDB from simple APIs like GET/PUT...

Design features

While designing DynamoDB's architecture, Amazon's engineers made several design decisions that were quite new at the time. Over time, these techniques have become so popular that they have inspired many of the NoSQL databases we use today. The following are a few such features:

Data replication
Conflict resolution
Scalability
Symmetry
Flexibility

Data replication

While deciding upon the data replication strategy, Amazon engineers put significant focus on achieving high availability and reliability. Traditional data replication techniques relied on synchronous replica updates: if the value of an attribute changed, every replica had to be updated at that moment, and until that was done, the attribute was unavailable. This technique was used to avoid serving wrong or stale data to the user. But it was not very efficient, as networks and disks are bound to fail and waiting for all...

Architecture

DynamoDB's architecture combines various well-known techniques with a few new ones, which together helped engineers build such a great system. To build a robust system like this, one has to consider various things, such as load balancing, partitioning, failure detection/prevention/recovery, replica management and synchronization, and so on. In this section, we are going to focus on all these things and learn about them in detail.

Load balancing

DynamoDB, being a distributed system, needs its data to be balanced across various nodes. It uses consistent hashing for distributing data across the nodes. Consistent hashing dynamically partitions the data over the network and keeps the system balanced.

Consistent hashing is a classic solution to a very complex problem. The challenge is to find a node in a distributed cluster to store and retrieve a value identified by a key, while at the same time being able to handle node failures. You would say this is quite easy, as you can simply number the...
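The idea of consistent hashing described above can be illustrated with a minimal Python sketch. This is not DynamoDB's actual implementation: the `HashRing` class, the `ring_position` helper, the node names, and the choice of MD5 as the hash function are all assumptions made for illustration. Each node and each key is hashed onto the same ring, and a key belongs to the first node found clockwise from the key's position.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with one position per node."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of all nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's ring position."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._owners[self._positions[idx % len(self._positions)]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("user-42"))
```

The property worth noticing is that when a node leaves the ring, only the keys that node owned move to a neighbor; every other key stays where it was. With a naive `hash(key) % number_of_nodes` scheme, removing one node would remap most keys.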

Functional components

So far, we have seen how DynamoDB's architecture provides so many features in terms of scalability, fault tolerance, availability, and so on. We also saw how ring membership is maintained and how it supports the properties DynamoDB aims for.

Each DynamoDB node consists of the following components:

Request coordinator
Membership and failure detection
Local persistent store (storage engine)

Request coordinator

The request coordinator is an event-driven messaging component that works like the Staged Event-Driven Architecture (SEDA). Here, a complex event is broken into multiple stages, which decouples event and thread scheduling from application logic. You can read more about SEDA at www.eecs.harvard.edu/~mdw/proj/seda/. This component is mainly responsible for handling any client requests coming its way.
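To make the staged, event-driven idea more concrete, here is a small Python sketch. It is only an illustration of the general pattern, not the coordinator's real code: the `Stage` class, the `run_pipeline` function, and the three stage names (parse, lookup, respond) are hypothetical. Each stage owns a queue and a worker thread, so the handler logic of a stage knows nothing about how events are scheduled.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


class Stage:
    """One stage: an inbox queue plus a worker thread running a handler."""

    def __init__(self, name, handler, downstream=None):
        self.name = name
        self.handler = handler        # application logic for this stage
        self.downstream = downstream  # next stage, or None for the last one
        self.inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is _STOP:
                # Pass the shutdown signal along, then exit this stage.
                if self.downstream is not None:
                    self.downstream.inbox.put(_STOP)
                break
            result = self.handler(event)
            if self.downstream is not None:
                self.downstream.inbox.put(result)


def run_pipeline(raw_requests):
    """Push requests through parse -> lookup -> respond and collect results."""
    results = []
    respond = Stage("respond", results.append)
    lookup = Stage("lookup", lambda key: (key, "value-for-" + key), respond)
    parse = Stage("parse", lambda raw: raw.strip().lower(), lookup)
    stages = (parse, lookup, respond)
    for stage in stages:
        stage.start()
    for raw in raw_requests:
        parse.inbox.put(raw)
    parse.inbox.put(_STOP)
    for stage in stages:
        stage.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([" Key-1 ", "KEY-2"]))
```

Because each stage has a single worker and a FIFO queue, events leave the pipeline in the order they entered. A fuller design would typically give each stage its own thread pool sized to its load.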

Suppose a coordinator receives a get request. It then asks for the data from the respective nodes where the key range lies. It waits until it gets the acceptable number of responses...
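The read path just described, where the coordinator asks the replicas and waits for an acceptable number of responses, can be sketched as follows. This is a simplified illustration under stated assumptions: `Reply`, `coordinate_get`, and the `required` parameter are hypothetical names, replicas are modeled as plain functions that are polled one after another instead of in parallel, and a single integer `version` stands in for whatever versioning metadata the real system uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Reply:
    value: str
    version: int  # assumed stand-in for real version metadata


def coordinate_get(
    key: str,
    replicas: List[Callable[[str], Optional[Reply]]],
    required: int,
) -> Reply:
    """Collect replies until `required` replicas have answered.

    Each replica is a callable that returns a Reply, returns None when it
    has no data for the key, or raises ConnectionError when unreachable.
    """
    replies = []
    for read in replicas:
        try:
            reply = read(key)
        except ConnectionError:
            continue  # skip unreachable replicas and try the next one
        if reply is not None:
            replies.append(reply)
        if len(replies) >= required:
            break
    if len(replies) < required:
        raise RuntimeError(
            "only %d of %d required replies received" % (len(replies), required)
        )
    # Simplification: return the reply with the highest version number.
    return max(replies, key=lambda r: r.version)
```

Choosing a smaller `required` value lets a read succeed while more replicas are down, at the cost of a higher chance of returning stale data.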

Summary

In this chapter, we have seen the design specifications of DynamoDB, various techniques like a quorum approach, gossip protocols, ring membership, and Merkle trees, and their implementation and benefits in developing such a brilliant system. We can see the efforts put in by the Amazon engineering team and their focus on each and every minute detail of architecture and its execution.

As I said earlier, DynamoDB was the real inspiration for many NoSQL databases, such as Riak and Cassandra. Now that you have understood the architectural details of DynamoDB, you can check out the architecture of these databases and see the similarities and differences.

This chapter was an effort to simplify the information given by Amazon in its white paper. As time passes, the architecture is likely to have changed in many ways, and to know more about it, we will have to wait to hear it from Amazon.

I am sure if you are a real technology fan, then after reading this chapter, you...