DynamoDB Applied Design Patterns

Product type: Book
Published: Sep 2014
ISBN-13: 9781783551897
Pages: 202
Edition: 1st Edition
Author: Uchit Hamendra Vyas

Chapter 5. Query and Scan Operations in DynamoDB

In the previous chapter, we learned how to create a secondary index for a table and saw its role in retrieving items efficiently. In the long run, knowledge of secondary indexes is useful only if we know how to use them for retrieval. Item retrieval in DynamoDB is done using two operations, query and scan. We also discussed sharding; in this chapter, we will learn about parallel scanning, which makes use of the sharding concept. The primary objective of any database (whether NoSQL or SQL) is to provide easy storage and fast retrieval of data. So far, we have discussed various configurations that can be applied to a table, such as adding an index and specifying the primary key. In this chapter, we will cover the following topics:

  • Querying table items

  • Scanning table items

  • Parallel scanning

First, we will discuss the query operation, which makes use of the hash and range key values to retrieve the items. Then we will...

Querying tables


One of the most efficient ways of retrieving data from a DynamoDB table is the query operation. A query must include a condition on the primary key: an equality test on the hash key and, optionally, a comparison on the range key. The query operation supports the following comparison operations:

  • EQ: This stands for equal to

  • LE: This stands for less than or equal to

  • LT: This stands for less than

  • GE: This stands for greater than or equal to

  • GT: This stands for greater than

  • BETWEEN: This retrieves items whose key value lies between the two specified values (both inclusive)

  • BEGINS_WITH: This retrieves items whose key value begins with the specified value

These seven comparison operations are evaluated directly against the primary key values (the hash key accepts only EQ; the others apply to the range key), so a query retrieves only the necessary items and never touches the partitions or items that don't hold the requested value. There are six more comparison operations that can be performed on the items...
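To make the key conditions above concrete, here is a minimal sketch that builds the condition map for a query. It assumes the legacy-style KeyConditions request shape (an AttributeValueList plus a ComparisonOperator per key attribute), which uses the same operator names as the list above. The helper build_key_conditions and the attribute names BookTitle and PublishYear are hypothetical and exist only for illustration.

```python
# Operators listed in the text, mapped to how many values each one takes.
RANGE_KEY_OPERATORS = {
    "EQ": 1, "LE": 1, "LT": 1, "GE": 1, "GT": 1,
    "BEGINS_WITH": 1, "BETWEEN": 2,
}


def build_key_conditions(hash_name, hash_value, range_name=None,
                         operator=None, range_values=()):
    """Build a KeyConditions-style mapping for a query (hypothetical helper).

    Values are expected to be wrapped as DynamoDB attribute values already,
    for example {"S": "abc"} or {"N": "42"}.
    """
    # The hash key always uses an equality test.
    conditions = {
        hash_name: {
            "AttributeValueList": [hash_value],
            "ComparisonOperator": "EQ",
        }
    }
    if range_name is not None:
        if operator not in RANGE_KEY_OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        expected = RANGE_KEY_OPERATORS[operator]
        if len(range_values) != expected:
            raise ValueError(f"{operator} takes {expected} value(s)")
        conditions[range_name] = {
            "AttributeValueList": list(range_values),
            "ComparisonOperator": operator,
        }
    return conditions


# Assumed example table: BookTitle is the hash key, PublishYear the range key.
key_conditions = build_key_conditions(
    "BookTitle", {"S": "DynamoDB Applied Design Patterns"},
    "PublishYear", "BETWEEN", [{"N": "2010"}, {"N": "2014"}],
)
print(key_conditions["PublishYear"]["ComparisonOperator"])  # BETWEEN
```

A mapping like this would typically be passed as the KeyConditions argument of a query call in an SDK; the sketch stops short of the network call so that it runs without credentials or a table.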

Scanning tables


A scan operation evaluates each and every item in the table. By default, it returns every item along with all of its attributes. This is why the scan operation is not preferred; it is always recommended that you use query whenever possible. However, we can retrieve only specific attributes using the AttributesToGet parameter, similar to the way we saw with query. Additionally, we can reduce the number of items returned by the scan using a scan filter condition. For instance, assume that there are 100 items in the table, that the scan filter filters out 10 of them, and that the scan uses strongly consistent reads (with each item consuming at most one capacity unit of 1 KB). Can you tell how many capacity units this scan operation consumes? If you think it consumes 100 capacity units, then you're right: the scan filter is applied only after the items have been read, and the capacity unit is not a measure of how many items (hoping that every item is less than...
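The arithmetic in this example can be sketched as a small in-memory simulation. It follows the simplification used in the text (every item read costs one capacity unit per 1 KB, rounded up per item); the accounting rules of the real service may differ. The function scan_with_filter and the item layout are hypothetical.

```python
import math


def scan_with_filter(items, keep, unit_size_kb=1.0):
    """Simulate a filtered scan (hypothetical model, not a real API call).

    Every item is read and charged for; the filter only decides which
    of the items already read are returned to the caller.
    """
    consumed_units = 0
    returned = []
    for item in items:
        # Capacity is charged when the item is read, before filtering.
        consumed_units += math.ceil(item["size_kb"] / unit_size_kb)
        if keep(item):
            returned.append(item)
    return returned, consumed_units


# 100 items, each smaller than 1 KB, as assumed in the text.
table = [{"id": i, "size_kb": 0.5} for i in range(100)]

# The filter drops 10 items (ids 0 to 9) and keeps the other 90.
returned, consumed = scan_with_filter(table, lambda item: item["id"] >= 10)
print(len(returned), consumed)  # 90 100
```

However many items the filter removes, the consumed capacity stays at 100 units, because the cost is driven by what the scan reads rather than by what it returns.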

Parallel scanning


As we discussed under DynamoDB sharding, the table data is partitioned based on the hash key value. Even though sharding smooths out read and write operations, it does not by itself let us scan the partitions in parallel. For example, if the table data is spread across five partitions (each with a throughput capacity of five units), a sequential scan cannot consume more than five capacity units at a time, even though the table as a whole provisions more. In other words, the throughput available to the scan cannot exceed that of the fastest (highest-throughput) partition. Based on these facts, we can infer the following:

  • A scan operation returns at most 1 MB of data per request

  • A scan operation can read data from only one partition at a time

  • For a large table, a sequential scan takes a long time, no matter how much throughput is provisioned

  • The scanning speed can never be faster than the fastest partition (the one with the highest throughput)

To put it simply, even if our television has one hundred channels, we will be able to...
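As a rough illustration of how a parallel scan gets around these limits, the following sketch splits the work into segments and reads them concurrently, one worker thread per segment. It relies on the Segment and TotalSegments request parameters and on the LastEvaluatedKey / ExclusiveStartKey pagination pair of the scan operation. Everything else is an assumption made for the demo: scan_page stands for any callable that behaves like a scan call, and make_fake_scan is a hypothetical in-memory stand-in so that the example runs without a real table.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(scan_page, segment, total_segments):
    """Read one segment to the end, following LastEvaluatedKey page by page."""
    items, start_key = [], None
    while True:
        request = {"Segment": segment, "TotalSegments": total_segments}
        if start_key is not None:
            request["ExclusiveStartKey"] = start_key
        page = scan_page(**request)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:  # no more pages in this segment
            return items


def parallel_scan(scan_page, total_segments):
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, scan_page, segment, total_segments)
            for segment in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]


def make_fake_scan(all_items, page_size):
    """In-memory stand-in for a scan call (hypothetical, demo only)."""
    def fake_scan(Segment, TotalSegments, ExclusiveStartKey=None):
        # Pretend each segment owns the items whose id maps to it.
        mine = [it for it in all_items if it["id"] % TotalSegments == Segment]
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["offset"]
        page = {"Items": mine[start:start + page_size]}
        if start + page_size < len(mine):
            page["LastEvaluatedKey"] = {"offset": start + page_size}
        return page
    return fake_scan


table = [{"id": i} for i in range(100)]
result = parallel_scan(make_fake_scan(table, page_size=7), total_segments=4)
print(len(result))  # 100
```

The small page_size plays the role of the 1 MB response limit: each worker keeps requesting pages until its segment is exhausted. With a real SDK client, scan_page would be the client's scan call with the table name already bound; that wiring is left out here.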

Summary


In this chapter, we learned to perform simple query and scan operations on a DynamoDB table and its secondary indexes. Finally, we also looked at parallel scanning, which suits growing and high-priority tables.

Web services and REST APIs are becoming more advanced with every passing day, mainly because they are platform independent. So, in the next chapter, we will learn the basics of REST and how to perform DynamoDB operations effectively using REST.
