DynamoDB Applied Design Patterns


Product type: Book
Published: Sep 2014
ISBN-13: 9781783551897
Pages: 202
Edition: 1st
Author: Uchit Hamendra Vyas

Chapter 4. Working with Secondary Indexes

In the previous chapter, we saw how to work with the DynamoDB SDK. We discussed table creation, item insertion, and table updates using the Java SDK. During table creation, we used two functions to create the local and global secondary indexes, which we will discuss now.

Note

Projection lets the programmer decide which attributes are copied into the secondary index.

Secondary indexes and projections should be learned together, because a secondary index cannot be used efficiently without specifying a projection. In this chapter, we will cover the following topics:

  • Global secondary indexes

  • Local secondary indexes

  • Projection

  • Item sharding

  • Best practices

The use of projection in DynamoDB is much like its use in traditional databases. Before learning about projection, go through Chapter 2, DynamoDB Interfaces, and Chapter 3, Tools and Libraries of AWS DynamoDB, which deal with the DynamoDB data model where we use some basics...

Secondary indexes


A quick question: why does a query that includes the primary key field (especially in the where condition) return results much faster than one that does not? Because most databases automatically create an index on the primary key field. DynamoDB does the same. This index is called the primary index of the table. The primary index cannot be customized, so it is seldom discussed.

To make retrieval faster, frequently retrieved attributes need to be part of an index. However, a DynamoDB table can have only one primary index, and that index can have at most two attributes (the hash and range keys). So, for faster retrieval, users need to be able to create their own indexes. An index created by the user is called a secondary index. Similar to the table key schema, the secondary index also...
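The difference between the primary index, a user-defined index, and an unindexed lookup can be sketched in plain Python. This is only an in-memory illustration, not DynamoDB's implementation: the sample items, the `Language` attribute, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample items; BookTitle/Edition play the hash/range key roles.
items = [
    {"BookTitle": "Title A", "Edition": 1, "Language": "English"},
    {"BookTitle": "Title A", "Edition": 2, "Language": "English"},
    {"BookTitle": "Title B", "Edition": 1, "Language": "German"},
]

def build_primary_index(rows):
    """Map (hash key, range key) -> item; each key pair is unique."""
    return {(row["BookTitle"], row["Edition"]): row for row in rows}

def build_secondary_index(rows, attribute):
    """Map a non-key attribute value -> items; values may repeat."""
    index = defaultdict(list)
    for row in rows:
        if attribute in row:
            index[row[attribute]].append(row)
    return index

def full_scan(rows, attribute, value):
    """Fallback without an index: inspect every item."""
    return [row for row in rows if row.get(attribute) == value]

primary = build_primary_index(items)
by_language = build_secondary_index(items, "Language")

# Key lookup and index lookup avoid touching unrelated items.
second_edition = primary[("Title A", 2)]
english_books = by_language["English"]
```

Both indexed lookups return the same items a full scan would, but without inspecting every item in the table.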

Projection


Once we understand the secondary index, we are all set to learn about projection. When creating a secondary index, we must specify the key attributes on which the index is built: a hash attribute and, for a local secondary index, a range attribute as well (the range attribute is optional for a global secondary index). If a query needs one or more attributes beyond these keys, and those attributes are not projected into the index, the index alone cannot answer it. For a local secondary index, DynamoDB fetches the missing attributes from the base table, which consumes additional read capacity and adds latency; a query on a global secondary index can return only the attributes projected into it.

The following is the table (with some data) that is used to store book information:

Here are a few more details about the table:

  • The BookTitle attribute is the hash key of the table and local secondary index

  • The Edition attribute is the range key of the table

  • The PubDate attribute is the range key of the index (let's call this index IDX_PubDate)
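The key layout listed above can be written down as a request in the shape used by the DynamoDB CreateTable API. This is a sketch under stated assumptions: the table name `Tbl_Book`, the attribute types, the `KEYS_ONLY` projection, and the throughput values are not given in the chapter and are chosen only for illustration. The helper `undefined_key_attributes` is hypothetical.

```python
# Request parameters in the shape used by the DynamoDB CreateTable API.
# Only the key roles come from the chapter; everything marked "assumed"
# is an illustrative choice.
create_table_params = {
    "TableName": "Tbl_Book",  # assumed name
    "AttributeDefinitions": [
        {"AttributeName": "BookTitle", "AttributeType": "S"},  # assumed type
        {"AttributeName": "Edition", "AttributeType": "N"},    # assumed type
        {"AttributeName": "PubDate", "AttributeType": "S"},    # assumed type
    ],
    "KeySchema": [
        {"AttributeName": "BookTitle", "KeyType": "HASH"},
        {"AttributeName": "Edition", "KeyType": "RANGE"},
    ],
    "LocalSecondaryIndexes": [
        {
            "IndexName": "IDX_PubDate",
            "KeySchema": [
                {"AttributeName": "BookTitle", "KeyType": "HASH"},
                {"AttributeName": "PubDate", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},  # assumed
        }
    ],
    "ProvisionedThroughput": {  # assumed values
        "ReadCapacityUnits": 5,
        "WriteCapacityUnits": 5,
    },
}

def undefined_key_attributes(params):
    """Return key attributes that lack an entry in AttributeDefinitions."""
    defined = {d["AttributeName"] for d in params["AttributeDefinitions"]}
    used = {k["AttributeName"] for k in params["KeySchema"]}
    for index in params.get("LocalSecondaryIndexes", []):
        used |= {k["AttributeName"] for k in index["KeySchema"]}
    return sorted(used - defined)
```

Note that the local secondary index repeats the table's hash key (`BookTitle`) and differs only in its range key.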

Local secondary index

When a secondary index is created, the hash and range keys of both the table and the index are inserted into the index; optionally...
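A small in-memory sketch can show which attributes end up in an index entry. It assumes the chapter's key layout (table keys `BookTitle` and `Edition`, index range key `PubDate`); the `Author` and `Language` attributes and both function names are hypothetical, and this is not how DynamoDB stores indexes internally.

```python
TABLE_KEYS = ("BookTitle", "Edition")
INDEX_KEYS = ("BookTitle", "PubDate")

def build_index(items, projected=()):
    """Copy table keys, index keys, and projected attributes into entries."""
    keep = set(TABLE_KEYS) | set(INDEX_KEYS) | set(projected)
    return [{k: v for k, v in item.items() if k in keep} for item in items]

def missing_from_index(requested, projected=()):
    """Return the requested attributes the index cannot serve by itself."""
    available = set(TABLE_KEYS) | set(INDEX_KEYS) | set(projected)
    return sorted(set(requested) - available)

# Hypothetical sample item.
books = [
    {"BookTitle": "Title A", "Edition": 1, "PubDate": "2014-01-01",
     "Author": "Author X", "Language": "English"},
]

keys_only = build_index(books)
with_author = build_index(books, projected=("Author",))
```

With no projection, a request for `Author` is reported as missing from the index; once `Author` is projected, the index entry carries it and nothing is missing.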

Item sharding


Sharding, also called horizontal partitioning, is a technique in which rows are distributed among database servers so that queries run faster. A hash operation is performed on each table row (usually on one of its columns), and based on the output, the rows are grouped and sent to the appropriate database server. Take a look at the following diagram:

As shown in the previous diagram, if all the table data (only four rows and one column are shown for illustration purposes) is stored on a single database server, read and write operations become slower. Moreover, a server that holds frequently accessed table data works harder than a server that holds rarely accessed data.
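The hash-based routing described above can be sketched as follows. This is a minimal illustration, assuming an MD5 hash taken modulo the number of servers; the sample rows, the `Place` column, and the function names are hypothetical and do not reflect how DynamoDB assigns partitions internally.

```python
import hashlib

def shard_for(key, num_servers):
    """Hash the key and map it to a server number in [0, num_servers)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers

def distribute(rows, column, num_servers):
    """Group rows by the shard that the chosen column's value hashes to."""
    shards = {server: [] for server in range(num_servers)}
    for row in rows:
        shards[shard_for(str(row[column]), num_servers)].append(row)
    return shards

# Hypothetical four-row, one-column table.
places = [{"Place": "Delhi"}, {"Place": "Paris"},
          {"Place": "Tokyo"}, {"Place": "Lima"}]

shards = distribute(places, "Place", num_servers=2)
```

Because the hash is deterministic, the same key always routes to the same server, so a lookup by that column only needs to contact one shard.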

The following diagram shows the advantage of sharding over a multitable, multiserver database environment:

In the previous diagram, two tables (Tbl_Places and Tbl_Sports) are shown on the left-hand side with four sample rows of...

Best practices with secondary indexes


There are four rules to follow when creating a secondary index so that our table functions without any hiccups. These rules are as follows:

  • Distributing the load by choosing the correct key

  • Making use of the sparse index

  • Using the global secondary index for quicker retrieval

  • Creating a read replica

Distributing the load by choosing the correct key

In a multipartition table, the data and the table's associated indexes are distributed across servers. How an index is distributed is determined by the value of one of its attributes: the hash key. Unlike a table's hash key value (combined with the range key), index key values can be duplicated, so a poorly chosen index hash key can concentrate the load on a few partitions. Local secondary indexes do not have this problem, because the index shares the table's hash key and is therefore distributed in the same way as the table. Therefore, this practice must...
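To see why the choice of index hash key matters, the following sketch counts how many index entries land on each partition for two candidate keys. Everything here is an assumption made for illustration: the `OrderId` and `Status` attributes, the eight partitions, and the MD5-modulo placement, which only stands in for DynamoDB's internal hashing.

```python
import hashlib
from collections import Counter

def partition_of(value, partitions):
    """Map a hash key value to a partition number (illustrative only)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions

def load_per_partition(items, hash_attr, partitions):
    """Count how many index entries each partition would hold."""
    return Counter(partition_of(str(item[hash_attr]), partitions)
                   for item in items)

# Hypothetical data: unique OrderId values, but only two Status values.
orders = [{"OrderId": f"order-{i}",
           "Status": "OPEN" if i % 2 == 0 else "CLOSED"}
          for i in range(1000)]

by_status = load_per_partition(orders, "Status", partitions=8)
by_order_id = load_per_partition(orders, "OrderId", partitions=8)
```

An index keyed on `Status` can occupy at most two partitions no matter how many exist, while one keyed on the high-cardinality `OrderId` spreads its entries far more evenly.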

Summary


In this chapter, we saw what local and global secondary indexes are. We walked through projection and its usage with indexes. We also saw the best practices to follow and studied the use cases of secondary indexes.

In the next chapter, we will see how the different types of scan and query operations work in DynamoDB, and how they are useful.
