
Learning Azure DocumentDB

By Riccardo Becker
About this book
Learning DocumentDB adopts a practical, step-by-step approach to help you learn the basics of DocumentDB and use your new-found abilities in real-life scenarios and enterprise solutions. We start with the absolute basics, such as setting up a DocumentDB environment, and guide you through managing your databases, and executing simple and complex queries. Next, we explain how to work with DocumentDB using the open REST protocol, and demonstrate how JavaScript works with DocumentDB. We’ll also show you how to authenticate and execute queries. Moving on, you’ll find out how to use DocumentDB from within Node.js to kick-start your Node.js projects. Next, you’ll discover how to increase the performance of your DocumentDB database and fine-tune it. Finally, you’ll get to grips with using DocumentDB in conjunction with other services offered from the Microsoft Azure platform.
Publication date:
November 2015
Publisher
Packt
Pages
152
ISBN
9781783552467

 

Chapter 1. Getting Started with DocumentDB

Until recently, the most common answer to the question "Where do I store my application information?" was in a relational database, obviously. The answer to this simple yet meaningful question is not so straightforward anymore.

NoSQL databases are becoming more and more popular and DocumentDB is one of them. In August 2014, Scott Guthrie officially announced the first preview version of DocumentDB. DocumentDB is a NoSQL database service offered by Microsoft. It is delivered as a managed service on Azure. This means that we no longer have to manage any infrastructure; we can just take it from the tap and pay per use. DocumentDB is a schema-free store, which means that we can store any kind of JSON document inside the store and work with the data as we used to in traditional SQL databases.

In this chapter, we will do the following:

  • Learn what DocumentDB is all about

  • Look at the data model

  • Make a comparison with other non-SQL technologies

  • Learn about the pricing model

  • Build a console application that connects to a database

This book is aimed at architects, developers, database administrators, and IT professionals who want to learn and understand the breadth of DocumentDB.

 

What is DocumentDB?


The short answer to this question is that DocumentDB is a managed JSON document database service. But what is the impact on our programming paradigms? How can we use it? Why should we use it? Can it really make our life easier? The answers to these kinds of questions are a bit more involved and need additional clarification.

This section describes the fundamentals of DocumentDB and can help you decide whether or not it will be a good fit for your solution.

Microsoft built DocumentDB from the ground up because the feedback they got from customers was that they "…need a database that can keep pace with their rapidly evolving applications…." Schema-free databases are increasingly popular, but running these on our premises can be expensive and difficult to scale. Combining this with the need for rich querying and transactions still being available, Microsoft decided to build DocumentDB.

This brings us to the longer version of our answer, which is that DocumentDB is "…a massively scalable, schema-free database with rich query and transaction processing using the most ubiquitous programming language, JavaScript, data model (JSON), and transport protocol (HTTP)…" (http://blogs.msdn.com/b/documentdb/archive/2014/08/22/introducing-azure-documentdb-microsoft-s-fully-managed-nosql-document-database-service.aspx).

The characteristics of a schema

As stated before, NoSQL databases are gaining popularity and are slowly replacing traditional relational databases. The main characteristics of a NoSQL database are listed next:

  • Schema-less, with the ability to store everything

  • Non-relational

  • Extremely scalable

Note

Besides document databases such as DocumentDB, there are other kinds of NoSQL databases available, such as graph and key-value databases. We will look at a comparison later in this chapter.

Having no schema (or predefined structure like tables and columns) allows us to store everything. This also includes attachments, user-defined functions, stored procedures, triggers, and more. The only restriction is that the information has to be in valid JSON.

Having JavaScript at the core

The SQL language that can be used to query and manipulate DocumentDB is based on JavaScript. Having JavaScript at the core means that we do not need to learn new techniques or languages, and our current knowledge of JavaScript can be applied immediately. Using JavaScript is a natural way of working with JSON. JSON parsers are perfectly capable of converting query results into variables, manipulating them, and writing them back to the database. Besides working as a client with JavaScript, the internals are also based on JavaScript. The following entities are written in JavaScript as well:

  • Stored procedures (SPs): These are executed by issuing an HTTP POST request. Inside the SP, the elements of the designated document(s) are copied to ordinary JavaScript variables. The logic inside the SP then manipulates the data and when the SP finishes, the values are persisted in the document(s) again.

  • User-defined functions (UDFs): The difference between UDFs and SPs is that UDFs do not manipulate databases or documents themselves. A UDF encapsulates logic or business rules that can be called from SPs or queries and can help extend the query language. A good example of a UDF is a function called calculateAge() that takes the date of birth of a person and returns their age as a value. The calculateAge() function can be used from a query returning only those persons that are older than 40 years. Inside a query, a UDF is referenced with the udf. prefix. The query is as follows:

    SELECT * FROM people p WHERE udf.calculateAge(p.dob) > 40

  • Triggers: A trigger is a piece of JavaScript code (comparable to UDFs and SPs) that is invoked only when some event happens inside your database. A document being created or deleted could result in a trigger being executed. Triggers can be executed before or after the actual event happens. When a trigger fails or raises an exception, the actual operation is aborted and the transaction is rolled back instead of committed. This is useful when we need to validate the incoming data to keep our documents consistent.

We will provide extensive examples of SPs, user-defined functions and triggers later in this book.
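
As a rough illustration of the calculateAge() UDF described above, the function body might look like the following sketch. The "DD-MM-YYYY" date format and the optional now parameter (added only so that the function can be tried with a fixed date) are assumptions made for this example and are not prescribed by DocumentDB.

```javascript
// Hypothetical UDF body: returns a person's age in whole years.
// Assumes dob is a string in "DD-MM-YYYY" form, for example "01-01-1960".
function calculateAge(dob, now) {
  var parts = dob.split("-");
  var day = parseInt(parts[0], 10);
  var month = parseInt(parts[1], 10);
  var year = parseInt(parts[2], 10);
  var today = now || new Date();
  var age = today.getFullYear() - year;
  // Subtract one year if this year's birthday has not happened yet.
  var beforeBirthday =
    today.getMonth() + 1 < month ||
    (today.getMonth() + 1 === month && today.getDate() < day);
  return beforeBirthday ? age - 1 : age;
}
```

Because a UDF only computes a value from its input, it can be exercised as plain JavaScript before it is registered in a collection.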

Indexing a document

In traditional relational databases, the DBA or developer needs to choose the (clustered) indexes. Choosing the right indexing strategy is vital for the performance and consistency of the database.

In DocumentDB, we do not need to choose the index ourselves. In fact, all information inside a document is indexed by default. This means that we can query on any attribute that is available inside the document. We can choose different indexing policies, but for most applications the default indexing policy offers the best trade-off between performance and storage efficiency. We can reduce storage space by excluding certain paths within the document from indexing.

The indexing process inside DocumentDB treats the documents as trees. There needs to be a top node that is the entry point for all the fields inside the document. Imagine a document containing information about a person in the following JSON representation:

{
  "firstname": "John",
  "lastname": "Doe",
  "dob": "01-01-1960",
  "hobbies":
  [
    { "type":"sports", "description":"soccer"},
    { "type":"reading", "preferences":
      [
        { "type":"scifi"},
        { "type":"thriller"}
      ]
    }
  ]
}

This JSON snippet describes a person, John Doe, who was born on January 1, 1960, and has two hobbies, sports and reading. His reading hobby focuses on the sci-fi and thriller genres.

Tip

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

A JSON document can be depicted like this:

The blue squares are nodes that are implicitly added by the system and do not influence our data model. The figure shows that documents are internally represented as trees. As you can see, the nodes that describe a hobby do not necessarily have to share the same schema. Go ahead and try to build this model in a traditional relational database system!
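
To make the tree view of a document more concrete, the following sketch walks the John Doe document and lists the path to every leaf value. It only illustrates the idea of treating a document as a tree; the listPaths() helper and the slash-separated path notation are assumptions for this example and do not represent the internal index format of DocumentDB.

```javascript
// The person document from the text.
var person = {
  firstname: "John",
  lastname: "Doe",
  dob: "01-01-1960",
  hobbies: [
    { type: "sports", description: "soccer" },
    { type: "reading", preferences: [{ type: "scifi" }, { type: "thriller" }] }
  ]
};

// Hypothetical helper: returns one path per leaf value in the tree.
// Array elements appear in the path by their position (0, 1, ...).
function listPaths(node, prefix) {
  prefix = prefix || "";
  if (node === null || typeof node !== "object") {
    return [prefix]; // a leaf: string, number, boolean, or null
  }
  var paths = [];
  Object.keys(node).forEach(function (key) {
    paths = paths.concat(listPaths(node[key], prefix + "/" + key));
  });
  return paths;
}

var paths = listPaths(person);
console.log(paths.join("\n"));
```

Note that the two hobby nodes produce different paths (description for the first, preferences for the second), which is exactly the kind of variation a fixed relational schema struggles with.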

DocumentDB as a service

Microsoft offers DocumentDB as part of their online offerings on the Microsoft Azure platform. Their as-a-service approach enables developers to start using new technologies immediately.

Understanding performance

The performance of our DocumentDB system is influenced by a performance level. Performance levels are set on a collection and not on a database. This enables fine-tuning of your environment, giving the appropriate performance boost to the right resources. The performance level determines the number of so-called request units available per second. A request unit is a measure of the resources (CPU, memory) needed to perform a certain operation.

There are three performance levels:

  • S1: Allows up to 250 request units per second

  • S2: Allows up to 1,000 request units per second

  • S3: Allows up to 2,500 request units per second

We need to choose the performance level carefully, since it comes with a price impact. We will discuss the pricing of DocumentDB later in this chapter.
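
As a small sketch of how the three levels could be compared against an expected workload, the following helper returns the lowest level whose request unit allowance covers a given peak. The pickPerformanceLevel() name and the idea of estimating a peak figure up front are assumptions for illustration; the numbers are the ones listed above.

```javascript
// Request units per second allowed by each performance level.
var PERFORMANCE_LEVELS = [
  { name: "S1", requestUnitsPerSecond: 250 },
  { name: "S2", requestUnitsPerSecond: 1000 },
  { name: "S3", requestUnitsPerSecond: 2500 }
];

// Hypothetical helper: returns the lowest level that covers the expected
// peak, or null when even S3 is not enough for a single collection.
function pickPerformanceLevel(peakRequestUnitsPerSecond) {
  for (var i = 0; i < PERFORMANCE_LEVELS.length; i++) {
    if (peakRequestUnitsPerSecond <= PERFORMANCE_LEVELS[i].requestUnitsPerSecond) {
      return PERFORMANCE_LEVELS[i].name;
    }
  }
  return null;
}
```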

Handling transactions

DocumentDB also supports transactions providing Atomicity, Consistency, Isolation, Durability (ACID) guarantees. Atomicity enables all operations to be executed as a single piece of work, all being committed at once or not at all. Consistency implies that all data is in the right state across transactions. Isolation makes sure that transactions do not interfere with each other, and durability ensures that all changes that are committed to the database will always be available.

Since JavaScript is executing under snapshot isolation belonging to the collection, SPs and triggers are executed within the same scope, enabling ACID for all operations inside SPs and triggers. If an error occurs in the JavaScript logic, the transaction is automatically rolled back.
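
The rollback behavior can be sketched with a small stored procedure that creates two documents and throws when the second one is invalid. The procedure uses the server-side calls getContext(), getCollection(), getSelfLink(), createDocument(), and getResponse().setBody(). Everything below the marker comment is a hypothetical in-memory stand-in, written only so that the sketch can run outside DocumentDB; it mimics the commit and rollback outcome and is not how the service is implemented. The createDevicePair() name and the document shapes are also assumptions.

```javascript
// Stored procedure body: creates two documents in one transaction.
// Throwing an error aborts the procedure, so neither document is kept.
function createDevicePair(first, second) {
  var collection = getContext().getCollection();
  var link = collection.getSelfLink();
  collection.createDocument(link, first, function (err) {
    if (err) throw err;
    if (!second || !second.id) {
      throw new Error("second document needs an id");
    }
    collection.createDocument(link, second, function (err2) {
      if (err2) throw err2;
      getContext().getResponse().setBody("created 2 documents");
    });
  });
}

// --- Hypothetical stand-in below, only for running the sketch locally ---
var committed = [];      // documents visible after a successful run
var staged = null;       // working copy used while a procedure runs
var responseBody = null; // last body set by a procedure

function getContext() {
  return {
    getCollection: function () {
      return {
        getSelfLink: function () { return "dbs/demo/colls/devices"; },
        createDocument: function (link, doc, callback) {
          staged.push(doc);
          callback(null, doc);
          return true;
        }
      };
    },
    getResponse: function () {
      return { setBody: function (body) { responseBody = body; } };
    }
  };
}

// Runs a procedure against a copy of the data and commits only on success.
function runAsTransaction(proc, args) {
  staged = committed.slice();
  try {
    proc.apply(null, args);
    committed = staged; // commit
    return true;
  } catch (e) {
    return false;       // rollback: committed stays untouched
  }
}
```

Calling runAsTransaction(createDevicePair, [{ id: "a" }, { id: "b" }]) keeps both documents, while a second document without an id leaves the stored data exactly as it was before the call.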

Common use cases

Now that we have seen a little of DocumentDB, how can we decide whether DocumentDB is applicable for our own problem scenario? In which scenarios is it a good fit and are there any trade-offs?

Building the Internet of Things

A good example of a problem domain in which DocumentDB fits is the domain of the Internet of Things (IoT). The IoT is all about ingesting, egressing, processing, and storing data (visit https://en.wikipedia.org/wiki/Internet_of_Things). It involves data flowing to and from devices, backend services processing that data or controlling devices, storage services persisting that data, or running statistical analysis or analytics on that data. Because DocumentDB can connect to HDInsight (http://azure.microsoft.com/en-us/services/hdinsight/) and Hadoop, the data can be analyzed easily.

Another good area in the IoT domain is device registration. Each and every device in the field is described inside a single document and stored in DocumentDB. These documents contain information for the device to be able to play the game of IoT, having keys and endpoints to communicate with and enable ingress and egress dataflows.

Throughout this book, we will also take the IoT domain as our main example domain. Examples and code snippets will focus on this area because it illustrates the possibilities of DocumentDB well.

Storing user profile information

Storing user profile information inside DocumentDB can be really helpful when it comes to personalized user interfaces or other preferences that can influence an application's behavior or user interface settings.

Note

JavaScript can easily interpret JSON data and is therefore an excellent candidate for describing the markup of a personalized user interface. Extending this thought, the schema-free approach of DocumentDB also makes it an excellent candidate for a CMS system.

Every user is reflected in a single document that describes all user preferences. The list of preferences can be easily extended by adding information to the document. Consider that users authenticate at an authentication service, for example, Azure Active Directory, Facebook, or Twitter, and that these services return a claim set, including a unique identifier called nameidentifier. This field is an excellent candidate for providing the unique entry point in our DocumentDB system and retrieving the user's profile information after logging in.
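
A minimal sketch of this idea follows, assuming the claim set arrives as a plain object keyed by claim type. The profileFromClaims() helper and the shape of the preferences object are hypothetical; the claim type URI is the standard one for nameidentifier.

```javascript
// Claim type that carries the unique identifier of the authenticated user.
var NAME_IDENTIFIER =
  "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier";

// Hypothetical helper: builds the profile document for a user, using the
// nameidentifier claim as the document id so it can be looked up after login.
function profileFromClaims(claims, preferences) {
  var nameIdentifier = claims[NAME_IDENTIFIER];
  if (!nameIdentifier) {
    throw new Error("claim set has no nameidentifier");
  }
  return {
    id: nameIdentifier,
    preferences: preferences || {}
  };
}
```

New preferences can later be added to the preferences object without touching documents that were stored earlier.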

Logging information

A well-designed system usually emits logging information in large quantities, and that information comes in different types. An individual log entry is straightforward and describes a specific event, for example, a user logging in to the system, an exception raised by the system, or an audit trail record that needs to be persisted.

Because DocumentDB automatically indexes all documents, querying data and finding fault causes can be very quick. You can take DocumentDB information offline and store it in a datacenter for further analysis with tools like Hadoop or Power BI.

Building mobile solutions

Building and releasing mobile solutions is tough because we might have millions of customers. Using a schema-free database, it is easier to release new apps with additional data while still being able to service your old versions as well. Remember the troubles we had releasing a new schema of our SQL Server or Oracle database? Adding new tables and columns because of new features, and writing conversion scripts for every new release of the system?

By using a JSON document, we can easily add or remove information, release at a faster pace, and enable development in sprints—changing the data each sprint without the pain of conversion scripts.
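
One way to picture this is an app whose second release adds a field to its documents. Older documents simply lack the field, and the reading code supplies a default instead of requiring a conversion script. The field names theme and language, the default values, and the readPreferences() helper are all hypothetical.

```javascript
// Hypothetical reader: version 1 of the app stored only a theme,
// version 2 added a language. Both document versions stay readable.
function readPreferences(doc) {
  return {
    theme: doc.theme || "light",
    language: doc.language || "en"
  };
}

var oldDocument = { id: "1", theme: "dark" };                 // written by version 1
var newDocument = { id: "2", theme: "dark", language: "nl" }; // written by version 2
console.log(readPreferences(oldDocument), readPreferences(newDocument));
```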

Of course, the powerful scaling of DocumentDB is also a great help when building global, mobile apps servicing millions of users!

 

Exploring the data model


The internals of DocumentDB can themselves be described as JSON documents. The following figure displays a hierarchy of DocumentDB and its entities. This is called the DocumentDB resource model and all the entities are called resources.

DocumentDB account

The main entry point is obviously a DocumentDB account. You need to have an account before you can start working with it. An account can contain databases and, as part of a preview feature, an amount of blob storage for attachments.

All resources are accessible through a logical URI. For example, the database with the name persons can be addressed through the logical URI /dbs/persons, and the document describing the person John Doe, which has an ID of 12345, can be addressed by the logical URI /dbs/persons/colls/<collectionid>/docs/12345.
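
The addressing scheme can be sketched with two small helpers that assemble such logical URIs. This assumes that the path segments dbs, colls, and docs are used for databases, collections, and documents respectively; the helper names and the collection ID people are hypothetical.

```javascript
// Hypothetical helpers that build logical resource URIs.
function databaseUri(databaseId) {
  return "/dbs/" + encodeURIComponent(databaseId);
}

function documentUri(databaseId, collectionId, documentId) {
  return databaseUri(databaseId) +
    "/colls/" + encodeURIComponent(collectionId) +
    "/docs/" + encodeURIComponent(documentId);
}

console.log(databaseUri("persons"));
console.log(documentUri("persons", "people", "12345"));
```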

Creating databases

A database is a container where documents are stored inside collections. A database also serves as a container for users. Each user has a set of permissions, and the permissions to access collections, UDFs, triggers, and SPs are set at the database level. We can grant access to users in a fine-grained manner for accessing collections and documents.

A database scales elastically and does not need any intervention from the account owner. It can scale from megabytes to petabytes. The data is stored on SSD storage, providing fast access to your documents.

Databases implicitly span different underlying machines in order to provide the level of scaling we need.

Administering users

A user stored in DocumentDB is an abstraction of the concept of a user. A user is not a person logging in, but a set of permissions. Eventually, a DocumentDB user can be mapped to an Active Directory user or some third-party identity management system.

A simple, straightforward way of designing the user model is to create exactly one user per tenant or customer. That user only has permission to access collections and documents inside one database. This is the database belonging to the designated tenant and/or customer. This is a flat user model, but it is also possible to create user identities corresponding to actual users representing certain personas. This can give you more fine-grained control but will also increase the burden of user administration.

Users can be managed through the Azure portal (portal.azure.com) and also via the rich REST API or client SDKs that are provided by Microsoft.

Setting permissions

Implicitly, DocumentDB contains two types of roles: an administrator and a user. The administrator is the one that has the permission to manage and manipulate database accounts, databases, users, and permissions. These are considered the administrative resources, analogous to the metadata describing the full DocumentDB ecosystem. The administrator is provided with a master key. The master key is part of the DocumentDB account and is provided to the one setting up the account.

A user, on the other hand, is the person who manipulates actual data inside collections and documents, or changes UDFs (application resources). A user gets a resource key that provides access to specific application resources like databases and collections.

Managing collections

A collection can be described as a container for all the documents, but it is also a unit of scale. Adding a collection results in some SSD storage being allocated for that particular collection. As we saw before, the performance level is set at the collection level. Inside your database, you can have multiple collections, each with its own performance level (S1, S2, or S3). For example, we could have a UserProfile collection containing documents with specific profile information like addresses, images, UI preferences, and so on. This collection is queried once a user logs in to your system and the profile information is retrieved from the UserProfile collection. Imagine we have another collection containing all the products that can be ordered. This collection will, hopefully, be accessed more frequently, and therefore we can set an S3 level on this collection to provide the best performance for our potential buyers.

Collections grow and shrink implicitly when documents are added or removed. There is no need to allocate space or do other preconfiguration steps.

 

DocumentDB versus other databases


This section compares DocumentDB with other (non-)SQL technologies. The comparison is made with MongoDB and Azure Table storage.

Azure Table storage

Table storage is a NoSQL, tabular storage mechanism that enables you to store rows and columns inside a table. The structure of a table is not fixed, meaning that different rows can have different columns. Azure Table storage is a good fit for storing large amounts of data, but it is non-relational: there are no mechanisms like foreign keys, triggers, or user-defined functions.

MongoDB

MongoDB is also a document database (NoSQL), which means that it is schema-free, enables high performance and high availability, and has the ability to scale. MongoDB is open source, and is built around documents and collections. Documents are composed of sets of key-value pairs, and collections contain documents. In contrast to DocumentDB, MongoDB uses BSON instead of JSON.

Comparison chart

The following table provides a high-level comparison on some key features:

Feature | DocumentDB | MongoDB | Table storage
Model | Document | Document | Rows and columns
Database schema | Schema-free | Schema-free | Schema-free
Triggers | Yes | No | No
Server side scripts | Yes, JavaScript | Yes, JavaScript | No
Foreign keys | N/A* | N/A | N/A
Indexing | Potentially on every property | Potentially on every property | Partition key and row key only
Transactions | Yes, supports ACID | No | Limited, using batching
Hosting | On Microsoft Azure only, offered as a service | Can be on-premise or on a virtual machine, not offered as a service | On Microsoft Azure, offered as a service

* DocumentDB does not offer referential integrity by design. There is no concept of foreign keys. Integrity can be enforced by using triggers and SPs.

The role of the Database Administrator is still needed to manage DocumentDB. We still need someone to oversee our databases and collections. Some common tasks a DBA for a document database might perform are as follows:

  • Creating and managing databases

  • Creating and managing collections

  • Taking responsibility for scaling, partitioning, and sharding

  • Defining and maintaining SPs, user-defined functions, and triggers

  • Managing users and permissions

  • Measuring performance

 

Understanding the price model


This section provides a brief overview of how your bill is influenced by the way you use DocumentDB. There are a few factors that determine the pricing:

  • Having a DocumentDB account

  • Number of collections inside a database

  • Performance level

  • Capacity units

Account charges

When you set up a DocumentDB account, you will be billed immediately. An empty account with no databases and hence no collections will be charged for a single S1 collection, at around $25 per month. The reason that you are charged while not having any collections or documents is that Microsoft needs to reserve a DNS and authorization scope for the account.

Number of collections

Collections are billed by the hour. Having a collection for only 10 minutes will still incur charges for a whole hour. Every performance level includes 10 GB of storage.

The following table defines the standard characteristics per performance level:

Performance level | SSD storage | Request units | Price per hour
S1 | 10 GB | 250 per second | $0.034
S2 | 10 GB | 1,000 per second | $0.067
S3 | 10 GB | 2,500 per second | $0.134

Request units are calculated based on the amount of resources that are needed for the operation performed. When more CPU, IO, and memory are needed for a certain operation, more request units are charged. The number of request units consumed by each operation is returned in the response header x-ms-request-charge. By reading this value, you can keep track of the usage. If you exceed the number of request units per second that the performance level allows, additional operations will be throttled.

To have fine-grained control over the performance of your collections, you could keep track of the consumed Request Units (RUs) yourself and check whether you often exceed the maximum number of RUs. If so, upgrading the performance level for your collection might be a good idea.
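
Such housekeeping could look roughly like the following sketch. It assumes that the application records, for every operation, a timestamp in milliseconds together with the response headers (with lowercase header names); the sample structure and both helper names are hypothetical. Only the header name x-ms-request-charge comes from the service.

```javascript
// Reads the charge of one operation from its response headers.
function readRequestCharge(headers) {
  var value = parseFloat(headers["x-ms-request-charge"]);
  return isNaN(value) ? 0 : value;
}

// samples: [{ time: <milliseconds>, headers: { ... } }, ...]
// Returns the seconds (time divided by 1000, rounded down) in which the
// summed charges exceeded the given limit, in ascending order.
function secondsOverLimit(samples, limitPerSecond) {
  var totals = {};
  samples.forEach(function (sample) {
    var second = Math.floor(sample.time / 1000);
    totals[second] = (totals[second] || 0) + readRequestCharge(sample.headers);
  });
  return Object.keys(totals)
    .map(Number)
    .filter(function (second) { return totals[second] > limitPerSecond; })
    .sort(function (a, b) { return a - b; });
}
```

If the returned list is regularly non-empty for an S1 limit of 250, that is a hint that the collection would benefit from a higher performance level.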

Note

It is possible to upgrade and downgrade a collection from S1 to S2 or S3, but the charges are for the highest tier. Switching from S1 to S3 within 1 hour will therefore be billed at $0.134.
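
The billing rule can be worked through with a short sketch that charges every hour at the highest level the collection had during that hour. The prices are the ones from the table above; the collectionCost() helper and its input format are assumptions for illustration. For a 31-day month (744 hours) at S1, the result is 744 × 0.034 = 25.296, which matches the figure of around $25 per month mentioned earlier.

```javascript
// Price per hour for each performance level, in US dollars.
var PRICE_PER_HOUR = { S1: 0.034, S2: 0.067, S3: 0.134 };

// hours: one entry per billed hour, each entry listing the levels the
// collection had during that hour, for example [["S1"], ["S1", "S3"]].
// Every hour is charged at the most expensive level it contained.
function collectionCost(hours) {
  return hours.reduce(function (total, levelsInHour) {
    var highest = Math.max.apply(null, levelsInHour.map(function (level) {
      return PRICE_PER_HOUR[level];
    }));
    return total + highest;
  }, 0);
}

// One hour in which the collection was switched from S1 to S3.
console.log(collectionCost([["S1", "S3"]]));
```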

Request Units

The number of RUs needed for an operation depends on the following factors:

  • Size of the document: Larger documents increase the consumption of RUs.

  • Number of properties: More properties increase the consumption of RUs. When you use data consistency (we will dive into this concept later on), more RUs will be consumed.

  • Indexes: When more properties are indexed, more RUs are needed. It is good practice to investigate the actual indexes you need for your scenario. Also, documents are indexed by default; turning this feature off will save RUs. SPs and triggers also consume RUs based on the metrics mentioned previously.

Understanding storage

By default, a collection is provisioned with 10 GB of storage. Documents consume storage space, but indexes also fill up the space of a collection. If you need more storage space, you need to create an additional collection.
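
A back-of-the-envelope estimate of the number of collections needed follows directly from the 10 GB figure. In the sketch below, the collectionsNeeded() helper and the separate estimate for index size are assumptions; how much space the index actually takes depends on the documents and the indexing policy.

```javascript
// Storage provisioned per collection, in GB.
var STORAGE_PER_COLLECTION_GB = 10;

// Hypothetical helper: documents and index both count toward the 10 GB.
// At least one collection is always needed to store anything at all.
function collectionsNeeded(documentsGb, indexGb) {
  var totalGb = documentsGb + (indexGb || 0);
  return Math.max(1, Math.ceil(totalGb / STORAGE_PER_COLLECTION_GB));
}

// 18 GB of documents plus an estimated 4 GB of index data.
console.log(collectionsNeeded(18, 4));
```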

Expanding resources

Microsoft offers the ability to contact Azure support from the Azure blade portal (portal.azure.com). If you need specific adjustments that you cannot manage from the portal or that are not supported by default, contact Azure support.

Click on the New Support button and follow the wizard that shows up. Make sure you choose the Quotas request type and enter your request details. The following table shows the limits that can be stretched by Azure support:

Database accounts | 5
Number of SPs, triggers and UDFs per collection | 25 each
Maximum collections per database account | 100
Maximum document storage per database (100 collections) | 1 TB
Maximum number of UDFs per query | 1
Maximum number of JOINs per query | 2
Maximum number of AND clauses per query | 5
Maximum number of OR clauses per query | 5
Maximum number of values per IN expression | 100
Maximum number of collections created per minute | 5

 

Building your first application


This section provides a step-by-step approach to building a console application using Visual Studio 2015 that utilizes the basics of DocumentDB. We will perform the following steps:

  1. Create a DocumentDB account.

  2. Create a database.

  3. Create a collection.

  4. Build a console application that connects to DocumentDB and saves a document.

Provisioning an account

To create a DocumentDB account, you need to go to the Microsoft Azure portal. If you don't have a Microsoft Azure account yet, you can get a trial version at https://azure.microsoft.com/en-us/pricing/free-trial/.

After logging in to the Azure portal, go ahead and create your first DocumentDB account. For now, you only need to come up with a name.

After clicking on the Create button, your DocumentDB account will be provisioned. This process might take some time to finish.

After provisioning, your account is ready for use. Select the account you have just created and you will get an overview.

On the overview blade of your account, you will see a lot of information. For now, the most important information is located in the settings blade on the right-hand side. From this blade, you can retrieve keys and a connection string. We need this information if we want to start building the console application. Select the DocumentDB option, copy the URI, and copy the primary key.

Creating a database

In order to be able to create collections, we need a database first. Creating a database is straightforward as it only needs a name as input. Click the Add Database button and enter a meaningful name. After selecting OK, your database is provisioned. On the left blade you can scroll down and locate your new database.

Creating a collection

As we have seen before, a collection is created inside a database. Selecting your database gives you the ability to add a collection. When the Add Collection option is selected, you need to pick the right performance level (or tier). For this demo, the S1 tier is more than sufficient.

Now that we have our DocumentDB account, a database, and a collection, we can start building our first application.

Note

Creating databases and collections can also be done through the REST API or the designated SDKs.
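As an illustration of the SDK route, the following sketch looks up a database and a collection by ID and creates them when they do not exist yet. It assumes the Microsoft Azure DocumentDB Client Library for .NET that is installed later in this chapter; the class and method names are hypothetical, and the S1 offer type mirrors the tier chosen in the portal.

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class ProvisioningSample
{
    // Returns the collection with the given ID, creating the database
    // and the collection first if they are missing.
    public static async Task<DocumentCollection> EnsureCollectionAsync(
        DocumentClient client, string databaseId, string collectionId)
    {
        // Look for an existing database with this ID.
        Database database = client.CreateDatabaseQuery()
            .Where(d => d.Id == databaseId)
            .AsEnumerable()
            .FirstOrDefault();
        if (database == null)
        {
            database = (await client.CreateDatabaseAsync(
                new Database { Id = databaseId })).Resource;
        }

        // Look for an existing collection inside that database.
        DocumentCollection collection = client
            .CreateDocumentCollectionQuery(database.SelfLink)
            .Where(c => c.Id == collectionId)
            .AsEnumerable()
            .FirstOrDefault();
        if (collection == null)
        {
            // Request the S1 performance level for the new collection.
            collection = (await client.CreateDocumentCollectionAsync(
                database.SelfLink,
                new DocumentCollection { Id = collectionId },
                new RequestOptions { OfferType = "S1" })).Resource;
        }
        return collection;
    }
}
```

Keep in mind that every collection created this way is billed at its performance level, just like one created in the portal.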

Building a console application

This sample is built using Visual Studio 2015. If you do not have Visual Studio 2015, you can download the free Visual Studio 2015 Community edition from https://www.visualstudio.com/en-us/products/visual-studio-express-vs.aspx.

Setting up a solution

Here are the steps for creating a Visual Studio solution containing a console application that will demonstrate the basic usage of DocumentDB:

  1. Start Visual Studio.

  2. Go to File | New Project and then click on the Console Application template.

  3. Name your project MyFirstDocDbApp.

    Visual Studio now creates a console application.

  4. In order to work with DocumentDB, we need to pull in a NuGet package. Right-click on your project file and select Manage NuGet Packages. Search for the Microsoft Azure DocumentDB Client Library.

  5. Select the right package in the search results and click Install. Your project is now ready to use the DocumentDB Client Library.

Saving a document

Now that we have set up a solution, created a project, and enabled the right .NET library to manage DocumentDB, we are going to write some C# code.

Note

Keep in mind that although the code samples in this book are mostly in C#, you can also use the programming language of your choice. There are SDKs available for multiple platforms (Java, Python, Node.js, and JavaScript). If yours is not supported, you could always use the REST API.

  • Add the following using statements to the top of the Program.cs file:

    using Microsoft.Azure.Documents;
    using Microsoft.Azure.Documents.Client;
    using Microsoft.Azure.Documents.Linq;
    using Newtonsoft.Json;
  • We need the URI and primary key that we retrieved earlier, in the Provisioning an account section.

After writing a few lines of C# code, we have the code snippet ready. It performs the following tasks:

  • Makes a connection to the DocumentDB account

  • Finds the database that was created in the portal

  • Finds the collection named testdevicehub inside that database

  • Saves a document to this collection

The code is as follows:

// Placeholders: replace these with the URI and primary key of your own account
private const string docDBUri = "https://<your-account>.documents.azure.com:443/";
private const string key = "<your-primary-key>";

private static async Task CreateDocument()
  {
    // Connect to DocumentDB using the URI and key from the Azure portal
    DocumentClient client = new DocumentClient(new Uri(docDBUri), key);
    // Query for the right database inside the DocumentDB account
    Database database = client.CreateDatabaseQuery()
        .Where(d => d.Id == "devicehub").AsEnumerable().FirstOrDefault();
    if (database == null)
      throw new InvalidOperationException("Database 'devicehub' was not found.");
    // Find the collection to which we want to add the document
    DocumentCollection collection = client.CreateDocumentCollectionQuery(database.SelfLink)
        .Where(cl => cl.Id == "testdevicehub").AsEnumerable().FirstOrDefault();
    if (collection == null)
      throw new InvalidOperationException("Collection 'testdevicehub' was not found.");
    // Create a simple document in the collection by providing the DocumentsLink
    // and the object to be serialized and stored
    await client.CreateDocumentAsync(
      collection.DocumentsLink,
      new PersonInformation
      {
        FirstName = "Riccardo",
        LastName = "Becker",
        DateOfBirth = new DateTime(1974, 12, 21)
      }
    );
  }
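The snippet relies on a PersonInformation class that is not listed. A minimal sketch, assuming it is a plain class with just the three properties used above, could look as follows; the actual class in the sample project may contain more members.

```csharp
using System;

// Hypothetical shape of the class serialized by CreateDocument().
// Only the three properties used in the snippet are assumed here.
public class PersonInformation
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime DateOfBirth { get; set; }
}
```

Because a console application's Main method is not asynchronous here, one way to run the snippet is to call CreateDocument().Wait() from Main, which blocks until the document has been saved.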

Replace the values of docDBUri and key with the URI and primary key of your own account and run the console application. You have just created your first document.

Now, go to the Azure portal again and open the query documents screen. You need to select the designated collection to enable this option. Running the base query returns the document that we have just created:

select * from c
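The same base query can also be run from code rather than from the portal. The sketch below assumes the DocumentDB Client Library and a collection.DocumentsLink obtained as in the CreateDocument method; the class and method names are hypothetical.

```csharp
using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class QuerySample
{
    // Runs the base query against a collection and prints the ID
    // of every document that comes back.
    public static void PrintDocumentIds(DocumentClient client, string documentsLink)
    {
        var query = client.CreateDocumentQuery<Document>(
            documentsLink, "SELECT * FROM c");

        // Enumerating the query sends it to the service.
        foreach (Document doc in query)
        {
            Console.WriteLine(doc.Id);
        }
    }
}
```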

As the query result shows, the stored document contains more than just the fields from the PersonInformation class. Here is a brief explanation of these additional fields:

  • id: This is the unique identifier for the document. In the application we have just created, the ID is automatically generated and is represented by a GUID.

  • _rid: This resource ID is an internally used property.

  • _ts: This is a property generated by the system, and it contains the timestamp of the last update to the document.

  • _self: This is generated by the system, and it contains a unique URI pointing to this resource document.

  • _etag: This is a system-generated property containing an ETag that can be used for optimistic concurrency scenarios (if somebody updates the same document in the meantime, the ETag will differ and your update will fail).

  • _attachments: This is generated by the system, and it contains the path for the attachments resource belonging to this document.
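The optimistic concurrency scenario mentioned for _etag can be sketched as follows. The example reads a document, changes a property, and asks the service to replace the document only if its ETag is unchanged. It assumes the DocumentDB Client Library; the class name, method name, and the choice of the LastName property are hypothetical.

```csharp
using System.Linq;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class EtagSample
{
    // Returns true if the update succeeded, false if no document was found
    // or somebody else changed the document after we read it.
    public static async Task<bool> TryUpdateLastNameAsync(
        DocumentClient client, string documentsLink, string newLastName)
    {
        // Read the first document in the collection, including its ETag.
        Document doc = client.CreateDocumentQuery<Document>(documentsLink)
            .AsEnumerable()
            .FirstOrDefault();
        if (doc == null)
        {
            return false;
        }

        doc.SetPropertyValue("LastName", newLastName);

        // Only replace the document if the ETag on the server still
        // matches the one we read.
        var options = new RequestOptions
        {
            AccessCondition = new AccessCondition
            {
                Type = AccessConditionType.IfMatch,
                Condition = doc.ETag
            }
        };

        try
        {
            await client.ReplaceDocumentAsync(doc, options);
            return true;
        }
        catch (DocumentClientException ex)
            when (ex.StatusCode == HttpStatusCode.PreconditionFailed)
        {
            // The ETag differed: the document was updated in the meantime.
            return false;
        }
    }
}
```

When the method returns false because of a conflict, a typical reaction is to read the document again and retry the change on the fresh copy.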

 

Summary


In this chapter, we covered the basics of DocumentDB. We saw that we can store any JSON document and that there is no predefined schema we need to comply with. The Azure portal offers some interesting blades for us to create, configure, and manage DocumentDB resources and offers some quick-starts to help you get started immediately. The internals of DocumentDB were discussed and we got a nice insight into the data model.

We also saw some common use cases for which DocumentDB is well suited. A small section was dedicated to explaining the pricing model and how your bill is affected by the actions you perform.

Finally, we started to do a bit of coding and wrote a small C# console application that connects to the database and creates a document. We could see in the Azure portal that the document was actually stored, together with some other interesting metadata.

In the next chapter, we will discuss how to manage and monitor your DocumentDB resources.

About the Author
  • Riccardo Becker

    Riccardo Becker works full time as a principal IT architect for CGI in the Netherlands. He holds several certifications and his background in computing goes way back to 1998, when he started working with good old Visual Basic 5.0 (or was it 6.0?). Ever since, he has fulfilled several roles, such as developer, lead developer, architect, project leader, and practice manager. Recently, he decided to accept the role of a principal IT architect where he focuses on innovation, cutting-edge technology, and specifically on Microsoft Azure, the Internet of Things, and cloud computing in general. In 2007, he joined the Microsoft LEAP program where he got a peek at the move Microsoft was about to make on their road to the cloud. Pat Helland gave him that insight, and since the first release of Microsoft Azure on PDC 2008, he has focused on it, keeping track of the progress and the maturity of the platform. In the past few years, he has also done a lot of work on incubation with his employer, raising awareness of cloud computing in general and Microsoft Azure in particular.
