Microsoft Azure Blob Storage

by Nathan A. Duchene Richard J. Dudley | December 2010 | Enterprise Articles Microsoft

This article, by Richard J. Dudley & Nathan A. Duchene, authors of Microsoft Azure: Enterprise Application Development, is about the Blob Storage service and how to interact with blobs using either a .NET client library or REST services.

In movie mythology, blobs are ever-growing creatures that consume everything in their path. In Azure, blobs can grow nearly as large, but they are considerably better behaved. A blob, or binary large object, is an Azure storage mechanism with both streaming and random read/write capabilities. Blob Storage is accessed via a .NET client library or a rich REST API, and libraries for a number of languages, including Ruby and PHP, are available. With the addition of the Windows Azure Content Delivery Network, blobs have become a very functional and powerful storage option.

Blobs in the Azure ecosystem

Blobs are one of the three simple storage options for Windows Azure, and are designed to store large files in binary format. There are two types of blobs—block blobs and page blobs. Block blobs are designed for streaming, and each blob can have a size of up to 200 GB. Page blobs are designed for random read/write access, and each can store up to 1 TB. If we're going to store images or video for use in our application, we'd store them in blobs. On our local systems, we would probably store these files in different folders. In our Azure account, we place blobs into containers, and just as a local hard drive can contain any number of folders, each Azure account can have any number of containers.

Similar to folders on a hard drive, access to blobs is set at the container level, where permissions can be either "public read" or "private". In addition to permission settings, each container can have 8 KB of metadata used to describe or categorize it (metadata are stored as name/value pairs). Each blob can be up to 1 TB depending on the type of blob, and can also have up to 8 KB of metadata. For data protection and scalability, each blob is replicated at least three times, and "hot blobs" are served from multiple servers. Even though the cloud can accept blobs of up to 1 TB in size, Development Storage can accept blobs only up to 2 GB. This typically is not an issue for development, but still something to remember when developing locally.

Page blobs form the basis for Windows Azure Drive—a service that allows Azure storage to be mounted as a local NTFS drive on the Azure instance, allowing existing applications to run in the cloud and take advantage of Azure-based storage while requiring fewer changes to adapt to the Azure environment. Azure drives are individual virtual hard drives (VHDs) that can range in size from 16 MB to 1 TB. Each Windows Azure instance can mount up to 16 Azure drives, and these drives can be mounted or dismounted dynamically. Also, Windows Azure Drive can be mounted as readable/writable from a single instance of an Azure service, or it can be mounted as a read-only drive for multiple instances. At the time of writing, there was no driver that allowed direct access to the page blobs forming Azure drives, but the page blobs can be downloaded, used locally, and uploaded again using the standard blob API.

Creating Blob Storage

Blob Storage can be used independently of other Azure services, and even if we've set up a Windows Azure or SQL Azure account, Blob Storage is not automatically created for us. To create a Blob Storage service, we need to follow these steps:

  1. Log in to the Windows Azure Developer portal and select our project.
  2. After we select our project, we should see the project page, as shown in the next screenshots:
  3. Clicking the New Service link on the application page takes us to the service creation page, as shown next:
  4. Selecting Storage Account allows us to choose a name and description for our storage service. This information is used to identify our services in menus and listings.
  5. Next, we choose a unique name for our storage account. This name must be unique across all of Azure—it can include only lowercase letters and numbers, and must be at least three characters long.
  6. If our account name is available, we then choose how to localize our data. Localization is handled by "affinity groups", which tie our storage service to the data centers in different geographic regions. For some applications, it may not matter where we locate our data. For other applications, we may want multiple affinity groups to provide timely content delivery. And for a few applications, regulatory requirements may mean we have to bind our data to a particular region.
  7. Clicking the Create button creates our storage service, and when complete, a summary page is shown. The top half of the summary page reiterates the description of our service and provides the endpoints and 256-bit access keys. These access keys are very important—they are the authentication keys we need to pass in our request if we want to access private storage or add/update a blob.
  8. The bottom portion of the confirmation page reiterates the affinity group the storage service belongs to. We can also enable a content delivery network and custom domain for our Blob Storage account.
  9. Once we create a service, it's shown on the portal menu and in the project summary once we select a project.
  10. That's it! We now have our storage services created.

We're now ready to look at blobs in a little more depth.
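Once the storage account exists, its service endpoints follow a predictable pattern derived from the account name chosen in step 5. As a quick language-neutral sketch (Python here; `jupitermotors` is the sample account name used later in this article, and the container/blob names are made up for illustration):

```python
def blob_endpoint(account_name):
    """Return the public Blob service endpoint for a storage account.

    Azure derives the endpoint from the account name, so no lookup
    is needed -- only the name chosen at creation time.
    """
    return "http://%s.blob.core.windows.net" % account_name

def blob_url(account_name, container, blob_name):
    """Build the full URL for a blob inside a container."""
    return "%s/%s/%s" % (blob_endpoint(account_name), container, blob_name)

print(blob_url("jupitermotors", "images", "logo.png"))
# http://jupitermotors.blob.core.windows.net/images/logo.png
```

This predictability is also why the account name must be globally unique: it doubles as the hostname of the service endpoint.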


Windows Azure Content Delivery Network

Delivering content worldwide can be a challenge. As more and more people gain access to the Internet, more and more people (hopefully) will be visiting our site. The ability to deliver content to our visitors is limited by the resources we've used for our application. One way of handling bottlenecks is to move commonly used files (such as CSS or JavaScript libraries) or large media files (such as photos, music, and videos) to another network with much greater bandwidth, and with multiple locations around the world. These networks are known as Content Delivery Networks (CDNs), and when properly utilized to deliver content from a node geographically closer to the requester, they can greatly speed up the delivery of our content.

The Windows Azure Content Delivery Network is a service that locates our publicly available blobs in data centers around the world, and automatically routes our users' requests to the geographically closest data center. The CDN can be enabled for any storage account, as we saw in the service setup.

To access blobs via the CDN, a different URL is used than for standard access. The standard endpoint for our sample application is http://jupitermotors.blob.core.windows.net. When we set up CDN access, our service is assigned a GUID, and CDN access is through a generated URL, which will be assigned in our Windows Azure Developer portal when we enable the feature. A custom domain can also be used with the CDN.

Blobs are cached at the CDN endpoints for a specified amount of time (default, 72 hours). The time-to-live (TTL) is specified as the HTTP Cache-Control header. If a blob is not found at the geographically closest data center, the blob is retrieved from the main Blob Storage and cached at that data center for the specified TTL.
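Since the TTL is just the max-age value of the standard HTTP Cache-Control header, expressed in seconds, it is easy to see what the default 72-hour TTL looks like on the wire. A small illustrative sketch:

```python
def cache_control_header(ttl_hours):
    """Render a TTL (in hours) as the Cache-Control header a CDN
    edge node would respect. max-age is expressed in seconds."""
    return "Cache-Control: max-age=%d" % (ttl_hours * 3600)

print(cache_control_header(72))
# Cache-Control: max-age=259200
```

Setting this header on a blob is how we override the default TTL for content that changes more (or less) frequently.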

Blob Storage Data Model

The Blob Storage Data Model is a simple model consisting of four different components: a storage account, containers, blobs, and blocks or pages. A container is a way to organize blobs. Think of a container as a "folder" that can hold many "files". These "files" are blobs. A blob consists of one or more blocks or pages of data. In the following diagram, we can see a visual representation of a container, blobs, and blocks. Our storage account can hold an unlimited number of containers, and each container can hold an unlimited number of blobs. Each blob, as mentioned above, can be up to 200 GB (block blob) or up to 1 TB (page blob). Each block in a block blob can be up to 4 MB in size, which implies that a 200 GB block blob will contain a tremendous number of blocks.
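The arithmetic behind that "tremendous number" is simple enough to check: with 4 MB blocks, a maximal 200 GB block blob works out to 51,200 blocks.

```python
GB_IN_MB = 1024                       # megabytes per gigabyte
MAX_BLOCK_BLOB_MB = 200 * GB_IN_MB    # 200 GB block blob limit
MAX_BLOCK_MB = 4                      # 4 MB maximum per block

blocks = MAX_BLOCK_BLOB_MB // MAX_BLOCK_MB
print(blocks)  # 51200
```

In practice most blobs are far smaller, but the block structure is what makes resumable, parallel uploads of large files possible.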

Blob Storage

There are two mechanisms for accessing Blob storage—the REST-ful Blob Storage API and a .NET client library called the StorageClient Library. Documentation for the REST-ful API can be found at http://msdn.microsoft.com/en-us/library/dd135733.aspx, whereas the StorageClient Library documentation can be found at http://msdn.microsoft.com/en-us/library/ee741723.aspx.

Representational State Transfer

What is REST? REST stands for Representational State Transfer, and even if the term is not familiar, the concepts probably are. REST architecture forms the basis for the World Wide Web. In REST, a client sends a document to a server, called a request, and the server replies with another document, called a response. Both the request and the response documents are "representations" of either the current or intended "state". The state in this context is the sum total of the information on the server. A request to list all the posts in a forum receives a response describing the current state of the forum posts. A request containing a reply to one of those posts represents the intended state, as it changes the forum's information. Systems built on these concepts and utilizing a set of HTTP verbs are described as REST-ful. For more information on REST, a good starting point is http://en.wikipedia.org/wiki/Representational_State_Transfer.

The Blob Storage API

Now that we have an idea of what REST is, we can understand why it's important that the Blob Storage API is built on a RESTful interface. The Blob Storage API uses the HTTP/REST operations PUT, GET, DELETE, and HEAD. These operations perform the following functions:

  • PUT: This command will insert a new object, or if the object already exists, it will overwrite the old object with the new one.
  • GET: This command will retrieve an object.
  • DELETE: This command will delete an object.
  • HEAD: This command will retrieve properties and metadata.

Using the Blob Storage API, we can work with containers, blobs, and blocks via HTTP/REST operations. As we examine the API in the coming sections, we will notice that many of the operator/request URI combinations are similar. The magic happens with the request headers and request bodies.
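To make that similarity concrete, here are a few representative verb/URI pairs, sketched as plain data in Python. The account, container, and blob names are placeholders, and this is only a sampler, not the full API surface (the MSDN reference linked above is authoritative):

```python
ENDPOINT = "http://jupitermotors.blob.core.windows.net"

# (HTTP verb, request URI) pairs for a few common operations.
# Note how similar the URIs are -- query parameters, headers,
# and bodies distinguish one operation from another.
operations = {
    "list containers":  ("GET",    ENDPOINT + "/?comp=list"),
    "create container": ("PUT",    ENDPOINT + "/photos?restype=container"),
    "get blob":         ("GET",    ENDPOINT + "/photos/car.jpg"),
    "delete blob":      ("DELETE", ENDPOINT + "/photos/car.jpg"),
    "blob properties":  ("HEAD",   ENDPOINT + "/photos/car.jpg"),
}

for name, (verb, uri) in operations.items():
    print("%-16s %-6s %s" % (name, verb, uri))
```

Notice that "get blob" and "delete blob" share the same URI; only the verb changes, which is exactly the REST style described in the previous section.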

Working with containers using the REST interface

We are able to perform a number of actions on containers using the Blob Storage API, which allows us to:

  • List all containers for our storage account
  • Create new containers
  • Retrieve all container properties and metadata
  • Set metadata on containers
  • Get and set the access control list (ACL) for a container
  • Delete a container (and its contents)
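Any of these operations against private storage must carry an Authorization header built from one of the 256-bit access keys shown on the account summary page. The sketch below shows the core of the Shared Key scheme, heavily simplified: the real string-to-sign includes several more canonicalized header and resource fields (see the MSDN authentication documentation), and both the key and string-to-sign here are made up for illustration.

```python
import base64
import hashlib
import hmac

def sign_request(account, key_base64, string_to_sign):
    """Produce a SharedKey Authorization header value.

    The storage access key is base64-encoded; the signature is
    Base64(HMAC-SHA256(key, string_to_sign)).
    """
    key = base64.b64decode(key_base64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return "SharedKey %s:%s" % (account, signature)

# Illustrative only: a fake key and an abbreviated string-to-sign
fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
header = sign_request(
    "jupitermotors", fake_key,
    "GET\n\nx-ms-date:Fri, 03 Dec 2010 21:00:00 GMT\n/jupitermotors/?comp=list")
print(header)
```

This is why the access keys must be guarded carefully: anyone holding a key can compute valid signatures for any operation against the account.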

Working with containers using the StorageClient library

The CloudBlobClient class (http://msdn.microsoft.com/en-us/library/ee758637.aspx) is the class used to interact with blob containers. The CloudBlobContainer class (http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.cloudblobcontainer_members.aspx) acts on a single container.


Working with blobs

Working with blobs using the REST interface is as easy as working with containers. The same PUT/GET/DELETE/HEAD operators are used with slightly different request URIs. In the client library, the CloudBlob class (http://msdn.microsoft.com/en-us/library/ee773197.aspx) is used to interact with individual blobs. Another useful class for working with blobs is the BlobRequest class (http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.protocol.blobrequest.aspx). The BlobRequest class has many methods similar to those of the CloudBlob class, and also includes the methods for working with blocks in block blobs.
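Those block-level methods exist because a large block blob is uploaded in pieces: the content is split into blocks of up to 4 MB, each block is PUT with a base64-encoded block ID, and a final commit assembles the named blocks in order. A rough, library-neutral sketch of the client-side chunking step (the block size is shrunk here purely for illustration):

```python
import base64

def split_into_blocks(data, block_size):
    """Split a byte string into (block_id, chunk) pairs.

    Block IDs must be base64-encoded and, within a single blob,
    all of the same length -- hence the zero-padded counter.
    """
    blocks = []
    for i in range(0, len(data), block_size):
        block_id = base64.b64encode(b"%06d" % (i // block_size)).decode("ascii")
        blocks.append((block_id, data[i:i + block_size]))
    return blocks

# A real client would use block_size = 4 * 1024 * 1024 (4 MB)
blocks = split_into_blocks(b"some file content to upload", block_size=10)
print(len(blocks))  # 3
```

Because each block is addressable by ID, a client can retry a single failed block, upload blocks in parallel, and only commit the list once everything has arrived.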

Summary

Blob Storage is an amazing storage mechanism in Windows Azure. Between the scalability factors, authorization security settings, and the Blob Storage API for easy access, this truly is a long-term solution for anyone wishing to utilize Windows Azure for any application or service. In this article, we gained an overview of the two types of blobs, created a storage service for our project, and examined the API and client library used to interact with containers and blobs.

About the Authors:


Nathan A. Duchene

Nathan Duchene is an Application Developer for Armada Supply Chain Solutions, a logistics company based in Pittsburgh, PA. He has worked with a list of Microsoft technologies, including ASP.NET, Silverlight, SharePoint, BizTalk, and SQL Server. Nathan also enjoys playing numerous sports, volunteering in the community, and spending time with his family and closest friends in his spare time.

Richard J. Dudley

Richard Dudley is a Senior Application Developer for Armada Supply Chain Solutions, a logistics company based in Pittsburgh, PA. Richard has been developing with Microsoft technologies since 1998, and today works with a variety of technologies, including SQL Server, ASP.NET, SharePoint, and BizTalk. With his wife Kathy, Richard is also co-owner of The Bloomery Florist in Butler, PA.
