You're reading from RavenDB 2.x Beginner's Guide

Product type: Book

Published in: Sep 2013

Publisher: Packt

ISBN-13: 9781783283798

Edition: 1st Edition

Tools

Visual Studio RavenDB

Concepts

Database Programming

Author (1)

Khaled Tannir

Chapter 4. RavenDB Indexes and Queries

Whenever you use a database, you need queries with search criteria to retrieve your data. This chapter moves us on to querying data in RavenDB.

In this chapter, we will learn how RavenDB indexes work and why we need them. Then, we will cover the different types of indexes and the problem that RavenDB indexes aim to solve.

You will learn about Map/Reduce and how RavenDB indexes implement this paradigm and use it to retrieve data from the server. With a step-by-step approach, we will create some indexes and learn how to query them.

In this chapter, we will cover:

RavenDB Map/Reduce implementation
RavenDB dynamic indexes
RavenDB static indexes
RavenDB stale indexes

The RavenDB indexes

All storage systems use indexes to find data quickly when a query is processed. In a database system, an index is a data structure that improves data retrieval operations. Therefore, creating a proper index can drastically increase the performance of an application.

An index in a relational database is very similar to an index in the back of a book. When a database server has no index to use for searching, the result is similar to the reader who looks at every page in a book to find a word. The database engine needs to visit every row in the table. In relational database terminology, we call this behavior a table scan, which becomes slower and more expensive as a table grows to thousands or millions of rows.
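The book analogy can be pictured with a few lines of plain Python. This is only a sketch: the rows, the `table_scan` and `index_lookup` helpers, and the dictionary used as an index are all assumptions made for illustration, not how any particular database engine stores its data.

```python
# Illustrative rows; names and ids are made up for this sketch.
rows = [
    {"id": 1, "name": "France"},
    {"id": 2, "name": "Canada"},
    {"id": 3, "name": "Lebanon"},
]


def table_scan(rows, name):
    """Visit rows one by one until the name matches.

    Returns the matching row (or None) and how many rows were visited.
    """
    visited = 0
    for row in rows:
        visited += 1
        if row["name"] == name:
            return row, visited
    return None, visited


# Build the index once: name -> position of the row in the table.
name_index = {row["name"]: pos for pos, row in enumerate(rows)}


def index_lookup(rows, index, name):
    """Jump straight to the row through the index, without scanning."""
    pos = index.get(name)
    return rows[pos] if pos is not None else None


row, visited = table_scan(rows, "Lebanon")
print(row["id"], visited)  # 3 3
print(index_lookup(rows, name_index, "Lebanon")["id"])  # 3
```

The scan has to visit all three rows to reach the last one, while the lookup cost does not grow with the number of rows. The price is that the index must be built and kept up to date.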

RavenDB indexes are used to retrieve data from the server, but they do not work the same way as relational database indexes. The main difference is that relational database indexes are schema-based, whereas RavenDB is a schema-less, document-oriented database, which means...

RavenDB Map/Reduce implementation

Map/Reduce is a programming model and an associated implementation for processing and generating large datasets. RavenDB indexes are Map/Reduce implementations and allow you to perform aggregations over multiple documents. Indexes use a Map function to specify what to retrieve from the server and optionally use Reduce and Transform functions to specify which results will be returned to the client.

The developer specifies one or more Map functions that process a collection of documents to generate a set of intermediate key-value pairs. The intermediate key-value pairs produced by the Map functions are buffered in memory.

The Reduce function is optional: an index may have zero or one Reduce function. The Reduce function reads all intermediate key-value pairs generated by the Map function(s) and aggregates the values associated with the same intermediate key. After successful completion, the output of the Reduce function execution is available to the caller...
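The Map and Reduce steps described above can be sketched in plain Python. This is an in-memory illustration of the programming model only, not RavenDB's implementation or client API; the documents, the field values, and the `map_city`, `reduce_population`, and `run_map_reduce` names are assumptions introduced for the example.

```python
from collections import defaultdict

# Illustrative documents; field values are made up for this sketch.
cities = [
    {"Name": "City 1", "CountryCode": "AA", "Population": 100},
    {"Name": "City 2", "CountryCode": "AA", "Population": 200},
    {"Name": "City 3", "CountryCode": "BB", "Population": 50},
]


def map_city(doc):
    """Map: emit one intermediate (key, value) pair per document."""
    yield doc["CountryCode"], doc["Population"]


def reduce_population(key, values):
    """Reduce: aggregate all values that share the same key."""
    return key, sum(values)


def run_map_reduce(docs, map_fn, reduce_fn=None):
    # Buffer the intermediate pairs in memory, grouped by key.
    intermediate = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    # The Reduce step is optional: without it, return the grouped pairs.
    if reduce_fn is None:
        return dict(intermediate)
    return dict(reduce_fn(key, values) for key, values in intermediate.items())


print(run_map_reduce(cities, map_city, reduce_population))
# {'AA': 300, 'BB': 50}
```

Calling `run_map_reduce(cities, map_city)` without a Reduce function returns the grouped intermediate values, which mirrors an index that has a Map function only.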

RavenDB dynamic indexes

When you query RavenDB, the query optimizer first searches for an existing index that matches the query. If no matching index is found, RavenDB automatically creates a temporary index for the query. RavenDB optimizes itself based on the actual requests coming in: when a query is performed often, it can decide to promote the temporary index to a permanent one.

Note

Dynamic indexes are Map/Reduce indexes without a Reduce function. They are just mapping functions, which tell RavenDB how to traverse the documents in order to answer queries.

Querying dynamic indexes

Dynamic indexes are created automatically, on the fly, by RavenDB. When you query the server and no existing index matches the query, RavenDB creates a new temporary index and uses it to query the data on the server.

Time for action – querying a dynamic index

We will query the World database and retrieve all Countries for which the Area field is greater than or equal to 1000000. Before creating this query, you will import the Countries.csv file into the World database. After that, you will see how RavenDB performs this task by looking at the RavenDB logs in the prompt window.

In Management Studio, import the Countries.csv file into the World database.
Create a new Visual Studio project, name it RavenDB_Ch04.
Add a new class, name it City and complete it as follows:
Add a new class named Country and make it look as follows:
Add the RavenDB DocumentStore initialization to the Main() method using the following code snippet:
Note
The World database was created in Chapter 2, RavenDB Management Studio.
Add this query to the Main() method in the Program class using the following code snippet:
Save all the files and build the solution.
Open the RavenDB installation folder and launch RavenDB server...

Time for action – querying a temporary index

To illustrate how RavenDB reuses existing temporary indexes, we will go back to the Query<Country>() call we created in the previous section. We will also create another query using the same parameter, which will use the same temporary index. Then you will analyze the RavenDB logs in the RavenDB prompt window and check the Indexes screen in Management Studio.

Open RavenDB Management Studio, select the Indexes tab of the World database and ensure that Temp/Countries/ByArea_RangeSortByArea index exists. If not, follow all the steps of the previous section to create the temporary index.
Open the RavenDB_Ch04 solution.
Add the following code snippet to the Main() method in the Program class:
Save all the files, build and run the solution.
Open the RavenDB prompt window and analyze the RavenDB logs.
In Management Studio, select the World database and click on the Indexes tab to display the Indexes screen and verify that there are no new temporary indexes...

Time for action – managing temporary indexes

RavenDB allows you to manage temporary indexes through options in the server configuration file. RavenDB also optimizes itself by deleting temporary indexes that have not been used for a given time, and by promoting them to permanent indexes if they have been used enough.

The following steps summarize the temporary index management process in RavenDB:

RavenDB looks for an appropriate index to use for the query.
If one is found, it returns the most appropriate index.
If none is found, it creates an index that can answer the query.
It returns that index as a temporary index.
If that index is used enough, it promotes it to an Auto index.
Note
Temporary index behavior is controlled by these configuration settings: Raven/TempIndexPromotionThreshold and Raven/TempIndexPromotionMinimumQueryCount.
By default, a temporary index has to be queried 100 times before it becomes a permanent index. You can change these settings by changing the value of the Raven/TempIndexPromotionMinimumQueryCount...
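The lookup, create, and promote steps listed above can be sketched as a small in-memory catalog. Everything here is hypothetical: the `IndexCatalog` class, its `index_for` method, and the simplified `Temp/...` and `Auto/...` names are assumptions for illustration and do not reproduce RavenDB's optimizer. The sketch counts queries only; it ignores the Raven/TempIndexPromotionThreshold setting and the cleanup of unused temporary indexes.

```python
class IndexCatalog:
    """Hypothetical catalog that mimics the temporary index life cycle."""

    def __init__(self, promotion_min_query_count=100):
        # 100 mirrors the default query count mentioned in the note above.
        self.promotion_min_query_count = promotion_min_query_count
        self.indexes = {}  # (collection, fields) -> index entry

    def index_for(self, collection, fields):
        key = (collection, tuple(sorted(fields)))
        entry = self.indexes.get(key)
        if entry is None:
            # No matching index: create a temporary one for this query.
            name = "Temp/%s/By%s" % (collection, "And".join(key[1]))
            entry = {"name": name, "kind": "temp", "hits": 0}
            self.indexes[key] = entry
        entry["hits"] += 1
        # Promote the temporary index once it has been used enough.
        if (entry["kind"] == "temp"
                and entry["hits"] >= self.promotion_min_query_count):
            entry["kind"] = "auto"
            entry["name"] = entry["name"].replace("Temp/", "Auto/", 1)
        return entry


catalog = IndexCatalog(promotion_min_query_count=3)
for _ in range(3):
    entry = catalog.index_for("Countries", ["Area"])
print(entry["name"], entry["kind"], len(catalog.indexes))
# Auto/Countries/ByArea auto 1
```

The second and third queries reuse the index created by the first one, so the catalog never holds more than one index for the same collection and field.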

RavenDB static indexes

RavenDB allows users to create and use indexes manually. Explicitly created indexes are called static indexes or named indexes. A static index can use one or more Map functions, and may include a Reduce function and/or a Transform function. These functions specify what to retrieve from the server and are defined using regular LINQ expressions.

Static indexes are more efficient than dynamic indexes. Because a dynamic index is created on the fly, as a temporary index, the first time a query needs it, the first run of that query may be slow. Also, static indexes expose more functionality such as custom sorting, boosting, Full Text Search, Live Projections, spatial search support, and more.

So far, we have created queries that retrieve data from the RavenDB server using LINQ expressions. LINQ can be used the same way to sort or aggregate data and to query specific fields in a document. When using indexes to aggregate data, there...

Time for action – defining a Map function for an index

We will create a new static index and add it to the World database (created in Chapter 2, RavenDB Management Studio). You will create this index using the PutIndex() method. Then you will analyze the RavenDB logs and open the index in Management Studio to view it and execute it.

Open the RavenDB_Ch04 solution in Visual Studio.
Add the following code to the Main() method to create the static index:
Add the following code to the Main() method to query the Cities/CountryCode index:
Save all the files, build and run RavenDB_Ch04.
Select the RavenDB prompt window in Windows Explorer and analyze the RavenDB logs to understand how RavenDB created the index.
In Management Studio, select the World database, click on the Indexes tab, open the Cities/CountryCode index in edit mode, and look at the Map function code.
In Management Studio, execute the Cities/CountryCode index and observe the result.
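As a rough picture of what a Map-only index such as Cities/CountryCode holds, the sketch below builds one in plain Python. It is not the PutIndex() call from the steps above, whose listing is not reproduced here; the documents, the document ids, and the `map_country_code` and `build_index` names are assumptions made for illustration.

```python
# Illustrative schema-less documents keyed by document id.
# The last one has no CountryCode field at all.
documents = {
    "cities/1": {"Name": "City 1", "CountryCode": "AA"},
    "cities/2": {"Name": "City 2", "CountryCode": "BB"},
    "cities/3": {"Name": "City 3", "CountryCode": "AA"},
    "cities/4": {"Name": "City 4"},
}


def map_country_code(doc):
    """Map: emit the field the index is keyed on, if the document has it."""
    code = doc.get("CountryCode")
    if code is not None:
        yield code


def build_index(docs, map_fn):
    """Run the Map function over every document and record key -> ids."""
    index = {}
    for doc_id, doc in docs.items():
        for key in map_fn(doc):
            index.setdefault(key, []).append(doc_id)
    return index


cities_by_country_code = build_index(documents, map_country_code)

# Querying the index is a key lookup followed by loading the documents.
matching_ids = cities_by_country_code.get("AA", [])
print([documents[doc_id]["Name"] for doc_id in matching_ids])
# ['City 1', 'City 3']
```

Because the documents have no fixed schema, the Map function decides which documents contribute entries; the document without a CountryCode field is simply absent from the index.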

What just happened?

You just created your first static...

Time for action – adding a Reduce function to an index

We will create a new static index and will define its Map and Reduce functions. You will add this index to the World database using the PutIndex() method. Then you will modify the RavenDB_Ch04 project and add a new call to the PutIndex() method. After that you will open the index in Management Studio to view it and execute it.

Open the RavenDB_Ch04 solution in Visual Studio.
Add the following code snippet to the Main() method to create the index:
Tip
While writing the Map and Reduce functions, press Ctrl + Space to display all methods you can set.
Add the following code snippet to the Main() method to query the index:
Save all the files, build and run RavenDB_Ch04.
In Management Studio, open the Cities/CountryPopulation index in edit mode and look at its Map and Reduce functions code.
In Management Studio, execute the Cities/CountryPopulation index and observe the result.

What just happened?

You created the Cities/CountryPopulation static index...

Time for action – adding a TransformResults to the index

We will modify the CountryPopulation index and define its Map, Reduce, and TransformResults functions. This new version of the index will aggregate the Population for each Country and transform the query result into a new shape that matches the CountryPopulation class, which you will also create. Then you will open the index in Management Studio to execute it and view its result.
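The three stages can be pictured with a small in-memory sketch in plain Python. It is not the index definition used in the steps that follow: the fields of `CountryPopulation`, the lookup of the country name during the transform, and the `skip_transform` flag (named after the Skip Transform option in Management Studio) are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical result shape; the real CountryPopulation class may differ.
CountryPopulation = namedtuple("CountryPopulation", ["country_name", "population"])

# Illustrative documents; values are made up for this sketch.
countries = {"AA": {"Name": "Country A"}, "BB": {"Name": "Country B"}}
cities = [
    {"CountryCode": "AA", "Population": 100},
    {"CountryCode": "AA", "Population": 200},
    {"CountryCode": "BB", "Population": 50},
]


def query_country_population(cities, countries, skip_transform=False):
    # Map + Reduce: sum the Population of the cities of each country.
    totals = {}
    for city in cities:
        code = city["CountryCode"]
        totals[code] = totals.get(code, 0) + city["Population"]
    reduced = [
        {"CountryCode": code, "Population": total}
        for code, total in sorted(totals.items())
    ]
    if skip_transform:
        return reduced
    # Transform: reshape each reduced entry into the result class.
    return [
        CountryPopulation(countries[r["CountryCode"]]["Name"], r["Population"])
        for r in reduced
    ]


print(query_country_population(cities, countries)[0])
# CountryPopulation(country_name='Country A', population=300)
print(query_country_population(cities, countries, skip_transform=True)[0])
# {'CountryCode': 'AA', 'Population': 300}
```

The transform changes only the shape of what is returned to the caller; the aggregated values are the same with or without it.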

Open the RavenDB_Ch04 project in Visual Studio.
Add a new class to the RavenDB_Ch04 project, name it CountryPopulation and make it look like the following code snippet:
Add the new CountryPopulation index definition to the Main() method using the following code:
Add the following code to the Main() method to query the Cities/CountryPopulation index:
Save all the files, build and run the solution.
In Management Studio, execute the Cities/CountryPopulation index and observe the result when the Skip Transform option is checked and when it is not...

RavenDB stale indexes

RavenDB indexes can be stale. They are eventually consistent, and "eventually" here usually means in under a second. When you query RavenDB, it returns data whether or not it has finished indexing that data in the background. RavenDB lets the user know when query results are stale, and can also be told to wait until non-stale results are available. This allows new indexes to be introduced on the fly; live index rebuilding is a rare feature.

Note

Waiting for a non-stale index is not a recommended practice for production systems.

In RavenDB, whenever data is inserted or updated, a background process indexes it. This improves the server response time, but it means you may query stale indexes. In a lot of situations, a stale index isn't a problem, and as expressed on the RavenDB site:

Better stale than offline.
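A small in-memory sketch in plain Python can make this behavior concrete. It is not RavenDB code: the `Store` class and its `save`, `index_next`, and `query` methods are assumptions for illustration, and the background process is replaced by an explicit `index_next()` call so that the example stays deterministic. The sketch handles inserts only.

```python
class Store:
    """Hypothetical store whose index is updated after the write returns."""

    def __init__(self):
        self.documents = {}
        self.pending = []  # ids that were saved but are not indexed yet
        self.index = {}    # CountryCode -> list of document ids

    def save(self, doc_id, doc):
        # The write returns immediately; indexing happens later.
        self.documents[doc_id] = doc
        self.pending.append(doc_id)

    def index_next(self):
        """One step of the background indexer (called explicitly here)."""
        if self.pending:
            doc_id = self.pending.pop(0)
            code = self.documents[doc_id]["CountryCode"]
            self.index.setdefault(code, []).append(doc_id)

    def query(self, code):
        """Answer from the index as it is now and report whether it is stale."""
        is_stale = bool(self.pending)
        return list(self.index.get(code, [])), is_stale


store = Store()
store.save("cities/1", {"CountryCode": "AA"})
store.save("cities/2", {"CountryCode": "AA"})

print(store.query("AA"))  # ([], True)
store.index_next()
print(store.query("AA"))  # (['cities/1'], True)
store.index_next()
print(store.query("AA"))  # (['cities/1', 'cities/2'], False)
```

Every query is answered at once, even when the answer is incomplete, and the stale flag tells the caller that more results may still be on their way.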

When you call the SaveChanges() method on the session object to persist changes on some objects, and try...

Time for action – checking for stale index results

You will add a new code snippet to the Main() method of the RavenDB_Ch04 project to check whether an index result is stale.

Open the RavenDB_Ch04 project in Visual Studio.
Add the following code to the Main() method:
Save all the files, build and run the solution.
Check the output window for the stale status of the index result.

What just happened?

You added the necessary code to the Main() method to check whether an index result is stale.

To check for stale index results, you first declare a new RavenQueryStatistics variable that will hold statistics about the query and the index, such as IndexName, and TotalResults, which indicates the total number of documents in the query results (line 190).

Then, you query the index using the Query() method on the Session instance object and specify the name of the index to check (line 191). To get back the query statistics, you call the Statistics() method (line 192).

In this query you don't need to retrieve...

Time for action – explicitly waiting for a non-stale index result

You will add a new code snippet to the Main() method of the RavenDB_Ch04 project to tell RavenDB to wait for a non-stale result.

Open the RavenDB_Ch04 project in Visual Studio.
Add the following code to the Main() method:
Save all the files, build and run the solution.

What just happened?

You added the necessary code to the Main() method to instruct RavenDB to explicitly wait for a non-stale index result.

To do that, you call the Customize() method on the Query object and call the WaitForNonStaleResultsAsOfNow() function within a lambda expression. This function takes a TimeSpan parameter that specifies the time-out delay. In this code snippet, we specified 5 seconds as the time-out delay.
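The idea of waiting with a time-out can be sketched in plain Python as a polling loop. This only illustrates the concept; it is not what WaitForNonStaleResultsAsOfNow() does internally. The `wait_for_non_stale` function, the `FakeIndex` class, and the choice to raise `TimeoutError` when the delay expires are assumptions made for this sketch.

```python
import time


def wait_for_non_stale(is_stale, timeout_seconds, poll_interval=0.01):
    """Poll is_stale() until it returns False or the time-out expires."""
    deadline = time.monotonic() + timeout_seconds
    while is_stale():
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "index still stale after %.2f seconds" % timeout_seconds
            )
        time.sleep(poll_interval)


class FakeIndex:
    """Hypothetical index that stops being stale after a few checks."""

    def __init__(self, checks_until_fresh):
        self.remaining = checks_until_fresh

    def is_stale(self):
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


index = FakeIndex(checks_until_fresh=3)
wait_for_non_stale(index.is_stale, timeout_seconds=5)
print(index.is_stale())  # False
```

The caller is blocked for as long as the index stays stale, up to the time-out, which is why this kind of waiting suits tests better than code that serves user requests.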

Note

Waiting for a non-stale index result is intended only for testing and learning purposes. It is strongly discouraged in a production environment.

Have a go hero – display all index names

Add a new method to the Program class...

Summary

In this chapter, we have learned about RavenDB indexes, how they work, and their different types. We specifically covered RavenDB's dynamic and static indexes and how to query each type using LINQ.

Afterward, we continued to discover how RavenDB uses Map/Reduce in static indexes, and how you can best implement it to take advantage of this programming model.

Throughout this chapter, we manually created indexes using the .NET Client API and implemented Map/Reduce/TransformResults functions. Then we finished with a sample method to learn how to manage stale indexes.

In the next chapter, we will put our newly learned skills to work and use them to learn another preferred and recommended way to create static indexes in RavenDB. Keep reading!

Author (1)

Khaled Tannir

Khaled Tannir has been working with computers since 1980. He began programming with the legendary Sinclair ZX81 and later with Commodore home computer products (Vic 20, Commodore 64, Commodore 128D, and Amiga 500). He has a Bachelor's degree in Electronics, a Master's degree in System Information Architectures, in which he graduated with a professional thesis, and completed his education with a Master of Research degree. He is a Microsoft Certified Solution Developer (MCSD) and has more than 20 years of technical experience leading the development and implementation of software solutions and giving technical presentations. He now works as an independent IT consultant and has worked as an infrastructure engineer, senior developer, and enterprise/solution architect for many companies in France and Canada. With significant experience in Microsoft .NET, Microsoft Server Systems, and Oracle Java technologies, he has extensive skills in online/offline applications design, system conversions, and multilingual applications in both domains: Internet and Desktops. He is always researching new technologies, learning about them, and looking for new adventures in France, North America, and the Middle East. He owns an IT and electronics laboratory with many servers, monitors, open electronic boards such as Arduino, Netduino, Raspberry Pi, and .NET Gadgeteer, and some smartphone devices based on Windows Phone, Android, and iOS operating systems. In 2012, he contributed to the EGC 2012 (International Complex Data Mining forum at Bordeaux University, France) and presented, in a workshop session, his work on "how to optimize data distribution in a cloud computing environment". This work aims to define an approach to optimize the use of data mining algorithms such as k-means and Apriori in a cloud computing environment. He is the author of RavenDB 2.x Beginner's Guide, Packt Publishing.
He aims to get a PhD in Cloud Computing and Big Data and wants to learn more and more about these technologies. He enjoys taking landscape and night time photos, travelling, playing video games, creating funny electronic gadgets with Arduino/.NET Gadgeteer, and of course, spending time with his wife and family. You can reach him at contact@khaledtannir.net.