You're reading from Solr Cookbook - Third Edition

Product type: Book
Published in: Jan 2015
Reading level: Intermediate
ISBN-13: 9781783553150
Edition: 1st Edition
Author: Rafal Kuc

Rafał Kuć is a software engineer, trainer, speaker, and consultant. He works as a consultant and software engineer at Sematext Group Inc., where he concentrates on open source technologies such as Apache Lucene, Solr, and Elasticsearch. He has more than 14 years of experience in various software domains, from banking software to e-commerce products. He is mainly focused on Java; however, he is open to every tool and programming language that might help him to achieve his goals easily and quickly. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people solve their Solr and Lucene problems. He is also a speaker at various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, ApacheCon, Lucene/Solr Revolution, Velocity, and DevOps Days. Rafał began his journey with Lucene in 2002; however, it wasn't love at first sight. When he came back to Lucene in late 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. He started working with Elasticsearch in the middle of 2010. At present, Lucene, Solr, Elasticsearch, and information retrieval are his main areas of interest. Rafał is also the author of the Solr Cookbook series, ElasticSearch Server and its second edition, and the first and second editions of Mastering ElasticSearch, all published by Packt Publishing.

Limiting I/O usage


As you might know, a Lucene index is divided into smaller pieces called segments, and each segment is stored on disk. Depending on the indexing and merge policy settings, Lucene periodically merges two or more segments into a new one. This operation requires reading the old segments and writing a new one that contains their data. Merges can run while Solr is indexing data and serving queries, so they compete with both for disk access. Writing new segments during indexing can be just as expensive in terms of I/O. For this reason, Solr allows us to configure limits on I/O usage. This recipe will show you how to do this.

Getting ready

Before continuing with this recipe, read the Choosing the proper directory configuration recipe in this chapter to see what directories are available and how to configure them.

How to do it...

Let's assume that we want to limit the I/O usage of a deployment that uses solr.MMapDirectoryFactory. In this case, the solrconfig.xml file will contain the following configuration:

<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory">
</directoryFactory>

Now, let's introduce the following limits:

  • We allow Solr to write a maximum of 20 MB per second during segment writes

  • We allow Solr to write a maximum of 10 MB per second during segment merges

  • We allow Solr to read a maximum of 50 MB per second

To do this, we change our previous configuration to the following:

<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory">
 <double name="maxWriteMBPerSecFlush">20</double>
 <double name="maxWriteMBPerSecMerge">10</double>
 <double name="maxWriteMBPerSecRead">50</double>
</directoryFactory>

After altering the configuration, all we need to do is restart Solr for the limits to take effect.

How it works...

The logic behind setting the limits is very simple. Every directory factory that extends Solr's CachingDirectoryFactory class allows us to set the maxWriteMBPerSecFlush, maxWriteMBPerSecMerge, and maxWriteMBPerSecRead properties. This covers all the directory implementations mentioned in the Choosing the proper directory configuration recipe in this chapter.
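
The same properties should carry over to other factories. As a minimal sketch, assuming that solr.NRTCachingDirectoryFactory is one of the implementations covered by the referenced recipe and that it extends CachingDirectoryFactory in the Solr version you run, the limits from this recipe would look as follows (the values are simply reused from the example above):

```xml
<!-- Sketch only: same limits as above, applied to a different factory.
     Assumes this factory extends CachingDirectoryFactory in your Solr version. -->
<directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory">
 <!-- approximate write limit for segment flushes, in MB per second -->
 <double name="maxWriteMBPerSecFlush">20</double>
 <!-- approximate write limit for segment merges, in MB per second -->
 <double name="maxWriteMBPerSecMerge">10</double>
 <!-- approximate read limit, in MB per second -->
 <double name="maxWriteMBPerSecRead">50</double>
</directoryFactory>
```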

The maxWriteMBPerSecFlush property tells Solr how many megabytes per second it can write during a segment flush (that is, during a write operation that is not triggered by segment merging). The maxWriteMBPerSecMerge property specifies how many megabytes per second Solr can write during a segment merge. Finally, the maxWriteMBPerSecRead property specifies how many megabytes can be read per second; despite the Write in its name, it is a read limit. One thing to remember is that the values are approximate, not exact.
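
To get a feel for what these numbers mean, here is a rough back-of-the-envelope calculation using the limits from this recipe. The 1,000 MB merged segment and the 200 MB flush are hypothetical sizes chosen purely for illustration, and because the limits are approximate, the results are estimates rather than guarantees:

```latex
t_{\text{merge}} \approx \frac{1000\ \text{MB}}{10\ \text{MB/s}} = 100\ \text{s},
\qquad
t_{\text{flush}} \approx \frac{200\ \text{MB}}{20\ \text{MB/s}} = 10\ \text{s}
```

In other words, with a 10 MB per second merge limit, writing such a merged segment would take roughly 100 seconds at best, no matter how fast the underlying disk is. This is the trade-off to keep in mind when choosing the values: lower limits protect queries but make flushes and merges take longer.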

Limiting I/O usage can be very handy, especially in deployments where the I/O subsystem is already saturated. During query peak hours, when we want to serve queries as fast as we can, we need to minimize the impact of indexing and merging. With a configuration adjusted to our needs, we can cap the I/O used by these operations and still serve queries with the latency we want.
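
For a query-heavy deployment, one possible starting point is to throttle only the write side and leave reads alone. The following is a sketch under two assumptions that you should verify against your Solr version: that a property left out of the configuration leaves the corresponding type of I/O unthrottled, and that the values shown (which are hypothetical, not recommendations) suit your hardware:

```xml
<!-- Sketch only: throttle flush and merge writes, leave reads unlimited.
     Assumption: omitting maxWriteMBPerSecRead means no read limit is applied. -->
<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory">
 <!-- hypothetical value: cap flush writes at about 10 MB per second -->
 <double name="maxWriteMBPerSecFlush">10</double>
 <!-- hypothetical value: cap merge writes at about 5 MB per second -->
 <double name="maxWriteMBPerSecMerge">5</double>
</directoryFactory>
```

As with the configuration shown earlier, Solr needs to be restarted after the change. It is worth measuring query latency and indexing throughput before and after to confirm that the limits have the intended effect.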
