Mastering Apache Solr 7.x

Product type: Book
Published: Feb 2018
Reading level: Expert
Publisher: Packt
ISBN-13: 9781788837385
Edition: 1st Edition
Authors (3):
Sandeep Nair

Sandeep has been working in Liferay technology for more than 8 years and has more than 10 years of overall experience in Java and Java EE technologies. He has executed projects using Liferay across various verticals such as construction, financial, and medical domains, providing solutions for collaboration, enterprise content management, and web content management systems. He has created a free and open source Google Chartlet plugin for Liferay, which has been downloaded and used by people across 90 countries according to SourceForge statistics. Besides development, consulting, and implementing solutions, he has also been involved in giving training on Liferay in other countries. Before he jumped into Liferay, he had experience in Java and Java EE technologies. He has authored "Liferay Beginner's Guide" and "Instant Liferay Portal 6 Starter" with Packt Publishing. When he is not coding, he loves to read books and travel.

Chintan Mehta

Chintan Mehta is a co-founder of KNOWARTH Technologies and heads the cloud/RIMS/DevOps team. He has rich, progressive experience in server administration of Linux, AWS Cloud, DevOps, RIMS, and on open source technologies. He is also an AWS Certified Solutions Architect. Chintan has authored MySQL 8 for Big Data, Mastering Apache Solr 7.x, MySQL 8 Administrator's Guide, and Hadoop Backup and Recovery Solutions. Also, he has reviewed Liferay Portal Performance Best Practices and Building Serverless Web Applications.

Dharmesh Vasoya

Dharmesh Vasoya is a Liferay 6.2 certified developer. He has 5.5 years of experience in application development with technologies such as Java, Liferay, Spring, Hibernate, Portlet, and JSF. He has successfully delivered projects in various domains, such as healthcare, collaboration, communication, and enterprise CMS, using Liferay. Dharmesh has good command of the configuration setup of servers such as Solr, Tomcat, JBOSS, and Apache Web Server. He has good experience of clustering, load balancing and performance tuning. He completed his MCA at Ahmedabad University.

Chapter 8. Managing and Fine-Tuning Solr

Okay, so you have your brand new car up and running and have been using it judiciously day in and day out, but you don't maintain it properly from time to time! What will happen? Of course, the performance is going to deteriorate over a period of time. Another thing could be that your car supports automatic parking but you never found out how to override the default setting. In such a case, a manual comes in handy for learning and tweaking all the features that your car can provide. Similarly, you need to fine-tune and manage your Solr so as to get the most out of it. This is exactly what we are going to see in this chapter.

JVM configuration


One of the things that you need to take particular care of when you are working on any Java-based application is configuring the JVM optimally, and Solr is no exception.

Managing the memory heap 

Anyone who has worked with Java-based applications will surely have come across setting the heap space. We do it using -Xms and -Xmx. Suppose we set the following command-line options:

-Xms256m -Xmx2048m

Here, Xms specifies the initial memory allocation pool, whereas Xmx specifies the maximum memory allocation pool for the JVM. In the case we just saw, the JVM will start with 256 MB of memory and will be able to use up to 2 GB of memory.
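To make the arithmetic concrete, here is a minimal Python sketch that reads the two flags from an option string and converts them to bytes. The function names parse_heap_options and check_heap are hypothetical helpers for illustration, and only the k, m, and g size suffixes are handled.

```python
import re

# Multipliers for the size suffixes handled here (k, m, g; case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_options(options: str) -> dict:
    """Return {'Xms': bytes, 'Xmx': bytes} for the heap flags found in a JVM option string."""
    sizes = {}
    for name, number, unit in re.findall(r"-(Xms|Xmx)(\d+)([kKmMgG]?)", options):
        sizes[name] = int(number) * _UNITS[unit.lower()]
    return sizes


def check_heap(options: str) -> dict:
    """Parse the heap flags and reject an initial size larger than the maximum."""
    sizes = parse_heap_options(options)
    if "Xms" in sizes and "Xmx" in sizes and sizes["Xms"] > sizes["Xmx"]:
        raise ValueError("initial heap (-Xms) exceeds maximum heap (-Xmx)")
    return sizes


if __name__ == "__main__":
    heap = check_heap("-Xms256m -Xmx2048m")
    print(heap["Xms"] // 1024 ** 2, "MB initial,", heap["Xmx"] // 1024 ** 3, "GB maximum")
```

A check like this can be handy in a deployment script, since an initial size above the maximum is a common copy-and-paste mistake.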

If the application needs more heap from the start, we can increase -Xms; to raise the upper limit, we increase -Xmx. We can also decide not to set an initial heap size at all and let the JVM grow the heap as needed, but this may increase our startup time. Similarly, failing to set the maximum heap size properly can result in an OutOfMemoryError. Proper garbage collection JVM parameters should be set...

Managing solrconfig.xml


As we already know, solrconfig.xml is at the heart of Solr's configuration.

There are two ways in which this configuration is modified:

  • By making direct changes in solrconfig.xml
  • Using the config API to create configoverlay.json, which holds configuration overlays to modify the default values specified in solrconfig.xml

The solrconfig.xml file is used to configure the admin web interface, the parameters for replication and duplication, and the request dispatcher. Various listeners and request handlers are also configured in solrconfig.xml.

Go to any of the conf directories for a collection and you will find solrconfig.xml inside. Navigate to SOLR_HOME/server/solr/configsets and you will see various configurations that follow best practices for configuring Solr.

Solr allows you to specify a variable for the property value, which can be replaced at runtime with the following syntax:

${propertyname[:default...
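The idea of substituting a property value with a fallback can be sketched in a few lines of Python. This is only an illustration of the concept, not Solr's own implementation: it assumes placeholders of the form ${name} or ${name:default}, and the function name substitute is hypothetical.

```python
import re

# Matches ${name} or ${name:default}; the name may contain dots (e.g. solr.lock.type).
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def substitute(text: str, properties: dict) -> str:
    """Replace each placeholder with its property value, falling back to its default."""

    def resolve(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")

    return _PLACEHOLDER.sub(resolve, text)


if __name__ == "__main__":
    line = "<lockType>${solr.lock.type:native}</lockType>"
    print(substitute(line, {}))                          # falls back to the default
    print(substitute(line, {"solr.lock.type": "hdfs"}))  # property value wins
```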

Managing backups


Going into production, we obviously need a proper backup and restore plan. The last thing we would want is for our hard disk to crash and all our index data to disappear or get corrupted.

Solr provides two ways to back up based on how you are running it:

  • Collections API in SolrCloud mode
  • Replication handler in standalone mode

Backup in SolrCloud

As mentioned earlier, we can take backups in SolrCloud using the Collections API. Doing so ensures that the backup covers all the shards of the collection; at the time of restore, we use the same number of shards and replicas as the original collection. The commands are listed here:

Command name       Description
action=BACKUP      Used to back up Solr indexes and configuration
action=RESTORE     Used to restore Solr indexes and configuration
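As a rough sketch, the two actions can be turned into Collections API request URLs as follows. The base URL, the backup name nightly, the collection names, and the directory /mnt/solr-backups are all assumptions made for illustration; the name, collection, and location request parameters are the ones the Collections API uses for these actions. The helper only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Assumed base URL of a local SolrCloud node; adjust for your cluster.
SOLR_URL = "http://localhost:8983/solr"


def collections_api_url(action: str, name: str, collection: str, location: str) -> str:
    """Build a Collections API URL for action=BACKUP or action=RESTORE."""
    if action not in ("BACKUP", "RESTORE"):
        raise ValueError(f"unsupported action: {action}")
    params = {"action": action, "name": name, "collection": collection, "location": location}
    return f"{SOLR_URL}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical backup name, collection names, and shared directory.
    print(collections_api_url("BACKUP", "nightly", "films", "/mnt/solr-backups"))
    print(collections_api_url("RESTORE", "nightly", "films_restored", "/mnt/solr-backups"))
```

The location should be a directory that every node in the cluster can reach, such as a shared drive.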

Standalone mode backups

In the case of standalone mode, backups and restoration are done using the replication handler. The configuration of the replication handler can be customized using our own replication handler...
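For standalone mode, a similar sketch builds request URLs for a core's replication handler. The core name mycore, the base URL, and the backup name are assumptions for illustration; backup, restore, and restorestatus are commands the replication handler accepts. The hypothetical helper replication_url only formats the URL.

```python
from urllib.parse import urlencode


def replication_url(core: str, command: str,
                    base: str = "http://localhost:8983/solr", **params: str) -> str:
    """Build a replication handler URL such as .../mycore/replication?command=backup."""
    query = urlencode({"command": command, **params})
    return f"{base}/{core}/replication?{query}"


if __name__ == "__main__":
    # Hypothetical core and backup names.
    print(replication_url("mycore", "backup", name="nightly"))
    print(replication_url("mycore", "restore", name="nightly"))
    print(replication_url("mycore", "restorestatus"))
```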

JMX with Solr


Java Management Extensions (JMX) is a technology that was introduced in J2SE 5.0; it provides tools for managing and monitoring resources dynamically at runtime. It is used in enterprise applications to make systems configurable and to get the state of an application at any point in time. The resources are represented by managed beans (MBeans).

Solr can be controlled via the JMX interface; we can make use of VisualVM or JConsole to connect with Solr.

JMX configuration

If an MBean server is running in Solr's JVM, or if you start Solr with the -Dcom.sun.management.jmxremote system property, Solr will automatically identify its location on startup.

Alternatively, you can configure JMX by defining a metrics reporter.

On a remote Solr server, if you need to do JMX-enabled Java profiling, then you have to enable remote JMX access when starting the Solr server.

Open solr.in.cmd or solr.in.sh in the SOLR_HOME/bin directory and set the ENABLE_REMOTE_JMX_OPTS property to true...
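To give a feel for what enabling remote JMX access means at the JVM level, the following sketch lists the standard com.sun.management.jmxremote system properties that open JMX to remote clients. It is an illustration rather than a copy of what the Solr start script does; the function name remote_jmx_flags and the default port 18983 are assumptions.

```python
def remote_jmx_flags(port: int = 18983) -> list:
    """Return JVM system-property flags that open JMX to remote clients (no SSL, no auth)."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return [
        "-Dcom.sun.management.jmxremote",
        "-Dcom.sun.management.jmxremote.local.only=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.rmi.port={port}",
    ]


if __name__ == "__main__":
    print(" ".join(remote_jmx_flags()))
```

Turning off SSL and authentication like this is only reasonable on a trusted network; on anything else, JMX access should be secured.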

Logging configuration


Setting up logs is a key part of any enterprise application and Solr is no exception. Luckily, Solr provides many different ways to tweak the default logging configuration.

Log settings using the admin web interface

Using Solr's admin web interface, we can set various log levels. Go to the admin interface by typing the following URL:

http://localhost:8983/solr/

This opens the Solr admin screen.

On the left-hand side, there is a Logging option. Click on it and you will see a submenu item called Level, which opens the log level screen.

Here, we can set the logging level for many different log categories, which are arranged hierarchically. For example, to set org.apache.http.conn.ssl and all the subcategories under it to the debug level, click on the edit icon next to ssl.

This will open up a small popup with various log levels that we can set.

Note

Any log level set in this manner will be lost during the...
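The same kind of runtime change can also be scripted. The sketch below builds a request URL for Solr's logging endpoint, assuming the /admin/info/logging path and its set=category:LEVEL parameter; the helper name set_log_level_url and the base URL are hypothetical, and the function only formats the URL without sending it.

```python
from urllib.parse import urlencode

# Log levels accepted by this sketch.
LEVELS = {"ALL", "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "OFF"}


def set_log_level_url(category: str, level: str,
                      base: str = "http://localhost:8983/solr") -> str:
    """Build a URL that asks Solr to set `category` to `level` at runtime."""
    level = level.upper()
    if level not in LEVELS:
        raise ValueError(f"unknown log level: {level}")
    query = urlencode({"set": f"{category}:{level}"})
    return f"{base}/admin/info/logging?{query}"


if __name__ == "__main__":
    print(set_log_level_url("org.apache.http.conn.ssl", "debug"))
```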

SolrCloud overview


One of the must-haves when going to production is clustering for fault tolerance and high availability. Solr's answer to this is SolrCloud, which provides distributed indexing and search capabilities with central configuration for the entire cluster, and load balancing with failover support.

As mentioned earlier, Solr provides distributed searching. Behind the scenes, Solr makes use of ZooKeeper to manage nodes. 

In SolrCloud, data is distributed across multiple shards, which can be hosted on multiple machines and can have replicas; this provides redundancy, fault tolerance, and scalability. ZooKeeper keeps track of the shards and replicas and is used to decide which server will handle a specific request.
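To illustrate how documents can be spread over shards, here is a deliberately simplified sketch that maps a document ID to a shard by hashing it. It is not Solr's routing algorithm: Solr hashes IDs differently and assigns hash ranges to shards, whereas this stand-in uses zlib.crc32 and a modulo. The function name shard_for and the document IDs are hypothetical.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index in [0, num_shards)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # crc32 is a stand-in hash chosen because it is stable across runs;
    # Solr's own router hashes ids differently and assigns hash ranges to shards.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


if __name__ == "__main__":
    placement = defaultdict(list)
    for doc_id in ("doc-1", "doc-2", "doc-3", "doc-4", "doc-5"):
        placement[shard_for(doc_id, 2)].append(doc_id)
    for shard, docs in sorted(placement.items()):
        print(f"shard {shard}: {docs}")
```

Because the mapping depends only on the ID, any node can work out which shard owns a document without consulting the others.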

SolrCloud in interactive mode

Let's set up SolrCloud. Go to the SOLR_HOME/bin directory and start the server in interactive mode using the following command: 

solr -e cloud

As you can see, an interactive session starts up, asking you how many nodes the cluster should...

Enabling SSL – Solr security


In this example, we will see a basic SSL setup using a self-signed certificate. Enabling SSL ensures that communication between the client and Solr server is encrypted.

Prerequisites

Before generating a self-signed certificate, ensure that you have OpenSSL installed on your machine. To check whether OpenSSL is already installed, type the following command in the Command Prompt:

openssl version

It should print out the current version of OpenSSL running on your system. If it does not do so, kindly download the latest version of OpenSSL for your operating system and then install it.

We will also make use of JDK's keytool for generating self-signed certificates.
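If you prefer to check both prerequisites in one go, a small script can look for the two executables on the PATH. This is a minimal sketch; the function name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("openssl", "keytool")) -> list:
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Please install:", ", ".join(missing))
    else:
        print("openssl and keytool are both available")
```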

Generating a key and self-signed certificate

JDK provides the keytool command to create self-signed certificates. What we will first do is create a keystore using the following command:

keytool -genkeypair -alias mysolr -keyalg RSA -keysize 2048 -keypass solrpass -storepass solrpass -validity 3650 -keystore mysolrkeystore...

Performance statistics


In order to measure performance, Solr provides statistics and metrics; they can be read either using the Metrics API or by enabling JMX.

Statistics for request handlers

Both search and update request handlers provide various statistics.

The API request path for search is http://localhost:8983/solr/admin/metrics?group=mycore&prefix=QUERY./select.

Similarly, the API request path for update is http://localhost:8983/solr/admin/metrics?group=mycore&prefix=UPDATE./update.

There are various attributes that can be added at the end of both of these URLs to get various statistics, as listed here:

  • 5minRate: The number of requests per second that we have received in the last 5 minutes.
  • 15minRate: Same as 5minRate, but for requests per second in the last 15 minutes.
  • p75_ms/p95_ms/p99_ms/p999_ms: Each of these four attributes gives the processing time, in milliseconds, within which the given percentile of requests completed (the 75th, 95th, 99th, and 99.9th percentile respectively).
  • count: Number of requests made...
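To see what a percentile attribute such as p75_ms expresses, the following sketch computes nearest-rank percentiles over a small list of request times. The sample timings are made up, and Solr derives its own figures internally from its metrics library, so this only illustrates the meaning of the numbers, not how Solr calculates them.

```python
import math


def percentile(times_ms: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not times_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(times_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical request times in milliseconds.
    samples = [12, 15, 11, 90, 14, 13, 16, 18, 17, 250]
    for pct in (75, 95, 99, 99.9):
        print(f"p{pct}: {percentile(samples, pct)} ms")
```

A high percentile that is far above the median, as with the two slow requests in this sample, usually points to a few expensive queries rather than a generally slow server.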

Summary


In this chapter, we saw the various tuning parameters needed to take Solr to production. We started off with JVM parameters, and then saw how to manage solrconfig.xml. We got an understanding of taking backups, setting up JMX, and configuring logs. Finally, we had an overview of SolrCloud.

In the next chapter, we will see the various client APIs made available by Solr.
