You're reading from  Apache Solr High Performance

Product type: Book
Published in: Mar 2014
Reading level: Intermediate
Publisher: Packt Publishing
ISBN-13: 9781782164821
Edition: 1st Edition
Author (1)
Surendra Mohan

Surendra Mohan, who has served a few top-notch software organizations in varied roles, is currently a freelance software consultant. He has been working on various cutting-edge technologies like Drupal, Moodle, Apache Solr, ElasticSearch, Node.js, SoapUI, and so on for the past 10 years. He also delivers technical talks at various community events like Drupal Meetups and Drupal Camps. To find out more about him, his write-ups, technical blogs, and much more, go to http://www.surendramohan.info/. He has also written the books Administrating Solr and Apache Solr High Performance published by Packt Publishing and has reviewed other technical books such as Drupal 7 Multi Site Configuration and Drupal Search Engine Optimization, as well as titles on Drupal commerce, ElasticSearch, Drupal related video tutorials, titles on OpsView, and many more. Additionally, he writes technical blogs and articles with SitePoint.com. His published blogs and articles can be found at http://www.sitepoint.com/author/smohan/.

Chapter 6. Performance Optimization with ZooKeeper

In this chapter, we will learn about ZooKeeper and discuss how to set up, configure, and deploy it in order to optimize Solr's performance. We will also discuss the various applications of ZooKeeper. We will cover the following topics:

  • Introduction to ZooKeeper

  • Setting up, configuring, and deploying ZooKeeper

  • Applications of ZooKeeper

So, let us get started.

Getting familiar with ZooKeeper


Let us start with the background of ZooKeeper. When you implement a distributed system across multiple Solr servers and shards, ZooKeeper becomes a mandatory tool.

Prerequisites for a distributed server

To design a distributed system, you need to design and develop the following coordination services:

  • Name service: It is a service that maps an entity to other information associated with that entity. Suppose we have an e-commerce portal named eStore that sells Piano XYZ as one of its products; the name service for eStore maps Piano XYZ to its other information, such as its SKU. In terms of infrastructure management, this is analogous to the DNS service mapping a domain name to its IP address. Since you are going to work with multiple servers while implementing the distributed system, you should keep an eye on which servers and services are currently running and monitor...
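The name-service idea can be illustrated with a minimal in-memory sketch. It is not ZooKeeper code; the `NameService` class, the SKU value, and the host name and IP address are hypothetical and exist only to show the mapping from a name to its associated information.

```python
class NameService:
    """In-memory stand-in for a name service: maps a name to associated information."""

    def __init__(self):
        self._registry = {}

    def register(self, name, info):
        # Store a copy so later changes by the caller do not leak into the registry.
        self._registry[name] = dict(info)

    def lookup(self, name):
        # Return None for unknown names instead of raising an error.
        entry = self._registry.get(name)
        return dict(entry) if entry is not None else None

    def unregister(self, name):
        self._registry.pop(name, None)


estore = NameService()
# Product name mapped to its details (the SKU is a made-up value).
estore.register("Piano XYZ", {"sku": "PIANO-XYZ-001"})
# Domain name mapped to an IP address, as DNS would do (made-up values).
estore.register("www.estore.example", {"ip": "192.0.2.10"})

piano_info = estore.lookup("Piano XYZ")
```

In a real deployment, the registry would live in a coordination service such as ZooKeeper rather than in a single process, so that every server sees the same mappings.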

Setting up, configuring, and deploying ZooKeeper


By now, we know what ZooKeeper is, what its architecture looks like, and how it works. In this section, we will learn how to set up, configure, and deploy a ZooKeeper ensemble.

For demonstration purposes, we will use ZooKeeper Version 3.4.5, which is the latest version of ZooKeeper at the time of writing. We will use three ZooKeeper nodes named znode1.smohan.dom, znode2.smohan.dom, and znode3.smohan.dom. So let us get started by performing the following steps on each node.

Setting up ZooKeeper

The ZooKeeper server runs on the JVM, so a JDK is required. If you don't have a JDK installed, download and install it; refer to Chapter 1, Installing Solr, to learn how. We can set up ZooKeeper by performing the following steps:

  1. Download ZooKeeper-3.4.5.tar.gz and untar it to an appropriate location using the following command:

    wget http://supergsego.com/apache...
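For the three-node ensemble named earlier, the configuration that each node shares can be sketched as follows. This is a hedged illustration rather than the book's exact configuration: the helper function names and the `dataDir` path are assumptions, while `tickTime`, `initLimit`, `syncLimit`, `clientPort`, and the `server.N=host:2888:3888` entries follow ZooKeeper's standard `zoo.cfg` format and its conventional ports.

```python
def quorum_size(ensemble_size):
    """Smallest majority of the ensemble; ZooKeeper needs this many nodes alive."""
    return ensemble_size // 2 + 1


def build_zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    """Return zoo.cfg text for the given hosts (data_dir is an assumed path)."""
    lines = [
        "tickTime=2000",   # base time unit in milliseconds
        "initLimit=10",    # ticks a follower may take to connect and sync to the leader
        "syncLimit=5",     # ticks a follower may lag behind the leader
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # One server.N entry per node: 2888 is the peer port, 3888 the election port.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


def zk_connect_string(hosts, client_port=2181):
    """Comma-separated host:port list that clients such as Solr use to reach the ensemble."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


NODES = ["znode1.smohan.dom", "znode2.smohan.dom", "znode3.smohan.dom"]

if __name__ == "__main__":
    print(build_zoo_cfg(NODES))
    print("Quorum size:", quorum_size(len(NODES)))
    print("Connect string:", zk_connect_string(NODES))
```

Each node additionally needs a `myid` file in its `dataDir` containing the number from its own `server.N` line. With three nodes, the quorum is two, so the ensemble keeps working if any single node fails.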

Applications of ZooKeeper


Due to its versatile role in distributed systems, ZooKeeper is already used by a large number of real-world applications. We will list a few of them in this section:

  • Apache Solr: It uses ZooKeeper to elect the leader (that is, the leader election process) and centralize the configuration

  • Apache Hadoop: It uses ZooKeeper for automatic failover of the HDFS NameNode and to provide high availability of the YARN ResourceManager

  • Apache Accumulo: It is a sorted distributed key-value store that is built on top of Apache ZooKeeper and Apache Hadoop

  • Apache HBase: It is a distributed database that is built on Hadoop. ZooKeeper provides it with master election, lease management, and communication among servers

  • Apache Mesos: It is used to manage clusters and provides effective resource sharing and isolation across distributed applications. ZooKeeper helps Mesos maintain a replicated master that is fault tolerant

  • Cloudera...
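Several of the applications above rely on ZooKeeper for leader election. ZooKeeper's documented recipe has each participant create an ephemeral sequential znode; the participant holding the lowest sequence number is the leader, and when its session ends, its znode disappears and the next lowest takes over. The following is a minimal in-memory simulation of that idea. It does not talk to ZooKeeper and is not the actual implementation used by Solr or any other project; the `ElectionSimulator` class and the node names are hypothetical.

```python
class ElectionSimulator:
    """In-memory simulation of sequential-znode leader election (no ZooKeeper needed)."""

    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # sequence number -> participant name

    def join(self, name):
        # Mimics creating an ephemeral sequential znode: each joiner gets the next number.
        seq = self._next_seq
        self._next_seq += 1
        self._candidates[seq] = name
        return seq

    def leave(self, seq):
        # Mimics the ephemeral znode vanishing when the participant's session ends.
        self._candidates.pop(seq, None)

    def leader(self):
        # The participant with the lowest sequence number holds leadership.
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


election = ElectionSimulator()
seq_a = election.join("solr-node-a")
seq_b = election.join("solr-node-b")
seq_c = election.join("solr-node-c")

first_leader = election.leader()
election.leave(seq_a)  # the current leader fails
second_leader = election.leader()
```

Sequence numbers are never reused, so a participant that rejoins after a failure goes to the back of the line instead of reclaiming leadership.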

Summary


In this chapter, we learned how ZooKeeper can be used to optimize Solr's performance, and we covered how to set up, configure, and deploy ZooKeeper. We also looked at a number of applications that rely on ZooKeeper.

In the next chapter, we will list some useful references to the official documentation pages that will help you explore the topics and concepts further. It also covers recommended books and video tutorials that will help you continue learning.
