Chapter 8. Elasticsearch APIs

In Chapter 2, Stepping into Elasticsearch, we learned about the underlying technology, how Elasticsearch works, and the APIs it offers. In Chapter 4, Kibana Interface, we learned how to use the Console and worked with aggregations in Kibana.

This chapter covers the remaining APIs, and we will use the Console to send API requests to Elasticsearch. We'll cover the following topics:

  • Cluster APIs

  • Cat APIs

  • Modules

  • Ingest nodes

  • Elasticsearch clients

  • Java APIs

For a quick refresher on the Console, refer to the Exploring dev tools section in Chapter 4, Kibana Interface.

Note

For development and learning purposes, we assume that Kibana is installed locally.

The cluster APIs


These APIs give us information about the cluster state, health, and statistics, as well as node statistics, node information, and so on.

Cluster health

To check cluster health, we can use the _cluster/health endpoint, as shown in the following example:

GET /_cluster/health/library

Here, GET is the verb, _cluster/health is the endpoint, and library is the index. The call returns information about nodes, data nodes, shards, pending tasks, and the status of the specified index, or of the whole cluster if no index is given:

The Response pane shows the status of the Elasticsearch cluster as green, along with other values such as node counts, shards, and pending tasks.
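For reference, the JSON behind such a response has roughly the following shape (the values here are illustrative and will differ on your cluster):

{
  "cluster_name": "elasticsearch",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 5,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}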

Let's have a look at what other values of the Elasticsearch status denote:

  • Red: One or more primary shards are not allocated

  • Yellow: All primary shards are allocated, but one or more replica shards are not

  • Green: All primary and replica shards are allocated and the cluster is fully operational
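These statuses are also handy for readiness checks: the health endpoint accepts wait_for_status and timeout parameters, so a request blocks until the desired status is reached or the timeout expires. For example:

GET /_cluster/health?wait_for_status=yellow&timeout=30s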

Tip

If replicas...

The cat APIs


These APIs print information about nodes, indices, fields, tasks, and plugins in a human-readable, tabular format rather than JSON, which makes the output easy to read directly in the Console.

All of these commands are used with the GET verb (via curl or the Console). By default, the commands list only the data, without headers. To print headers, we can add v to the query parameters:

GET /_cat/health?v

The preceding command prints the column headers, unlike the plain form:

GET /_cat/health

We can also specify which headers to show by supplying the comma-separated values for the h query parameter.
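For instance, to print just the cluster name, status, and total node count from the health endpoint (a hedged example; the column choice here is arbitrary):

GET /_cat/health?v&h=cluster,status,node.total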

Let's see the endpoints available to operate on:

  • _cat/indices: This shows data about indices such as health, status, index name, primary and replica shard counts, document count, and store size:

            GET /_cat/indices?v&h=health,status,index,docs.count,store.size
    

Here is what it looks like in the Console:

As we can see, it shows health and stats for each index.

  • _cat/master: This shows the node ID, IP address, and node name of the...

Elasticsearch modules


Every great project has a number of modules to support what it offers, and Elasticsearch is no exception. These modules are configured through settings that are either static, set in elasticsearch.yml, or dynamic, updated at runtime using the cluster API. Let's look at the different modules.

Cluster module

The cluster module decides how shards are allocated to nodes and takes care of moving shards around to keep the cluster balanced. This process is known as shard allocation. The module has a number of settings, which can be applied dynamically using the cluster API; they control shard allocation among the nodes in a cluster as well as within a node.
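As a brief illustration (the value shown is only an example), such a setting can be applied at runtime through the cluster settings API; a transient setting lasts until a full cluster restart, while a persistent one survives restarts:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}

Setting the value back to "all" re-enables shard allocation.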

Discovery module

The discovery module discovers the other nodes on the network that belong to a specified cluster. In the elasticsearch.yml configuration file, the cluster name setting decides which cluster a node will be part of. The default name is elasticsearch:

cluster.name ...
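A minimal elasticsearch.yml sketch of the discovery-related settings might look like the following (the cluster name and hosts are placeholders, not values from this book):

cluster.name: my-cluster
discovery.zen.ping.unicast.hosts: ["node1.example.com", "node2.example.com"]
discovery.zen.minimum_master_nodes: 2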

Ingest nodes


As we learned in the previous section, ingest nodes preprocess documents before they are indexed. For bulk requests and index operations, the ingest node intercepts the request and applies the required processing to each document. One example of such a processor is the date processor, which parses dates in fields. Another is the convert processor, which converts a field value to a target type, for example, a string to an integer. The full list of available processors is documented at https://www.elastic.co/guide/en/elasticsearch/reference/5.1/ingest-processors.html.
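As a hedged sketch (the pipeline name and field names below are invented for illustration), a pipeline combining these two processors can be registered through the ingest API as follows:

PUT /_ingest/pipeline/parse-and-convert
{
  "description": "Parse a date field and convert a string field to an integer",
  "processors": [
    {
      "date": {
        "field": "release_date",
        "formats": ["yyyy-MM-dd"]
      }
    },
    {
      "convert": {
        "field": "rating",
        "type": "integer"
      }
    }
  ]
}

Documents indexed with ?pipeline=parse-and-convert on the request will then pass through these processors before being stored.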

These nodes are helpful when heavy preprocessing is needed and we do not want data or master nodes to spend resources on it. Dedicated ingest nodes can reduce that load significantly, and it is best to set node.ingest to false on data and master nodes.
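For example, a dedicated ingest node can be declared in its elasticsearch.yml roughly as follows (a sketch; adapt the roles to your topology):

node.master: false
node.data: false
node.ingest: true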

To understand how ingest nodes work with pipelines, let's follow these steps. Let's take the example of our library index and type movies added...

Elasticsearch clients


Earlier in this chapter, we learned that Elasticsearch nodes support both the transport and HTTP protocols. Thanks to this flexibility, Elasticsearch can be managed by clients written in other programming languages. There are a number of clients for performing operations on clusters and nodes; they can connect to a node or cluster to manage indices, operate on documents, and run searches.

Supported clients

A few clients are officially supported by Elastic. These clients are essentially APIs that you can use from your own applications written in the respective programming languages. For example, if you are developing a Java web application that integrates with Elasticsearch and you want to offer index management through an admin panel, you can use the Java API to connect to the cluster and nodes and perform the necessary operations. The following is a list of all the officially supported clients:

  • Java API

  • JavaScript...

Java API


The Java client uses the transport layer and supports all kinds of operations: we can run searches, index documents, and delete or get documents, as well as perform administrative tasks on the cluster. We can also perform operations in bulk.

To use the Java API in our application, we need a few JAR files as dependencies. For a Maven project, we can add the dependency to our pom.xml as follows:

<dependency> 
        <groupId>org.elasticsearch</groupId> 
        <artifactId>elasticsearch</artifactId> 
        <version>${elasticsearch.version}</version> 
</dependency> 

To include the JAR files directly in the project, we can also download them from the repository at https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch, selecting the version we want for our application.

One thing to note here is that the client version should be the same as the version of Elasticsearch being used. For example, if...
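To connect the Maven setup above to working code, here is a minimal, hedged sketch of the transport client in action with Elasticsearch 5.x; note that the client classes ship in the separate org.elasticsearch.client:transport artifact, and the cluster name, index, type, and document ID below are placeholders:

import java.net.InetAddress;

import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class EsJavaApiSketch {
    public static void main(String[] args) throws Exception {
        // The cluster name must match cluster.name of the target cluster
        Settings settings = Settings.builder()
                .put("cluster.name", "elasticsearch")
                .build();

        // The transport client talks to port 9300 (transport), not 9200 (HTTP)
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("localhost"), 9300));

        // Fetch a single document; index, type, and id are placeholders
        GetResponse response = client.prepareGet("library", "books", "1").get();
        System.out.println(response.getSourceAsString());

        client.close();
    }
}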

Elasticsearch plugins


As we learned in the Extending Elasticsearch section of Chapter 7, Customizing Elastic Stack, earlier versions of Elasticsearch (before 5.x) offered a number of plugins divided into three types: Java, Site, and Mixed. Site and Mixed plugins are now deprecated, and only Java plugins are supported. These plugins contain only JAR files and must be installed on every node.

Elastic.co categorizes plugins as core plugins, which are developed and maintained officially, and community plugins, which are developed and maintained by the community. To use a plugin, we install it into Elasticsearch with the elasticsearch-plugin utility. Core plugins are released alongside Elasticsearch and share its version number.

In this section, we will get familiar with a few of the interesting plugins. Core plugins can be installed just by using the name...
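For instance, a core plugin such as analysis-icu is installed by passing its name to the utility from the Elasticsearch home directory:

bin/elasticsearch-plugin install analysis-icu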

Summary


This chapter concludes our coverage of the Elasticsearch APIs. It is a vast topic and not all of the APIs could be covered, but we now have the gist of how these APIs work and how they help us manage the Elasticsearch cluster, nodes, and indices, or even search for documents. When working with Kibana, the same tasks can be performed using the Console. Many REST-based clients, for numerous languages and platforms, have been built for Elasticsearch on top of the HTTP protocol, and we have been using that approach since Chapter 2, Stepping into Elasticsearch. This chapter also covered the other side of the story: using the Transport Client with the help of the Java API.

The next chapter focuses on customizing the Elastic Stack using plugins. Plugins give us a good amount of control over functionality and the liberty to implement what is missing or adapt what is present to our needs. We will learn how to create new plugins and customize the stack.
