You're reading from  ElasticSearch Cookbook

Product type: Book
Published in: Dec 2013
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781782166627
Edition: 1st Edition
Author: Alberto Paro

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, Natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.

Chapter 10. Java Integration

In this chapter, we will cover the following topics:

  • Creating an HTTP client

  • Creating a native client

  • Managing indices with the native client

  • Managing mappings

  • Managing documents

  • Managing bulk action

  • Creating a query

  • Executing a standard search

  • Executing a facet search

  • Executing a scroll/scan search

Introduction


ElasticSearch functionality can be integrated into any Java application in several ways, both via the REST API and via the native APIs.

With Java, it's easy to call a REST HTTP interface with one of the many libraries available, such as the Apache HttpComponents Client (http://hc.apache.org/). No single library dominates this area; developers typically choose the one that best suits their taste or that they know very well.

Applications written in any JVM language can also use the Native protocol to integrate ElasticSearch. The Native protocol, discussed in Chapter 1, Getting Started, is one of the fastest protocols available to communicate with ElasticSearch due to many factors, such as its binary nature, the fast native serializer/deserializer of the data, the asynchronous approach to communication, and the hop reduction (the native client is able to communicate directly with the node that contains the data without executing the double hop needed in REST calls...

Creating an HTTP client


An HTTP client is one of the easiest clients to create. It's very handy because it allows calling not only the internal methods, as the Native protocol does, but also the third-party calls implemented in plugins, which can be called only via HTTP.

Getting ready

You need a working ElasticSearch cluster and Maven installed. The code of this recipe is in the chapter_10/http_client directory present in the code bundle available on Packt's website.

How to do it...

For creating an HTTP client, we will perform the steps given as follows:

  1. For these examples, we have chosen Apache HttpComponents, one of the most widely used libraries for executing HTTP calls. This library is available in the main Maven repository (search.maven.org). To enable compilation, add the following dependency to your project's Maven pom.xml file:

    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.3</version>
    </dependency>
  2. If...
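
As a rough illustration of what calling the REST interface from Java involves, the following sketch builds and sends a GET request. Note that it swaps in the JDK's built-in java.net.http client (Java 11 or later) instead of the Apache HttpComponents library used by this recipe, so that it runs without extra dependencies. The class name RestGetSketch, the base URL http://localhost:9200, and the /_cluster/health path are assumptions chosen for the example; adjust them to your own cluster.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; not part of the book's code bundle.
final class RestGetSketch {

    // Builds a GET request for a path on the given base URL. Nothing is sent here.
    static HttpRequest buildGet(String baseUrl, String path) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + path))
                .timeout(Duration.ofSeconds(5))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns "<status code> <body>".
    static String fetch(HttpClient client, HttpRequest request) throws Exception {
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a node listens on localhost:9200.
        HttpRequest request = buildGet("http://localhost:9200", "/_cluster/health");
        System.out.println(request.method() + " " + request.uri());
        // With a running cluster, the call could be executed as follows:
        // System.out.println(fetch(HttpClient.newHttpClient(), request));
    }
}
```

Keeping request construction separate from sending makes the request easy to inspect or log before any network traffic takes place.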

Creating a native client


To create a native client to communicate with an ElasticSearch server, there are two ways:

  • Creating an embedded node (a node that doesn't contain data but works as an arbiter) and getting the client from it. This node appears in the cluster state nodes and can use the discovery capabilities of ElasticSearch to join the cluster, so no node address is required to connect. Because it knows the cluster topology, this client is able to reduce node routing.

  • Creating a transport client, which is a standard client that requires the address and port of nodes to connect.

In this recipe, we will see how to create these clients.
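
Because a transport client needs explicit node addresses, applications often read them from a configuration string. The following is a minimal sketch of that preparatory step, using only JDK classes and no ElasticSearch classes. The class name NodeAddressSketch, the comma-separated host:port format, and the fallback to port 9300 are assumptions made for the example; IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the book's code bundle.
final class NodeAddressSketch {

    // Assumption: port used when an entry has no ":port" suffix.
    static final int DEFAULT_PORT = 9300;

    // Parses "host1:9300,host2" into unresolved socket addresses (no DNS lookup happens here).
    static List<InetSocketAddress> parse(String csv) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            result.add(InetSocketAddress.createUnresolved(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical node list.
        for (InetSocketAddress address : parse("127.0.0.1:9300, node2.example.org")) {
            System.out.println(address.getHostString() + ":" + address.getPort());
        }
    }
}
```

Each resulting host and port pair can then be handed to whichever client is being configured.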

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code of this recipe is in chapter_10/nativeclient in the code bundle of this book provided on Packt's website.

How to do it...

To create a native client, we will perform the steps given as follows:

  1. Before starting, we must be sure that Maven loads...

Managing indices with the native client


In the previous recipe, we saw how to initialize a client to send calls to an ElasticSearch cluster. In this recipe, we will see how to manage indices via client calls.

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code of this recipe is in chapter_10/nativeclient in the code bundle, which can be downloaded from Packt's website, and the referred class is IndicesOperations.

How to do it...

The ElasticSearch client maps all index operations under the admin.indices object of the client. All the index operations are available here (create, delete, exists, open, close, optimize, and so on). In the following example, we will see only the most commonly used calls on indices.

The following code retrieves a client and executes the main operation on indices:

import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse;
import org.elasticsearch.client.Client;

public class IndicesOperations {
  private final Client...
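
For orientation, the index operations listed above also have counterparts in the REST interface. The following sketch is not the native client code of this recipe; it only tabulates, in plain Java, the HTTP method and path commonly used for each operation. The class and enum names are hypothetical, and mytest is just an example index name.

```java
// Hypothetical helper class; not part of the book's code bundle.
final class IndexRestCallsSketch {

    // REST counterpart of each index operation: HTTP method plus path suffix.
    enum IndexOperation {
        CREATE("PUT", ""),
        DELETE("DELETE", ""),
        EXISTS("HEAD", ""),
        OPEN("POST", "/_open"),
        CLOSE("POST", "/_close");

        final String method;
        final String suffix;

        IndexOperation(String method, String suffix) {
            this.method = method;
            this.suffix = suffix;
        }

        // Returns a line such as "POST /mytest/_close" for the given index name.
        String requestLine(String index) {
            return method + " /" + index + suffix;
        }
    }

    public static void main(String[] args) {
        for (IndexOperation operation : IndexOperation.values()) {
            System.out.println(operation + ": " + operation.requestLine("mytest"));
        }
    }
}
```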

Managing mappings


After creating an index, the next step is to add some mappings to it. We have already seen how to put a mapping via the REST API in Chapter 4, Standard Operations. In this recipe, we will see how to manage mappings via the native client.

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code of this recipe is in chapter_10/nativeclient in the code bundle of this book, available on Packt's website, and the referred class is MappingsOperations.

How to do it...

In the following code, we add a mytype mapping to the mytest index via the native client:

import org.elasticsearch.action.admin.indices.mapping.put.PutMappingResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.xcontent.XContentBuilder;

import java.io.IOException;

import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

public class MappingOperations {

  public static void main( String[] args )
  {
    String index="mytest";
    String type="mytype";
  ...
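
To make the shape of a mapping document more concrete, the following sketch assembles the JSON for a type mapping by hand. It deliberately uses plain string building instead of the XContentBuilder imported in the recipe code, so it runs without ElasticSearch on the classpath. The class name MappingJsonSketch and the name and price fields are hypothetical and are not taken from the book's example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; not part of the book's code bundle.
final class MappingJsonSketch {

    // Builds {"<type>":{"properties":{"<field>":{"type":"<fieldType>"},...}}}.
    static String mappingJson(String type, Map<String, String> fieldTypes) {
        StringBuilder json = new StringBuilder();
        json.append("{\"").append(escape(type)).append("\":{\"properties\":{");
        boolean first = true;
        for (Map.Entry<String, String> field : fieldTypes.entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            json.append('"').append(escape(field.getKey()))
                .append("\":{\"type\":\"").append(escape(field.getValue())).append("\"}");
        }
        return json.append("}}}").toString();
    }

    // Escapes backslashes and double quotes; enough for simple identifiers.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("name", "string");   // hypothetical field
        fields.put("price", "integer"); // hypothetical field
        System.out.println(mappingJson("mytype", fields));
    }
}
```

A builder such as XContentBuilder produces the same kind of nested structure while taking care of escaping and nesting for you.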

Managing documents


The native APIs for managing documents (index, delete, and update) are the most important ones after the search APIs. In this recipe, we will see how to use them. In the next recipe, we will move on to executing bulk actions to improve performance.

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code of this recipe is in chapter_10/nativeclient in the code bundle of this book available on Packt's website, and the referred class is DocumentOperations.

How to do it...

For managing documents, we will perform the steps given as follows:

  1. We'll execute all the document CRUD operations (create, read, update, and delete) via the native client:

    import org.elasticsearch.action.delete.DeleteResponse;
    import org.elasticsearch.action.get.GetResponse;
    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.action.update.UpdateResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.xcontent.XContentFactory;
    
    import java.io...

Managing bulk action


Executing atomic operations on items one call at a time is often a bottleneck if you need to index or delete thousands or millions of records. The best practice in this case is to execute a bulk action. We discussed bulk actions via the REST API in the Speeding up atomic operations (bulk) recipe in Chapter 4, Standard Operations.

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code of this recipe is in chapter_10/nativeclient in the code bundle of this book available on Packt's website and the referred class is BulkOperations.

How to do it...

For managing a bulk action, we will perform the steps given as follows:

  1. We'll execute a bulk action adding 1000 elements, updating them and deleting them:

    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.xcontent.XContentFactory;
    
    import java.io.IOException;
    
    public class BulkOperations {
      public static void main( String[] args...
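
To show why a bulk action saves round trips, the following sketch builds the body of a single REST bulk request that indexes 1000 documents: each action is one metadata line followed by one source line, and every line ends with a newline. This is the REST bulk format rather than the native BulkRequestBuilder used in the recipe. The class name BulkBodySketch and the position field are hypothetical, and the index, type, and ID values are assumed to contain no characters that need JSON escaping.

```java
// Hypothetical helper class; not part of the book's code bundle.
final class BulkBodySketch {

    // Builds the metadata line for an action, for example {"index":{"_index":...}}.
    static String meta(String action, String index, String type, String id) {
        return "{\"" + action + "\":{\"_index\":\"" + index
                + "\",\"_type\":\"" + type + "\",\"_id\":\"" + id + "\"}}\n";
    }

    // An index action is a metadata line followed by the document source line.
    static String indexAction(String index, String type, String id, String sourceJson) {
        return meta("index", index, type, id) + sourceJson + "\n";
    }

    // A delete action consists of the metadata line only.
    static String deleteAction(String index, String type, String id) {
        return meta("delete", index, type, id);
    }

    // Concatenates `count` index actions into one request body.
    static String bulkIndexBody(String index, String type, int count) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < count; i++) {
            // Hypothetical document with a single numeric field.
            body.append(indexAction(index, type, Integer.toString(i), "{\"position\":" + i + "}"));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = bulkIndexBody("mytest", "mytype", 1000);
        System.out.println(body.split("\n").length + " lines in one request body");
    }
}
```

All 1000 actions travel in one request, instead of 1000 separate calls.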

Creating a query


Before executing a search, a query must be built. ElasticSearch provides several ways to build these queries. In this recipe, we will see how to create a query object via QueryBuilder and via simple strings.

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven. The code of this recipe is in chapter_10/nativeclient in the code bundle of this book available on Packt's website and the referred class is QueryCreation.

How to do it...

For creating a query, we will perform the steps given as follows:

  1. There are several ways to define a query in ElasticSearch; they are interoperable.

    Generally a query can be defined as a:

    • QueryBuilder: This is a helper to build a query.

    • XContentBuilder: This is a helper to create JSON code. We discussed it in the Managing mappings recipe in this chapter. The generated JSON is similar to that of the earlier REST examples, but it is built programmatically.

    • Array of bytes or string: In this case, it's usually the JSON to be executed as we have...
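
As an illustration of the string form, the following sketch composes the JSON for a bool query containing two term clauses using plain Java strings. It does not use the QueryBuilder classes, so it runs without ElasticSearch on the classpath. The class name QueryStringSketch and the name and tag fields are hypothetical, and field names and values are assumed to need no JSON escaping.

```java
import java.util.List;

// Hypothetical helper class; not part of the book's code bundle.
final class QueryStringSketch {

    // Builds {"term":{"<field>":"<value>"}}.
    static String term(String field, String value) {
        return "{\"term\":{\"" + field + "\":\"" + value + "\"}}";
    }

    // Builds {"bool":{"must":[<clause>,<clause>,...]}}.
    static String boolMust(List<String> clauses) {
        return "{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}";
    }

    // Wraps a query in the search request body: {"query":<query>}.
    static String searchBody(String query) {
        return "{\"query\":" + query + "}";
    }

    public static void main(String[] args) {
        // Hypothetical fields and values.
        String query = boolMust(List.of(term("name", "test"), term("tag", "a")));
        System.out.println(searchBody(query));
    }
}
```

A builder-based approach expresses the same nesting through method calls, which avoids quoting mistakes that are easy to make when concatenating strings.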
