ElasticSearch Cookbook - Second Edition

More Information
Learn
  • Make ElasticSearch work for you by choosing the best cloud topology and powering it with plugins
  • Develop tailored mappings to take full control of the indexing steps
  • Build complex queries by managing indices and documents
  • Optimize search results by executing analytics aggregations
  • Manage rivers (SQL, NoSQL, and web-based) to synchronize and populate cross-source data
  • Develop web interfaces to execute key tasks
  • Monitor the performance of the cluster and nodes
About

This book will guide you through the complete ElasticSearch ecosystem. From choosing the correct transport layer and communicating with the server to creating and customizing internal actions, you will develop an in-depth knowledge of the implementation of the ElasticSearch architecture.

After creating complex queries and analytics, mapping, aggregation, and scripting, you will master the integration of ElasticSearch's functionality in user-facing applications and take your knowledge one step further by building custom plugins, developing tailored mapping, executing powerful analytics, and integrating with Python and Java applications.


Read an Extract from the book

Communicating with ElasticSearch

You can communicate with your ElasticSearch server using several protocols. In this recipe,
we will take a look at the main ones.

Getting ready

You will need a working ElasticSearch cluster.

How it works...

ElasticSearch is designed to be used as a RESTful server, so the main protocol is HTTP, usually on port 9200 and above. In addition, it allows the use of other protocols, such as the native and Thrift ones.

Many others, such as memcached, couchbase, and websocket, are available as extension plugins, but they are seldom used. (If you need to find out more about the transport layer, simply search for "Elasticsearch transport" on GitHub.)

Every protocol has advantages and disadvantages, so it's important to choose the correct one for the kind of application you are developing. If you are in doubt, choose the HTTP protocol layer, which is the standard and the easiest to use.
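
To see the HTTP layer at work, here is a minimal sketch in Python that queries the root REST endpoint of a node; it assumes a node running locally on the default port 9200 and uses only the standard library (the field read from the response is illustrative):

    # Query the root REST endpoint of a local ElasticSearch node over plain HTTP.
    # Assumes the default address (localhost:9200); adjust the URL if your node differs.
    import json
    import urllib.request

    with urllib.request.urlopen("http://localhost:9200/") as response:
        info = json.loads(response.read().decode("utf-8"))

    # The root endpoint returns basic node and version information as JSON.
    print(info["version"]["number"])

The same request can be issued with curl or any HTTP client; nothing beyond a JSON-capable HTTP library is required on the application side.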

Choosing the right protocol depends on several factors, mainly architectural and performance related. The following table summarizes the advantages and disadvantages of each. If you are using one of the official ElasticSearch clients, switching from one protocol to another is generally a simple setting in the client initialization, as shown in the sketch after the table.

Protocol: HTTP (Type: Text)
  • Advantages: frequently used; the API is safe and generally compatible across different versions of ES (JSON is the suggested format)
  • Disadvantages: HTTP overhead

Protocol: Native (Type: Binary)
  • Advantages: fast network layer; programmatic; best for massive indexing operations
  • Disadvantages: can break applications if the API changes; requires the same version as the ES server; available only on the JVM

Protocol: Thrift (Type: Binary)
  • Advantages: similar to HTTP
  • Disadvantages: requires the Thrift plugin
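
As an illustration of how the transport choice stays confined to client initialization, here is a sketch with the official Python client (elasticsearch-py), which speaks HTTP by default; the host, port, and the Thrift variant in the comment are assumptions (a Thrift connection class was only shipped as an optional extra in older client releases and needs the server-side Thrift plugin):

    # Initialize the official Python client; by default it communicates over HTTP.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["localhost:9200"])      # HTTP transport (default)
    print(es.info()["version"]["number"])       # same data as GET / over raw HTTP

    # Switching transports is a constructor setting (connection_class). For example,
    # older elasticsearch-py releases offered an optional Thrift connection class
    # (assumption: it requires the thrift extra on the client and the transport-thrift
    # plugin on the server, which listens on port 9500 by default):
    #
    #     from elasticsearch import ThriftConnection
    #     es = Elasticsearch(["localhost:9500"], connection_class=ThriftConnection)

The application-level calls (index, search, and so on) do not change when the transport does, which is what makes the switch a configuration detail rather than a rewrite.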

Features
  • Deploy and manage simple ElasticSearch nodes as well as complex cluster topologies
  • Write native plugins to extend the functionalities of ElasticSearch to boost your business
  • Packed with clear, step-by-step recipes to walk you through the capabilities of ElasticSearch
Page Count 472
Course Length 14 hours 9 minutes
ISBN 9781783554836
Date Of Publication 28 Jan 2015

Authors

Alberto Paro

Alberto Paro is an engineer, project manager, and software developer. He currently works as the Big Data Practice Leader at NTTDATA in Italy on big data technologies, cloud-native solutions, and NoSQL solutions. He loves to study emerging solutions and applications mainly related to cloud and big data processing, NoSQL, NLP, and neural networks. In 2000, he graduated in computer science engineering from Politecnico di Milano. Since then, he has worked with many companies, mainly using Scala/Java and Python, on knowledge management solutions and advanced data mining products built on state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL datastores, and related technologies.