ElasticSearch Cookbook - Second Edition

Over 130 advanced recipes to search, analyze, deploy, manage, and monitor data effectively with ElasticSearch

Alberto Paro

Book Details

ISBN 13: 9781783554836
Paperback: 472 pages

Book Description

This book will guide you through the complete ElasticSearch ecosystem. From choosing the correct transport layer and communicating with the server to creating and customizing internal actions, you will develop an in-depth knowledge of the implementation of the ElasticSearch architecture.

After working through complex queries and analytics, mapping, aggregation, and scripting, you will master the integration of ElasticSearch's functionality in user-facing applications and take your knowledge one step further by building custom plugins, developing tailored mappings, executing powerful analytics, and integrating with Python and Java applications.

 

 

Read an Extract from the book

Communicating with ElasticSearch

You can communicate with your ElasticSearch server using several protocols. In this recipe, we will take a look at the main ones.

Getting ready

You will need a working ElasticSearch cluster.

How it works...

ElasticSearch is designed to be used as a RESTful server, so the main protocol is HTTP, usually on port 9200 or the first free port above it. It also supports other protocols, such as the native and Thrift ones.

Many other protocols, such as memcached, couchbase, and websocket, are available as extension plugins, but they are seldom used. (To find out more about the transport layer, search for "Elasticsearch transport" on GitHub.)

Every protocol has advantages and disadvantages. It's important to choose the correct one for the kind of application you are developing. If you are in doubt, choose the HTTP protocol layer, which is the standard protocol and is easy to use.
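As a rough illustration of how little is needed to talk to the server over HTTP, the following sketch builds a search request using only the Python standard library. The host, the index name my_index, and the helper name build_search_request are assumptions made for this example and are not taken from the book; the request is only built, not sent, so the code runs without a live cluster.

```python
import json
from urllib.request import Request


def build_search_request(host="localhost", port=9200, index="my_index", query=None):
    """Build (but do not send) an HTTP search request for an ElasticSearch node.

    host, port, and index are illustrative defaults, not values from the book.
    """
    if query is None:
        # match_all returns every document in the index
        query = {"query": {"match_all": {}}}
    url = "http://{}:{}/{}/_search".format(host, port, index)
    body = json.dumps(query).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request()
    print(request.get_method(), request.full_url)
    # To send it to a running node, something like the following should work:
    #   from urllib.request import urlopen
    #   with urlopen(request) as response:
    #       print(json.load(response))
```

Because the payload is plain JSON over HTTP, the same request could be issued from any language or from a command-line HTTP tool.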

Choosing the right protocol depends on several factors, mainly architectural and performance related. The following table summarizes the advantages and disadvantages of each protocol. If you are using one of the official ElasticSearch clients, switching from one protocol to another is generally a simple setting in the client initialization.

Protocol: HTTP

  • Advantages: Frequently used; API is safe and has general compatibility for different versions of ES, although JSON is suggested
  • Disadvantages: HTTP overhead
  • Type: Text

Protocol: Native

  • Advantages: Fast network layer; programmatic; best for massive indexing operations
  • Disadvantages: If the API changes, it can break the applications; requires the same version of the ES server; only on JVM
  • Type: Binary

Protocol: Thrift

  • Advantages: Similar to HTTP
  • Disadvantages: Related to the Thrift plugin
  • Type: Binary
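
The trade-offs in the table above can be read as a simple decision rule. The sketch below is one possible reading, not a rule stated in the book: the function name choose_protocol and its parameters are hypothetical, and the logic only encodes the constraints listed for the native protocol, falling back to HTTP when in doubt.

```python
def choose_protocol(on_jvm=False, same_version_as_server=False, massive_indexing=False):
    """Pick a protocol name using the constraints from the comparison table.

    This is an illustrative heuristic (an assumption), not an official rule.
    """
    # The native protocol runs only on the JVM and requires the client to
    # match the server version; the table recommends it for massive indexing.
    if on_jvm and same_version_as_server and massive_indexing:
        return "native"
    # Thrift is not selected automatically because it depends on a plugin
    # being installed; HTTP is the standard, easy-to-use default.
    return "http"


if __name__ == "__main__":
    print(choose_protocol())
    print(choose_protocol(on_jvm=True, same_version_as_server=True, massive_indexing=True))
```

A real decision would also weigh architectural factors the table does not capture, so treat this only as a starting point.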

 

 

Table of Contents

Chapter 1: Getting Started
Introduction
Understanding nodes and clusters
Understanding node services
Managing your data
Understanding clusters, replication, and sharding
Communicating with ElasticSearch
Using the HTTP protocol
Using the native protocol
Using the Thrift protocol
Chapter 2: Downloading and Setting Up
Introduction
Downloading and installing ElasticSearch
Setting up networking
Setting up a node
Setting up for Linux systems
Setting up different node types
Installing plugins in ElasticSearch
Installing a plugin manually
Removing a plugin
Changing logging settings
Chapter 3: Managing Mapping
Introduction
Using explicit mapping creation
Mapping base types
Mapping arrays
Mapping an object
Mapping a document
Using dynamic templates in document mapping
Managing nested objects
Managing a child document
Adding a field with multiple mappings
Mapping a geo point field
Mapping a geo shape field
Mapping an IP field
Mapping an attachment field
Adding metadata to a mapping
Specifying a different analyzer
Mapping a completion suggester
Chapter 4: Basic Operations
Introduction
Creating an index
Deleting an index
Opening/closing an index
Putting a mapping in an index
Getting a mapping
Deleting a mapping
Refreshing an index
Flushing an index
Optimizing an index
Checking if an index or type exists
Managing index settings
Using index aliases
Indexing a document
Getting a document
Deleting a document
Updating a document
Speeding up atomic operations (bulk operations)
Speeding up GET operations (multi GET)
Chapter 5: Search, Queries, and Filters
Introduction
Executing a search
Sorting results
Highlighting results
Executing a scan query
Suggesting a correct query
Counting matched results
Deleting by query
Matching all the documents
Querying/filtering for a single term
Querying/filtering for multiple terms
Using a prefix query/filter
Using a Boolean query/filter
Using a range query/filter
Using span queries
Using a match query
Using an ID query/filter
Using a has_child query/filter
Using a top_children query
Using a has_parent query/filter
Using a regexp query/filter
Using a function score query
Using exists and missing filters
Using and/or/not filters
Using a geo bounding box filter
Using a geo polygon filter
Using geo distance filter
Using a QueryString query
Using a template query
Chapter 6: Aggregations
Introduction
Executing an aggregation
Executing the stats aggregation
Executing the terms aggregation
Executing the range aggregation
Executing the histogram aggregation
Executing the date histogram aggregation
Executing the filter aggregation
Executing the global aggregation
Executing the geo distance aggregation
Executing nested aggregation
Executing the top hit aggregation
Chapter 7: Scripting
Introduction
Installing additional script plugins
Managing scripts
Sorting data using script
Computing return fields with scripting
Filtering a search via scripting
Updating a document using scripts
Chapter 8: Rivers
Introduction
Managing a river
Using the CouchDB river
Using the MongoDB river
Using the RabbitMQ river
Using the JDBC river
Using the Twitter river
Chapter 9: Cluster and Node Monitoring
Introduction
Controlling cluster health via the API
Controlling cluster state via the API
Getting cluster node information via the API
Getting node statistics via the API
Managing repositories
Executing a snapshot
Restoring a snapshot
Installing and using BigDesk
Installing and using ElasticSearch Head
Installing and using SemaText SPM
Installing and using Marvel
Chapter 10: Java Integration
Introduction
Creating an HTTP client
Creating a native client
Managing indices with the native client
Managing mappings
Managing documents
Managing bulk actions
Building a query
Executing a standard search
Executing a search with aggregations
Executing a scroll/scan search
Chapter 11: Python Integration
Introduction
Creating a client
Managing indices
Managing mappings
Managing documents
Executing a standard search
Executing a search with aggregations
Chapter 12: Plugin Development
Introduction
Creating a site plugin
Creating a native plugin
Creating a REST plugin
Creating a cluster action
Creating an analyzer plugin
Creating a river plugin

What You Will Learn

  • Make ElasticSearch work for you by choosing the best cloud topology and powering it with plugins
  • Develop tailored mapping to take full control of index steps
  • Build complex queries through managing indices and documents
  • Optimize search results through executing analytics aggregations
  • Manage rivers (SQL, NoSQL, and web-based) to synchronize and populate cross-source data
  • Develop web interfaces to execute key tasks
  • Monitor the performance of the cluster and nodes
