Chapter 9. Scripting

In this chapter, we will cover the following recipes:

  • Painless scripting

  • Installing additional script plugins

  • Managing scripts

  • Sorting data using scripts

  • Computing return fields with scripting

  • Filtering a search via scripting

  • Using scripting in aggregations

  • Updating a document using scripts

  • Reindexing with a script

Introduction


Elasticsearch has a powerful way to extend its capabilities with custom scripts that can be written in several programming languages. The most common ones are Painless, Groovy, JavaScript, and Python.

In this chapter, we will see how it's possible to create custom scoring algorithms, special processed return fields, custom sorting, and complex update operations on records.

The scripting concept of Elasticsearch can be seen as an advanced stored-procedure system in the NoSQL world; so, for advanced use of Elasticsearch, it is very important to master it.

Elasticsearch natively provides scripting in Java (native Java code compiled into a JAR), Painless, Groovy, Expression, and Mustache; but a lot of interesting languages are available as plugins, such as JavaScript and Python.

In older Elasticsearch releases, prior to version 5.0, the official scripting language was Groovy, but for better sandboxing and performance, the official language is now Painless, which is provided by default in Elasticsearch...

Painless scripting


Painless is a simple, secure scripting language available in Elasticsearch by default. It was designed by the Elasticsearch team specifically for use with Elasticsearch and can safely be used for inline and stored scripting. Its syntax is similar to Groovy.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line you need to install curl on your operating system.

To be able to use regular expressions in Painless scripting, you need to activate them in your elasticsearch.yml by adding the following line:

    script.painless.regex.enabled: true

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code.

How to do it...

We'll use Painless scripting to compute the scoring with a script. The script code requires us to correctly escape special...
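
To give an idea of the request shape, the following is a minimal sketch of a function_score query whose score is computed by an inline Painless script; the price field comes from the chapter's sample index, while the factor parameter and the scoring formula itself are illustrative assumptions:

    # sketch: score documents by log(price * factor); "factor" is an assumed parameter
    curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
      "query": {
        "function_score": {
          "query": { "match_all": {} },
          "script_score": {
            "script": {
              "lang": "painless",
              "inline": "Math.log(doc[\"price\"].value * params.factor + 1)",
              "params": { "factor": 2 }
            }
          }
        }
      }
    }'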

Installing additional script plugins


Elasticsearch provides native scripting (Java code compiled into a JAR) and Painless, but a lot of interesting languages are available, such as JavaScript and Python.

As previously stated, the official language is now Painless, and this is provided by default in Elasticsearch for better sandboxing and performance.

Note

Other scripting languages can be installed as plugins, but they are now deprecated. We will present them in this recipe as they still have a large user base.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

How to do it...

To install JavaScript language support for Elasticsearch, we will perform the following steps:

  1. From the command line, simply call the following command:

            bin/elasticsearch-plugin install lang-javascript
    
  2. It will print the following output:

                -> Downloading lang-javascript from elastic
         ...
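
The Python plugin mentioned earlier is installed the same way; the plugin name below is the one used by Elasticsearch 5.x:

    # installs the (deprecated) Python scripting language plugin
    bin/elasticsearch-plugin install lang-python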

Managing scripts


Depending on your scripting usage, there are several ways to customize Elasticsearch to use your script extensions.

In this recipe, we will see how to provide scripts to Elasticsearch via files, stored in an index, or inline.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code.

How to do it...

To manage scripting, we will perform the following steps:

  1. Dynamic scripting (except Painless) is disabled by default for security reasons. We need to activate it to use dynamic scripting languages such as JavaScript and Python. To do this, we need to enable scripting flags in the Elasticsearch config file (config/elasticsearch.yml) and restart...
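
As a sketch of what those flags look like in Elasticsearch 5.x (the exact set you enable depends on your needs; fine-grained per-language and per-context settings also exist):

    # config/elasticsearch.yml -- coarse-grained script settings (ES 5.x)
    script.inline: true    # allow scripts embedded in requests
    script.stored: true    # allow scripts stored in the cluster state
    script.file: true      # allow scripts loaded from the config/scripts directory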

Sorting data using scripts


Elasticsearch provides scripting support for sorting functionality. In real-world applications, there is often a need to modify the default sort by match score using an algorithm that depends on the context and some external variables. Some common scenarios are as follows:

  • Sorting places near a point

  • Sorting by most read articles

  • Sorting items by custom user logic

  • Sorting items by revenue

Tip

Because computing scores on a large dataset is very CPU intensive, if you use scripting it's better to first run a standard scored query to detect the top documents, and then execute the rescoring script only on that smaller subset.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the...
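
A script-based sort uses the _script sort clause. The following is a minimal sketch in Painless; the price field comes from the chapter's sample index, while the factor parameter and its value are illustrative assumptions:

    # sketch: sort by price scaled by an assumed "factor" parameter
    curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
      "query": { "match_all": {} },
      "sort": {
        "_script": {
          "type": "number",
          "order": "asc",
          "script": {
            "lang": "painless",
            "inline": "doc[\"price\"].value * params.factor",
            "params": { "factor": 1.1 }
          }
        }
      }
    }'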

Computing return fields with scripting


Elasticsearch allows us to define complex expressions that can be used to return a new calculated field value.

These special fields are called script_fields, and they can be expressed with a script in every available Elasticsearch scripting language.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code.

How to do it...

For computing return fields with scripting, we will perform the following steps:

  1. Return the following script fields:

    "my_calc_field": This concatenates the texts of the "name" and "description" fields

    "my_calc_field2": This multiplies the "price" value by the "discount" parameter

  2. From...
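
A sketch of such a request in Painless follows; the name, description, and price fields come from the chapter's sample index, while the use of params._source for the concatenation and the discount value of 0.8 are illustrative assumptions:

    # sketch: two script_fields, one concatenating _source fields, one using an assumed "discount" param
    curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
      "query": { "match_all": {} },
      "script_fields": {
        "my_calc_field": {
          "script": {
            "lang": "painless",
            "inline": "params._source.name + \" \" + params._source.description"
          }
        },
        "my_calc_field2": {
          "script": {
            "lang": "painless",
            "inline": "doc[\"price\"].value * params.discount",
            "params": { "discount": 0.8 }
          }
        }
      }
    }'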

Filtering a search via scripting


In Chapter 5, Search, we've seen many filters. Elasticsearch scripting allows the extension of a traditional filter with custom scripts.

Using scripting to create a custom filter is a convenient way to write scripting rules not provided by Lucene or Elasticsearch, and to implement business logic not available in a DSL query.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the script (chapter_09/populate_for_scripting.sh) available in the online code, and the JavaScript/Python language scripting plugins installed.

How to do it...

For filtering a search using a script, we will perform the following steps:

  1. We'll write a search with a filter that filters out a document with an age value...
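
The request has the shape of a script query wrapped in a bool filter. A minimal sketch in Painless follows; the age field comes from the chapter's sample index, while the parameter name and the threshold value of 50 are illustrative assumptions:

    # sketch: keep only documents whose age exceeds an assumed parameter value
    curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true' -d '{
      "query": {
        "bool": {
          "filter": {
            "script": {
              "script": {
                "lang": "painless",
                "inline": "doc[\"age\"].value > params.param_age",
                "params": { "param_age": 50 }
              }
            }
          }
        }
      }
    }'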

Using scripting in aggregations


Scripting can be used in aggregations to extend their analytics capabilities, both to change the values used in metric aggregations and to define new rules for creating buckets.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code, and the JavaScript/Python language scripting plugins installed.

How to do it...

For using a scripting language in aggregations, we will perform the following steps:

  1. We'll write a metric aggregation that selects the field via script:

            curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/_search?pretty=true&size=0' -d '{
              "aggs": {
                "my_value":...

Updating a document using scripts


Elasticsearch allows the updating of a document in place. Updating a document via scripting reduces network traffic (otherwise, you need to fetch the document, change the field or fields, and send it back) and improves performance when you need to process a huge number of documents.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code, and the JavaScript/Python language scripting plugins installed.

How to do it...

For updating using scripting, we will perform the following steps:

  1. We'll write an update action that adds a tag value to a list of tags available in the source of a document. It should look as shown...
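
A minimal sketch of such an update call in Painless follows; the tags field comes from the chapter's sample index, while the document ID and the tag value are illustrative assumptions:

    # sketch: append an assumed tag value to the document's tags list
    curl -XPOST 'http://127.0.0.1:9200/test-index/test-type/1/_update?pretty=true' -d '{
      "script": {
        "lang": "painless",
        "inline": "ctx._source.tags.add(params.tag)",
        "params": { "tag": "cool" }
      }
    }'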

Reindexing with a script


Reindex is a new functionality introduced in Elasticsearch 5.x for automatically reindexing your data into a new index. This action is often done for a variety of reasons, mainly mapping changes due to improvements in the mapping definition.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command-line you need to install curl for your operating system.

To correctly execute the following commands, you need an index populated with the chapter_09/populate_for_scripting.sh script available in the online code, and the JavaScript/Python language scripting plugins installed.

How to do it...

For reindexing with a script, we will perform the following steps:

  1. We create the destination index, because it is not created by the reindex API:

                curl -XPUT 'http://127.0.0.1:9200/reindex-test-index?pretty=true' -d '{"mappings": {"test-type...
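
Once the destination index exists, the reindex call with a script has roughly the following shape; the age increment shown here is only an illustrative assumption, not the transformation used in the recipe:

    # sketch: copy test-index into reindex-test-index, transforming each document with Painless
    curl -XPOST 'http://127.0.0.1:9200/_reindex?pretty=true' -d '{
      "source": { "index": "test-index" },
      "dest": { "index": "reindex-test-index" },
      "script": {
        "lang": "painless",
        "inline": "ctx._source.age = ctx._source.age + 1"
      }
    }'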