Chapter 8: Scripting in Elasticsearch

Elasticsearch has a powerful way of extending its capabilities by using custom scripts, which can be written in several programming languages; the most common ones are Painless, Expression, and Mustache. In this chapter, we will explore how to create custom scoring algorithms, specially processed return fields, custom sorting, complex update operations on records, and ingest processors. Scripting in Elasticsearch is the NoSQL world's equivalent of an advanced stored-procedure system; because of this, every advanced user of Elasticsearch should learn how to master it.

Elasticsearch natively provides scripting in Java (that is, Java code compiled in JAR files), Painless, Expression, and Mustache; however, a lot of other interesting languages are also available as plugins, such as Kotlin and Velocity. In older Elasticsearch releases, prior to version 5.0, the official scripting language was Groovy; however, for better sandboxing and performance, Painless has since replaced it as the default scripting language.

Painless scripting

Painless is a simple, secure scripting language that is available in Elasticsearch by default. It was designed by the Elasticsearch team to be used specifically with Elasticsearch, and it can safely be used with inline and stored scripting. Its syntax is similar to Groovy, from which it was derived.

In this recipe, we will see how to create a custom score function in Painless.
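
As a preview, the following is a minimal sketch of a custom score computed with a Painless script through a function_score query; the index name index-agg and the price field are illustrative assumptions, not names taken from the recipe:

    POST /index-agg/_search
    {
      "query": {
        "function_score": {
          "query": { "match_all": {} },
          "script_score": {
            "script": {
              "lang": "painless",
              "source": "Math.log(2 + doc['price'].value)"
            }
          }
        }
      }
    }

Here, the script result replaces the flat match_all score with a logarithm of the price, dampening large price differences between documents.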

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

In order to execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console as it provides code completion and better character escaping for Elasticsearch.

To correctly execute the following commands, you will need an index that is populated with the ch07/populate_aggregation.txt commands – these are available in the online code.

Installing additional scripting languages

Elasticsearch provides native scripting (that is, Java code compiled in JAR files) and Painless, but a lot of other interesting languages are also available, such as Kotlin.

Note

At the time of writing, no language plugins are available among Elasticsearch's official plugins. Plugin authors usually take from a week up to a month after a major release to update their plugins to the new version, so this section serves as a reference for this use case, based on Elasticsearch 7.x. As previously stated, the official scripting language is now Painless, which is provided by default in Elasticsearch for better sandboxing and performance.

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

How to do it...

In order to install the Kotlin language support for Elasticsearch, we...

Managing scripts

Depending on your scripting usage, there are several ways of customizing Elasticsearch in order to use your script extensions.

In this recipe, we will demonstrate how you can manage scripts by storing them in Elasticsearch or providing them inline in API calls.
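
As a preview, a stored script is created under an ID with the _scripts endpoint and later referenced by that ID instead of being sent inline; the script ID price_with_factor and its body are illustrative assumptions:

    POST /_scripts/price_with_factor
    {
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * params.factor"
      }
    }

    GET /_scripts/price_with_factor

    DELETE /_scripts/price_with_factor

An inline script, by contrast, is embedded directly in the API call as a script object with source, lang, and params keys.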

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

In order to execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To correctly execute the following commands, you will need an index populated with the ch08/populate_aggregation.txt commands – these are available in the online code.

In order to be able to use regular expressions in Painless scripting, you will need to activate them in elasticsearch.yml by adding script.painless.regex.enabled: true.

Sorting data using scripts

Elasticsearch provides scripting support for sorting functionality. In real-world applications, there is often a need to modify the default sorting using an algorithm that is dependent on the context and some external variables. Some common scenarios are as follows:

  • Sorting places near a point
  • Sorting by most read articles
  • Sorting items by custom user logic
  • Sorting items by revenue

Because computing scores on a large dataset is very CPU-intensive, if you use scripting it's better to execute it on a small dataset: use standard scoring queries to detect the top documents first, and then execute the rescoring on that top subset only.
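
A minimal sketch of a script-based sort follows; the index name index-agg, the price field, and the factor parameter are illustrative assumptions:

    POST /index-agg/_search
    {
      "sort": {
        "_script": {
          "type": "number",
          "order": "desc",
          "script": {
            "lang": "painless",
            "source": "doc['price'].value * params.factor",
            "params": { "factor": 1.1 }
          }
        }
      }
    }

The script must return a value of the declared type (number here), which Elasticsearch then uses in place of a field value when sorting the results.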

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

Computing return fields with scripting

Elasticsearch allows us to define custom complex expressions that can be used to return a newly calculated field value.

The most common scenarios for these use cases are as follows:

  • Merge field values (for example, first name + last name)
  • Compute values (for example, total = quantity * price)
  • Apply transformations (for example, converting dollars to euros, or string manipulation)

These special fields are called script_fields, and they can be expressed with a script in every available Elasticsearch scripting language.
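
For example, a total computed as quantity multiplied by price can be returned as a script field; the index name index-agg and the field names are illustrative assumptions:

    POST /index-agg/_search
    {
      "script_fields": {
        "total": {
          "script": {
            "lang": "painless",
            "source": "doc['quantity'].value * doc['price'].value"
          }
        }
      }
    }

Each hit in the response then carries a fields section with the computed total value.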

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

Filtering a search using scripting

In Chapter 4, Exploring Search Capabilities, we explored many filters. Elasticsearch scripting allows the extension of a traditional filter by using custom scripts.

Using scripting to create a custom filter is a convenient way to write scripting rules that are not provided by Lucene or Elasticsearch, and to implement business logic that is not available in a DSL query.
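
A minimal sketch of a script filter inside a bool query follows; the index name index-agg, the price field, and the threshold parameter are illustrative assumptions:

    POST /index-agg/_search
    {
      "query": {
        "bool": {
          "filter": {
            "script": {
              "script": {
                "lang": "painless",
                "source": "doc['price'].value > params.threshold",
                "params": { "threshold": 50 }
              }
            }
          }
        }
      }
    }

Because a script filter is evaluated against every candidate document, it is slower than native filters, so it should be combined with selective standard filters whenever possible.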

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To correctly execute the following commands, you will need an index that is populated with the ch07/populate_aggregation.txt commands – these are available in the online code.

Using scripting in aggregations

Scripting can be used in aggregations to extend Elasticsearch's analytics capabilities, either to manipulate and transform the values used in metric aggregations or to define new rules for creating buckets.
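
As a preview, a metric aggregation can take a script instead of a plain field; the following sketch averages a discounted price (the index name index-agg, the price field, and the discount factor are illustrative assumptions):

    POST /index-agg/_search
    {
      "size": 0,
      "aggs": {
        "avg_discounted_price": {
          "avg": {
            "script": {
              "lang": "painless",
              "source": "doc['price'].value * 0.9"
            }
          }
        }
      }
    }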

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To correctly execute the following commands, you will need an index that is populated with the ch07/populate_aggregation.txt commands – these are available in the online code.

In order to be able to use regular expressions in Painless scripting, you will need to activate them in elasticsearch.yml by adding script.painless.regex.enabled: true.

Updating a document using scripts

Elasticsearch allows you to update a document in place. Updating a document with a script reduces network traffic (otherwise, you would need to fetch the document, change the field or fields, and then send the whole document back) and improves performance when you need to process a large number of documents.
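
As a preview, the following sketch increments a counter field on a single document in place; the index name index-agg, the document ID, and the counter field are illustrative assumptions:

    POST /index-agg/_update/1
    {
      "script": {
        "lang": "painless",
        "source": "ctx._source.counter += params.increment",
        "params": { "increment": 1 }
      }
    }

Within an update script, ctx._source exposes the stored document so that its fields can be modified directly.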

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute the commands, any HTTP client can be used, such as cURL (https://curl.haxx.se/) or Postman (https://www.getpostman.com/). You can use Kibana Console, as it provides code completion and better character escaping for Elasticsearch.

To correctly execute the following commands, you will need an index that is populated with the ch07/populate_aggregation.txt commands – these are available in the online code.

In order to be able to use regular expressions in Painless scripting, you will need to activate them in elasticsearch.yml by adding script.painless.regex.enabled: true.

Reindexing with a script

Reindexing is a functionality for automatically copying your data into a new index. This action is often performed to cover different scenarios, such as the following (a sketch is shown after this list):

  • Reindexing after a mapping change
  • Removing a field from an index
  • Adding new fields based on a function
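
The following is a minimal sketch of the remove-a-field scenario using the _reindex API with a script; the index names and the field name obsolete_field are illustrative assumptions:

    POST /_reindex
    {
      "source": { "index": "index-agg" },
      "dest": { "index": "index-agg-v2" },
      "script": {
        "lang": "painless",
        "source": "ctx._source.remove('obsolete_field')"
      }
    }

The script runs once per copied document, so every document lands in the destination index without the removed field.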

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

To execute curl using the command line, you will need to install curl for your operating system.

In order to correctly execute the following commands, you will need an index that is populated with the ch07/populate_aggregation.txt script (available in the online code), and the JavaScript or Python language scripting plugins installed.

How to do it...

For reindexing with a script, we will perform the following steps:

  1. Create the destination index, as this is not created...

Scripting in ingest processors

In Chapter 12, Using the Ingest Module, we will see several types of ingest processors.

Ingest processors are the building blocks for an ingestion pipeline; they describe an action that can be executed on a document to modify it.

Scripting is the most flexible processor functionality, as it lets you implement logic that the built-in processors do not cover. Its use in ingest pipelines is discussed in this recipe.

Getting ready

You will need an up-and-running Elasticsearch installation, similar to the one that we described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.

How to do it...

We will simulate a pipeline with a set processor and a script processor to modify our documents before ingesting them. We will perform the following steps:

  1. Execute a pipeline simulation API call with the two processor steps and two documents as a sample:
    POST /_ingest/pipeline/_simulate
    { "pipeline": {
       ...
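
A complete version of such a simulation call might look like the following sketch; the set processor value, the script logic, and the sample documents are illustrative assumptions:

    POST /_ingest/pipeline/_simulate
    {
      "pipeline": {
        "processors": [
          { "set": { "field": "source", "value": "simulated" } },
          {
            "script": {
              "lang": "painless",
              "source": "ctx.total = ctx.quantity * ctx.price"
            }
          }
        ]
      },
      "docs": [
        { "_source": { "quantity": 2, "price": 10 } },
        { "_source": { "quantity": 5, "price": 3 } }
      ]
    }

The response shows each sample document after both processors have run, without indexing anything.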