Elasticsearch 8.x Cookbook - Fifth Edition

Product type Book
Published in May 2022
Publisher Packt
ISBN-13 9781801079815
Pages 750 pages
Edition 5th Edition
Author Alberto Paro


Specifying different analyzers

In the previous recipes, we learned how to map different fields and objects in Elasticsearch, and we described how easy it is to change the standard analyzer with the analyzer and search_analyzer properties.

In this recipe, we will look at several analyzers and learn how to use them to improve indexing and searching quality.

Getting ready

You will need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

How to do it…

Every text field allows you to specify a custom analyzer for indexing and another for searching as field parameters.

For example, if we want the name field to use a standard analyzer for indexing and a simple analyzer for searching, the mapping will be as follows:

{ "name": {
    "type": "text",
    "analyzer": "standard",
    "search_analyzer": "simple"
  } }
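
As a rough sketch of where this field definition lives, the following Python snippet wraps it in a complete index-creation body. The index name myindex is hypothetical, and the snippet only prints the request; it assumes you would send the body yourself, for example with a PUT request to the index URL.

```python
import json

# Field definition: standard analyzer at index time, simple analyzer at search time.
name_field = {
    "type": "text",
    "analyzer": "standard",
    "search_analyzer": "simple",
}

# Hypothetical index name, used here only for illustration.
index_name = "myindex"

# Body of an index-creation request; field definitions go under mappings.properties.
create_index_body = {"mappings": {"properties": {"name": name_field}}}

print(f"PUT /{index_name}")
print(json.dumps(create_index_body, indent=2))
```

If search_analyzer is omitted, the analyzer set in analyzer is used for searching as well.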

How it works…

The concept of the analyzer comes from Lucene (the core of Elasticsearch). An analyzer is a Lucene element composed of a tokenizer, which splits text into tokens, and zero or more token filters. These filters manipulate the tokens: lowercasing, normalization, removing stop words, stemming, and so on.
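
The tokenizer-plus-filters pipeline can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not how Lucene implements it: the function names, the regular expression used for tokenizing, and the tiny stop word list are all assumptions made for this example.

```python
import re
from typing import Callable, List

TokenFilter = Callable[[List[str]], List[str]]


def word_tokenizer(text: str) -> List[str]:
    """Split text into runs of word characters (a simplification of real tokenizers)."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens: List[str]) -> List[str]:
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


# Tiny illustrative stop word list, not the one shipped with Elasticsearch.
STOP_WORDS = {"a", "an", "the", "is", "of"}


def stop_filter(tokens: List[str]) -> List[str]:
    """Token filter: drop tokens that are stop words."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text: str, tokenizer: Callable[[str], List[str]],
            filters: List[TokenFilter]) -> List[str]:
    """Run the tokenizer, then apply each token filter in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("The Quick brown Fox", word_tokenizer, [lowercase_filter, stop_filter]))
# prints ['quick', 'brown', 'fox']
```

The order of the filters matters: if stop_filter ran before lowercase_filter, the capitalized "The" would not match the lowercase stop word list and would survive.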

During the indexing phase, when Elasticsearch processes a field that must be indexed, it chooses an analyzer. First, it checks the analyzer parameter in the field's mapping, then the default analyzer defined in the index settings (analysis.analyzer.default), and finally, it falls back to the standard analyzer.
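
The lookup can be modeled as a simple fallback chain. The sketch below assumes the order used by recent Elasticsearch releases (the field's analyzer parameter, then the index's analysis.analyzer.default setting, then the standard analyzer); the function name and the dictionary shapes are hypothetical and only mirror the JSON structure of mappings and settings.

```python
def resolve_index_analyzer(field_mapping: dict, index_settings: dict) -> str:
    """Return the name of the analyzer that would be used at index time (toy model)."""
    # 1. An analyzer set directly on the field wins.
    if "analyzer" in field_mapping:
        return field_mapping["analyzer"]
    # 2. Otherwise, use the index-level analyzer named "default", if one is defined.
    analyzers = index_settings.get("analysis", {}).get("analyzer", {})
    if "default" in analyzers:
        return "default"
    # 3. Otherwise, fall back to the standard analyzer.
    return "standard"


settings_with_default = {"analysis": {"analyzer": {"default": {"type": "simple"}}}}

print(resolve_index_analyzer({"type": "text", "analyzer": "whitespace"}, settings_with_default))
print(resolve_index_analyzer({"type": "text"}, settings_with_default))
print(resolve_index_analyzer({"type": "text"}, {}))
```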

Choosing the correct analyzer is essential to getting good results during the query phase.

Elasticsearch provides several analyzers in its standard installation. The following table shows the most common ones:

Figure 2.4 – List of the most common general-purpose analyzers

For special language purposes, Elasticsearch supports a set of analyzers aimed at analyzing text in a specific language, such as Arabic, Armenian, Basque, Brazilian, Bulgarian, Catalan, Chinese, CJK, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Italian, Norwegian, Persian, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish, and Thai.
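
A language analyzer is selected by name in the same way as the general-purpose ones. The following sketch assumes two hypothetical fields, description_en and description_it, and uses the english and italian analyzer names; the helper function is an illustration only.

```python
import json


def language_text_field(analyzer_name: str) -> dict:
    """Return a text field mapping that uses the given language analyzer."""
    return {"type": "text", "analyzer": analyzer_name}


# Hypothetical field names; one field per language.
mapping = {
    "properties": {
        "description_en": language_text_field("english"),
        "description_it": language_text_field("italian"),
    }
}

print(json.dumps(mapping, indent=2))
```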

See also

Several Elasticsearch plugins extend the list of available analyzers. The most famous ones are as follows:
