You're reading from  Elasticsearch 8.x Cookbook - Fifth Edition

Product type: Book
Published in: May 2022
Publisher: Packt
ISBN-13: 9781801079815
Edition: 5th Edition
Author (1)
Alberto Paro

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, Natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.

Using the Flattened field type

Many applications let users define custom metadata or configuration as key-value pairs. Ordinary mappings handle this use case poorly in Elasticsearch: every new key adds a field to the mapping, so the mapping keeps growing and becomes hard to manage as the keys evolve.

X-Pack provides a type (free for use) to solve this problem: the flattened field type.

As the name suggests, it takes all the key-value pairs of an object (including nested ones) and indexes them in a flat way, which avoids the mapping explosion problem.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

We want to use Elasticsearch to store configurations with a varying number of fields. To achieve this, follow these steps:

  1. To create our configuration index with a flattened field, we will use the following mapping:
    PUT test-flattened
    { "mappings": {
        "properties": {
          "name": { "type": "keyword" },
          "configs": { "type": "flattened" } } } }
  2. Now, we can store some documents that contain our configuration data:
    PUT test-flattened/_bulk
    {"index":{"_index":"test-flattened","_id":"1"}}
    {"name":"config1","configs":{"key1":"value1","key3":"2022-01-01T12:00:01"}}
    {"index":{"_index":"test-flattened","_id":"2"}}
    {"name":"config2","configs":{"key1":true,"key2":30}}
    {"index":{"_index":"test-flattened","_id":"3"}}
    {"name":"config3","configs":{"key4":"test","key2":30.3}}
  3. Now, we can execute a query that searches for a value across all the keys of the configs field:
    POST test-flattened/_search
    { "query": { "term": { "configs": "test" } } }

Alternatively, we can search for a particular key in the configs object, like so:

POST test-flattened/_search
{ "query": { "term": { "configs.key4": "test" } } }

The result for both queries will be as follows:

{ …truncated…
    "hits" : [
      {
        "_index" : "test-flattened",
        "_id" : "3",
        "_score" : 1.2330425,
        "_source" : {
          "name" : "config3",
          "configs" : { "key4" : "test", "key2" : 30.3 }
    …truncated…
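
If step 2 is issued from a script rather than from the Kibana console, the bulk body has to be assembled as newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The following is a minimal sketch that only builds that body with the Python standard library; the helper name build_bulk_body is hypothetical, and sending the request (URL, credentials, TLS settings) is deliberately left out because it depends on your installation.

```python
import json

# Documents from step 2 of this recipe, keyed by the _id used there.
docs = {
    "1": {"name": "config1", "configs": {"key1": "value1", "key3": "2022-01-01T12:00:01"}},
    "2": {"name": "config2", "configs": {"key1": True, "key2": 30}},
    "3": {"name": "config3", "configs": {"key4": "test", "key2": 30.3}},
}


def build_bulk_body(index_name, documents):
    """Hypothetical helper: return the NDJSON body for a bulk index request."""
    lines = []
    for doc_id, source in documents.items():
        # Action line: tells Elasticsearch where the next line should be indexed.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("test-flattened", docs)
print(bulk_body)
```

Each document must stay on a single line, which is why json.dumps is called without an indent argument.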

How it works…

This special field type takes a JSON object passed in a document and flattens its key/value pairs so that they can be searched without defining a mapping for each field in the JSON content.

This helps because a JSON object containing a large number of different fields would otherwise make the mapping explode.
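
To make the idea concrete, here is a small conceptual model in plain Python. It is only a sketch of the behavior described in this recipe, not Elasticsearch's actual implementation: the names to_text, leaf_values, and term_search are hypothetical, and the way booleans and numbers are turned into text is an assumption made for illustration. It walks the configs objects from step 2, reduces them to (key, text) pairs, and then looks up a term either across all keys or under one specific key.

```python
import json

# The "configs" objects from step 2, keyed by document _id.
docs = {
    "1": {"key1": "value1", "key3": "2022-01-01T12:00:01"},
    "2": {"key1": True, "key2": 30},
    "3": {"key4": "test", "key2": 30.3},
}


def to_text(value):
    """Assumed conversion: strings stay as they are, other scalars use their JSON text."""
    return value if isinstance(value, str) else json.dumps(value)


def leaf_values(obj, prefix=""):
    """Yield (dotted_key, text) for every leaf of a possibly nested object."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                yield from leaf_values(item, path)
            elif item is not None:
                yield path, to_text(item)


def term_search(term, key=None):
    """Return the ids whose flattened leaves contain term, optionally under one key."""
    hits = []
    for doc_id, configs in docs.items():
        if any(text == term and (key is None or path == key)
               for path, text in leaf_values(configs)):
            hits.append(doc_id)
    return hits


print(term_search("test"))          # like the term query on "configs" -> ['3']
print(term_search("test", "key4"))  # like the term query on "configs.key4" -> ['3']
print(term_search("test", "key1"))  # same value, different key -> []
```

No per-key schema is needed in this model: adding a new key to a document only adds more (key, text) pairs, which mirrors why the mapping itself does not grow.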

During the indexing process, a token is created for each leaf value of the JSON object using a keyword analyzer, so every value is indexed as a single, unanalyzed term. Because of this, numbers, dates, IP addresses, and other formats are converted into text, and the only queries that can be executed are the ones supported by keyword tokenization: term, terms, terms_set, prefix, range (values are compared as text, not numerically), match, multi_match, query_string, simple_query_string, and exists.
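
The note about range being based on text is easy to overlook, so here is a short illustration in plain Python. It assumes that text comparison behaves like ordinary string ordering for these ASCII values; the helper names text_range and numeric_range and the sample value list are hypothetical and only serve to contrast the two behaviors.

```python
# Leaf values as they would look once converted into text.
values = ["30", "30.3", "4", "100"]


def text_range(candidates, gte, lte):
    """Keep the values that fall between gte and lte when compared as strings."""
    return [v for v in candidates if gte <= v <= lte]


def numeric_range(candidates, low, high):
    """Keep the values that fall between low and high when compared as numbers."""
    return [v for v in candidates if low <= float(v) <= high]


print(sorted(values))                   # ['100', '30', '30.3', '4']
print(text_range(values, "10", "35"))   # ['30', '30.3', '100']
print(numeric_range(values, 10, 35))    # ['30', '30.3']
```

In text order, "100" sorts before "30", so a text-based range from "10" to "35" also matches "100". If numeric range semantics matter for a key, that key is a candidate for its own explicitly mapped numeric field rather than a flattened one.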

See also

See Chapter 5, Text and Numeric Queries, for more references on the cited query types.
