You're reading from  Elasticsearch 8.x Cookbook - Fifth Edition

Product type: Book
Published in: May 2022
Publisher: Packt
ISBN-13: 9781801079815
Edition: 5th Edition
Author (1)
Alberto Paro

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, Natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.

Mapping arrays

Array or multi-value fields are very common in data models (such as multiple phone numbers, addresses, names, aliases, and so on), but they're not natively supported in traditional SQL solutions.

In SQL, multi-value fields require you to create accessory tables that must be joined to gather all the values, which leads to poor performance when the number of records is very large.
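For contrast, the accessory-table approach can be sketched with Python's built-in sqlite3 module. The table names (documents, document_tags) and the sample rows are assumptions made up for this illustration; the point is only that reading back all the tags of one document needs a join.

```python
import sqlite3

# Hypothetical schema: one row per document, one row per (document, tag) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE document_tags (
        document_id INTEGER REFERENCES documents(id),
        tag TEXT
    );
""")
conn.execute("INSERT INTO documents VALUES (1, 'document1'), (2, 'document2')")
conn.executemany(
    "INSERT INTO document_tags VALUES (?, ?)",
    [(1, "awesome"), (2, "cool"), (2, "awesome"), (2, "amazing")],
)

# Gathering every tag of a document requires a join across both tables.
rows = conn.execute("""
    SELECT d.name, t.tag
    FROM documents AS d
    JOIN document_tags AS t ON t.document_id = d.id
    WHERE d.name = 'document2'
    ORDER BY t.rowid
""").fetchall()
tags = [tag for _name, tag in rows]
print(tags)
```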

Elasticsearch, which works natively in JSON, provides support for multi-value fields transparently.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

To use an Array type in our mapping, perform the following steps:

  1. Elasticsearch has no dedicated array type: every field can hold one or more values of its mapped type. For example, to store tags for a document, the mapping would be as follows:
    {  "properties" : {
          "name" : {"type" : "keyword"},
          "tag" : {"type" : "keyword", "store" : true},
          ...
       }
    }
  2. This mapping is valid for indexing both of the documents that follow, whether tag holds a single value or an array. The following is the code for document1:
    {"name": "document1", "tag": "awesome"}
  3. The following is the code for document2:
    {"name": "document2", "tag": ["cool", "awesome", "amazing"] }
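The steps above can be put together as a small Python sketch that prints the three requests in a form you could paste into the Kibana console. Nothing is sent to a cluster. The index name my-array-index and the document IDs 1 and 2 are assumptions chosen for this example, and the fields elided with ... in the mapping are simply left out.

```python
import json

INDEX = "my-array-index"  # hypothetical index name, not from the recipe

# Mapping from step 1, wrapped in "mappings" as the create-index request expects.
mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "keyword"},
            "tag": {"type": "keyword", "store": True},
        }
    }
}

# Documents from steps 2 and 3: one single-valued tag, one multi-valued tag.
document1 = {"name": "document1", "tag": "awesome"}
document2 = {"name": "document2", "tag": ["cool", "awesome", "amazing"]}

requests = [
    ("PUT", f"/{INDEX}", mapping),            # create the index with its mapping
    ("PUT", f"/{INDEX}/_doc/1", document1),   # index the single-value document
    ("PUT", f"/{INDEX}/_doc/2", document2),   # index the multi-value document
]

for method, path, body in requests:
    print(f"{method} {path}")
    print(json.dumps(body, indent=2))
```

Both documents go through the same mapping; only the shape of the tag value differs.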

How it works…

Elasticsearch manages arrays transparently: because of how its Lucene core stores fields, it makes no difference whether you supply a single value or multiple values for a field.

Lucene manages multi-values natively: a document can contain several values under the same field name. For people with a SQL background, this behavior may seem strange, but it is a key point in the NoSQL world, as it reduces the need for join queries and for separate tables to manage multi-values. An array of embedded objects behaves in the same way as an array of simple fields.
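The idea that a single value and an array are treated alike can be illustrated with a toy model in plain Python. This is not how Lucene is implemented; field_values and matches_term are hypothetical helpers written only to show that an exact-match lookup succeeds when any one of a field's values equals the term.

```python
def field_values(document, field):
    """Return a field's values as a list, whether one or many were given."""
    value = document.get(field)
    if value is None:
        return []
    return list(value) if isinstance(value, list) else [value]


def matches_term(document, field, term):
    """Toy stand-in for an exact-match lookup: true if any value equals term."""
    return term in field_values(document, field)


document1 = {"name": "document1", "tag": "awesome"}
document2 = {"name": "document2", "tag": ["cool", "awesome", "amazing"]}

# Both documents match "awesome"; only document2 matches "cool".
for doc in (document1, document2):
    print(doc["name"], matches_term(doc, "tag", "awesome"), matches_term(doc, "tag", "cool"))
```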
