You're reading from  Elasticsearch 8.x Cookbook - Fifth Edition

Product type: Book
Published in: May 2022
Publisher: Packt
ISBN-13: 9781801079815
Edition: 5th Edition
Author (1)
Alberto Paro

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, Natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.

Managing nested objects

There is a special type of embedded object called a nested object. It resolves a problem related to Lucene's indexing architecture, in which all the fields of embedded objects are viewed as a single object (technically speaking, they are flattened). At search time, Lucene cannot tell which values belong to which embedded object in the same multi-valued array.

If we consider the previous order example, it's not possible to match an item's name together with its own quantity in a single query, since Lucene puts the values of all the items in the same Lucene document. We need to index the items as separate documents and then join them to their parent. This entire process is managed by nested objects and nested queries.
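The problem can be illustrated with a small pure-Python sketch. It is a toy model of the matching behavior, not Elasticsearch or Lucene code: the `flatten`, `matches_flattened`, and `matches_nested` helpers and the sample order values are assumptions made up for this illustration.

```python
# Toy model of flattened versus nested matching; not Elasticsearch or Lucene code.
# The order values below are hypothetical sample data.
order = {
    "id": "1234",
    "item": [
        {"name": "tshirt", "quantity": 10},
        {"name": "jeans", "quantity": 2},
    ],
}


def flatten(doc, field):
    """Merge a list of embedded objects into one multi-valued entry per field path."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches_flattened(doc, field, criteria):
    """Object-style matching: each criterion is checked against the merged values."""
    flat = flatten(doc, field)
    return all(
        value in flat.get(f"{field}.{key}", []) for key, value in criteria.items()
    )


def matches_nested(doc, field, criteria):
    """Nested-style matching: all criteria must hold within one embedded object."""
    return any(
        all(obj.get(key) == value for key, value in criteria.items())
        for obj in doc[field]
    )


# No single item is a tshirt with quantity 2.
criteria = {"name": "tshirt", "quantity": 2}

print(flatten(order, "item"))
print(matches_flattened(order, "item", criteria))  # True: a false positive
print(matches_nested(order, "item", criteria))     # False
```

The flattened form loses the pairing between a name and its quantity, so the combined condition matches even though no single item satisfies it.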

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

A nested object is defined as a standard object with the nested type.

Taking the example from the Mapping an object recipe, we can change the type of item from object to nested, as follows:

PUT test/_mapping
{ "properties" : {
      "id" : {"type" : "keyword"},
      "date" : {"type" : "date"},
      "customer_id" : {"type" : "keyword"},
      "sent" : {"type" : "boolean"},
      "item" : {"type" : "nested",
        "properties" : {
            "name" : {"type" : "keyword"},
            "quantity" : {"type" : "long"},
            "price" : {"type" : "double"},
            "vat" : {"type" : "double"}
} } } }

How it works…

When a document is indexed, any embedded object that has been marked as nested is extracted from the original document, indexed as a new separate document, and saved in a special index position near the parent document.

In the preceding example, we reused the mapping from the Mapping an object recipe, but we changed the type of item from object to nested. No other action is needed to convert an embedded object into a nested one.

Nested objects are special Lucene documents that are saved in the same block of data as their parent. This approach allows for fast joining with the parent document.

Nested objects are not searchable with standard queries, only with nested ones. They are also not returned as separate hits in standard query results.
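As a rough preview of what such a nested query looks like, the following sketch builds a search body for the item field of this recipe's mapping. The `nested_item_query` helper and the example values are hypothetical; only the mapping field names come from the recipe.

```python
import json


def nested_item_query(name, quantity):
    """Build a search body that matches name and quantity on the same nested item."""
    return {
        "query": {
            "nested": {
                "path": "item",  # the nested field defined in the mapping
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"item.name": name}},
                            {"term": {"item.quantity": quantity}},
                        ]
                    }
                },
            }
        }
    }


body = nested_item_query("tshirt", 10)  # example values, not from the recipe
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana console as the body of a GET test/_search request, assuming the test index uses the mapping shown earlier.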

The life cycle of nested objects is tied to that of their parent: deleting or updating a parent automatically deletes or updates all of its nested children. Changing the parent means Elasticsearch will do the following:

  • Mark the old document as deleted.
  • Mark all nested documents as deleted.
  • Index the new document version.
  • Index all nested documents.
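The cost of these steps can be sketched with a toy append-only store. This is a simplified model under stated assumptions, not how Lucene is actually implemented: `make_block`, `update_parent`, the `deleted` flag, and the layout of each block are all invented for the illustration.

```python
def make_block(parent_id, parent_source, children):
    """Return the nested documents and their parent as one contiguous block.

    The ordering inside the block is a detail of this sketch.
    """
    block = [
        {"parent_id": parent_id, "kind": "nested", "source": child, "deleted": False}
        for child in children
    ]
    block.append(
        {"parent_id": parent_id, "kind": "parent", "source": parent_source, "deleted": False}
    )
    return block


def update_parent(segment, parent_id, parent_source, children):
    """Replace a parent and its nested documents in an append-only list."""
    # Mark the old parent and all of its nested documents as deleted.
    for doc in segment:
        if doc["parent_id"] == parent_id:
            doc["deleted"] = True
    # Index the new parent version and all of its nested documents.
    segment.extend(make_block(parent_id, parent_source, children))


items = [{"name": "tshirt"}, {"name": "jeans"}]  # hypothetical sample data
segment = make_block("1234", {"sent": False}, items)

# Only one parent field changes, yet every nested document is written again.
update_parent(segment, "1234", {"sent": True}, items)

live = [doc for doc in segment if not doc["deleted"]]
print(len(segment), len(live))  # 6 stored entries, 3 of them live
```

In this model, even a small change to the parent rewrites all of its nested documents, which is worth keeping in mind for parents with many nested children that are updated frequently.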

There's more...

Sometimes, you need to propagate information about the nested objects to their parent or root objects. This is mainly to build simpler queries about the parents (such as terms queries without using nested ones). To achieve this, two special properties of the nested field mapping can be used:

  • include_in_parent: This makes it possible to automatically add the nested fields to the immediate parent.
  • include_in_root: This adds the nested object fields to the root object.

These settings add data redundancy, but they reduce the complexity of some queries, thus improving performance.
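A minimal sketch of such a mapping follows, built as a Python dictionary and printed as JSON. It assumes the item field is being mapped for the first time on a fresh index; the variable names and the sample term value are hypothetical.

```python
import json

# Nested mapping for item, with its fields also copied onto the root document.
item_mapping = {
    "type": "nested",
    "include_in_root": True,
    "properties": {
        "name": {"type": "keyword"},
        "quantity": {"type": "long"},
        "price": {"type": "double"},
        "vat": {"type": "double"},
    },
}

mapping_body = {"properties": {"item": item_mapping}}

# With the fields copied to the root, a plain term query can reach item.name
# without a nested query ("tshirt" is an example value).
plain_query = {"query": {"term": {"item.name": "tshirt"}}}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(plain_query, indent=2))
```

Because item is a top-level field here, its immediate parent is the root document, so include_in_parent would have the same effect in this case. The copied fields are flattened, so a plain query on them loses the pairing between the values of a single item; nested queries are still needed when that pairing matters.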

See also

  • Nested objects require a special query to search for them – this will be discussed in the Using nested queries recipe of Chapter 6, Relationships and Geo Queries.
  • The Managing a child document with a join field recipe shows another way to manage child/parent relationships between documents.