You're reading from Elasticsearch 8.x Cookbook - Fifth Edition

Product type: Book

Published in: May 2022

Publisher: Packt

ISBN-13: 9781801079815

Edition: 5th Edition

Tools: Elasticsearch

Concepts: Enterprise Search

Author: Alberto Paro

Managing a child document with a join field

In the previous recipe, we saw how to manage relationships between objects with the nested object type. The disadvantage of nested objects is their dependence on their parents: to change the value of a nested object, you must reindex the whole parent document, which can cause noticeable overhead if the nested objects change frequently. To solve this problem, Elasticsearch allows you to define child documents.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

How to do it…

In the following example, we have two related objects: an Order and an Item.

Their UML representation is as follows:

Figure 2.3 – UML example of an Order/Item relationship

The final mapping should merge the field definitions of both Order and Item, and add a special field (join_field, in this example) that holds the parent/child relationship.

To use join_field, follow these steps:

First, we must define the mapping, as follows:

PUT test1/_mapping
{ "properties": {
    "join_field": {
      "type": "join", "relations": { "order": "item" }
    },
    "id": { "type": "keyword" },
    "date": { "type": "date" },
    "customer_id": { "type": "keyword" },
    "sent": { "type": "boolean" },
    "name": { "type": "text" },
    "quantity": { "type": "integer" },
    "vat": { "type": "double" }
} }

The preceding mapping is very similar to the one in the previous recipe.
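The update-mapping request above targets an index that already exists. If test1 has not been created yet, a minimal sketch (assuming the default index settings are acceptable) is to create it before sending the mapping, and to read the mapping back afterwards to confirm that join_field was registered:

```console
# Create the index with default settings; run this before PUT test1/_mapping
PUT test1

# After the mapping has been sent, read it back to check join_field
GET test1/_mapping
```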

If we want to store the joined records, we will need to save the parent first and then the children, like so:

PUT test1/_doc/1?refresh
{ "id": "1", "date": "2018-11-16T20:07:45Z", "customer_id": "100", "sent": true, "join_field": "order" }
PUT test1/_doc/c1?routing=1&refresh
{ "name": "tshirt", "quantity": 10, "price": 4.3, "vat": 8.5,
  "join_field": { "name": "item", "parent": "1" } }

The child item requires special management: we need to pass the parent ID as the routing value (1 in the preceding example). Furthermore, in the join_field object, we need to specify the name of the child relation (item) and the ID of its parent.
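One way to check that a child was attached to the intended parent is the parent_id query, which returns the children of a given parent. This is a sketch that assumes the documents were indexed into the same index that holds the join mapping (test1 here):

```console
# List all "item" children whose parent is the order with ID 1
GET test1/_search
{
  "query": {
    "parent_id": { "type": "item", "id": "1" }
  }
}
```

The type value is the child relation name declared in relations, not the name of the join field.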

How it works…

When several related document types share the same index, the index mapping must be the union of the fields of all of those types.

The relationship between objects must be defined in join_field.

Only one join field is allowed per index mapping; if you need several relationships, define them all in its relations object.
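As an illustration of several relationships in one join field, the following sketch declares two child relations for the same parent. The index name test2 and the shipment relation are hypothetical and are not part of the recipe:

```console
# One join field, one parent relation ("order"), two child relations
PUT test2
{
  "mappings": {
    "properties": {
      "join_field": {
        "type": "join",
        "relations": { "order": ["item", "shipment"] }
      }
    }
  }
}
```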

The child document must be indexed in the same shard as its parent, so an extra parameter, routing, must be passed when indexing it (we'll learn how to do this in the Indexing a document recipe in Chapter 3, Basic Operations).
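The same routing value is also needed when reading a child by ID, because it tells Elasticsearch which shard to look in. A minimal sketch, assuming the child c1 was indexed in test1 with routing=1:

```console
# Fetch the child from the shard selected by the parent's ID
GET test1/_doc/c1?routing=1
```

Without the routing parameter, the lookup may be sent to a different shard and report the document as not found.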

Changing the values of a child document doesn't require reindexing its parent. Consequently, indexing, updating, and deleting child documents is fast.
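For example, a child can be updated or deleted on its own, as long as the parent's ID is passed as routing. This sketch assumes the documents live in test1; the new quantity value is arbitrary:

```console
# Partial update of the child only; the parent document is not touched
POST test1/_update/c1?routing=1&refresh
{ "doc": { "quantity": 12 } }

# Remove the child only
DELETE test1/_doc/c1?routing=1&refresh
```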

There's more...

In Elasticsearch, we have different ways to manage relationships between objects, as follows:

Embedding with type=object: This is implicitly managed by Elasticsearch, which treats the embedded object as part of the main document. It's fast, but you need to reindex the main document to change the value of the embedded object.
Nesting with type=nested: This allows you to accurately search and filter the parent by using nested queries on its children. Everything works as it does for embedded objects, except for querying: you must use a nested query to search nested fields.
External child documents: Here, the children are separate documents with a join_field property that binds them to the parent. They must be indexed in the same shard as the parent. Joining with the parent is a bit slower than with nested objects, because nested objects are stored in the same Lucene data block as the parent and are loaded with it, whereas child documents require additional read operations.
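For the external child documents option, searching across the relationship is done with the has_child and has_parent queries. The following is a sketch that assumes the order and item documents from this recipe are stored in test1:

```console
# Orders that have at least one item named "tshirt"
GET test1/_search
{
  "query": {
    "has_child": {
      "type": "item",
      "query": { "match": { "name": "tshirt" } }
    }
  }
}

# Items whose parent order has already been sent
GET test1/_search
{
  "query": {
    "has_parent": {
      "parent_type": "order",
      "query": { "term": { "sent": true } }
    }
  }
}
```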

Choosing how to model the relationship between objects depends on your application scenario.

Tip

There is another approach, although it performs poorly on large volumes of documents: decoupling the join relationship. You perform the join in two steps: first, collect the IDs of the children (or other related documents), and then search for those IDs in a field of their parent.
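The two steps can be sketched as follows. The index names items and orders, the item_ids field on the parent, and the ID values are all hypothetical and only illustrate the idea:

```console
# Step 1: find the matching children and collect their IDs from the hits
GET items/_search
{
  "_source": false,
  "query": { "match": { "name": "tshirt" } }
}

# Step 2: search the parents whose item_ids field contains any collected ID
GET orders/_search
{
  "query": {
    "terms": { "item_ids": ["c1", "c2"] }
  }
}
```

The application has to carry the IDs from the first response into the second request, which is why this approach becomes expensive as the number of matching children grows.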

Alberto Paro is an engineer, manager, and software developer. He currently works as technology architecture delivery associate director of the Accenture Cloud First data and AI team in Italy. He loves to study emerging solutions and applications, mainly related to cloud and big data processing, NoSQL, natural language processing (NLP), software development, and machine learning. In 2000, he graduated in computer science engineering from Politecnico di Milano. Then, he worked with many companies, mainly using Scala/Java and Python on knowledge management solutions and advanced data mining products, using state-of-the-art big data software. A lot of his time is spent teaching how to effectively use big data solutions, NoSQL data stores, and related technologies.