Elasticsearch Server: Second Edition

Product type: Book
Published: Apr 2014
ISBN-13: 9781783980529
Pages: 428
Edition: 1st Edition

Table of Contents (18) Chapters

Elasticsearch Server Second Edition
Credits
About the Author
Acknowledgments
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Getting Started with the Elasticsearch Cluster
Indexing Your Data
Searching Your Data
Extending Your Index Structure
Make Your Search Better
Beyond Full-text Searching
Elasticsearch Cluster in Detail
Administrating Your Cluster
Index

Chapter 4. Extending Your Index Structure

In the previous chapter, we learned many things about querying Elasticsearch. We saw how to choose the fields that will be returned and learned how querying works in Elasticsearch. In addition to that, we now know the basic queries that are available and how to filter our data. What's more, we saw how to highlight the matches in our documents and how to validate our queries. In the end, we saw the compound queries of Elasticsearch and learned how to sort our data.

By the end of this chapter, you will have learned about the following topics:

  • Indexing tree-like structured data

  • Indexing data that is not flat

  • Modifying your index structure when possible

  • Indexing data with relationships by using nested documents

  • Indexing data with relationships between them by using the parent-child functionality

Indexing tree-like structures


Trees are everywhere. If you develop a shop application, you will probably have categories. If you look at the filesystem, the files and directories are arranged in tree-like structures. This book can also be represented as a tree: chapters contain topics and topics are divided into subtopics. As you can imagine, Elasticsearch is also capable of indexing tree-like structures. Let's check how we can navigate through this type of data using path_analyzer, a custom analyzer built on the path_hierarchy tokenizer.

Data structure

First, let's create a simple index structure by using the following lines of code:

curl -XPUT 'localhost:9200/path' -d '{
  "settings" : {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "path_analyzer" : { "tokenizer" : "path_hierarchy" }
        }
      }
    }
  },
  "mappings" : {
    "category" : {
      "properties" : {
        "category" : {
          "type" : "string",
          "fields" : {
            "name" : { "type" : "string","index" : "not_analyzed" },
      ...
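To get a feeling for what the path_hierarchy tokenizer does with a category path, here is a minimal Python sketch of its default behavior. It is a simplified model, not the Lucene implementation: the function name and the example path are our own assumptions, and we only handle the default / delimiter without any of the tokenizer's other options.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Emit one token per level of the path; each token includes its ancestors."""
    tokens = []
    current = ""
    for i, part in enumerate(path.split(delimiter)):
        # Rebuild the path level by level, keeping the delimiter between parts.
        current = part if i == 0 else current + delimiter + part
        if current:
            tokens.append(current)
    return tokens


# Hypothetical category path, used only for illustration.
print(path_hierarchy_tokens("/cars/passenger/sport"))
```

Because every ancestor path ends up as a separate token, an exact match on a higher-level path such as /cars can also find documents stored deeper in the tree.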

Indexing data that is not flat


Not all data is flat like the data we have been using so far in this book. If we are building the system that Elasticsearch will be a part of, we can create a structure that is convenient for Elasticsearch. However, the structure can't always be flat, because not all use cases allow that. Let's see how to create mappings that use fully-structured JSON objects.

Data

Let's assume that we have the following data (we will store it in the file named structured_data.json):

{
  "book" : {
    "author" : {
      "name" : {
        "firstName" : "Fyodor",
        "lastName" : "Dostoevsky"
      }
    },
    "isbn" : "123456789",
    "englishTitle" : "Crime and Punishment",
    "year" : 1886,
    "characters" : [
      {
        "name" : "Raskolnikov"
      }, 
      {
        "name" : "Sofia"
      }
    ],
    "copies" : 0
  }
}

As you can see in the preceding code, the data is not flat; it contains arrays and nested objects. If we would like to create mappings...
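Before looking at mappings, it may help to see how such a structure can be addressed field by field. The following Python sketch walks the document and collects values under dotted paths, with arrays of objects contributing multiple values to the same path. It is only a simplified model of how inner object fields are addressed; the flatten function is our own helper, not an Elasticsearch API.

```python
def flatten(obj, prefix=""):
    """Collect leaf values under dotted field paths; arrays add multiple values."""
    fields = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = prefix + "." + key if prefix else key
            for name, values in flatten(value, path).items():
                fields.setdefault(name, []).extend(values)
    elif isinstance(obj, list):
        # Array elements share the path of the array itself.
        for item in obj:
            for name, values in flatten(item, prefix).items():
                fields.setdefault(name, []).extend(values)
    else:
        fields[prefix] = [obj]
    return fields


doc = {
    "book": {
        "author": {"name": {"firstName": "Fyodor", "lastName": "Dostoevsky"}},
        "isbn": "123456789",
        "englishTitle": "Crime and Punishment",
        "year": 1886,
        "characters": [{"name": "Raskolnikov"}, {"name": "Sofia"}],
        "copies": 0,
    }
}

for name, values in flatten(doc).items():
    print(name, values)
```

Note that the two characters end up as two values of a single book.characters.name path; the boundaries between the array elements are not preserved in this flat view.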

Using nested objects


Nested objects can come in handy in certain situations. Basically, with nested objects, Elasticsearch allows us to connect multiple documents together: one main document and multiple dependent ones. The main document and the nested ones will be indexed together and they will be placed in the same segment of the index (actually, in the same block), which guarantees the best performance we can get for this data structure. The same goes for changing the document; unless you are using the update API, you need to index the parent document and all the other nested documents at the same time.

Note

If you would like to read more about how nested objects work on the Lucene level, there is a very good blog post by Mike McCandless at http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html.

Now, let's get to our example use case. Imagine that we have a clothing shop and we store the size and color of each t-shirt. Our standard, non-nested mappings will look similar...
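To see why the size and color example benefits from nested objects, consider the following Python sketch. It compares two simplified ways of matching a t-shirt: one that treats all sizes and all colors as independent lists of values (a model of non-nested object arrays), and one that checks each size and color pair on its own (a model of nested documents). The variation field name, the sample data, and both helper functions are our own assumptions for illustration.

```python
shirt = {
    "name": "Test shirt",
    "variation": [
        {"size": "XXL", "color": "red"},
        {"size": "XL", "color": "black"},
    ],
}


def matches_flat(doc, size, color):
    """Non-nested model: sizes and colors are pooled, pairings are lost."""
    sizes = {v["size"] for v in doc["variation"]}
    colors = {v["color"] for v in doc["variation"]}
    return size in sizes and color in colors


def matches_nested(doc, size, color):
    """Nested model: both conditions must hold within the same variation."""
    return any(v["size"] == size and v["color"] == color for v in doc["variation"])


# There is no black XXL shirt, yet the flat model reports a match.
print(matches_flat(shirt, "XXL", "black"))
print(matches_nested(shirt, "XXL", "black"))
```

The flat model matches a combination that doesn't exist in the shop, while the nested model only matches pairs that were actually indexed together.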

Using the parent-child relationship


In the previous section, we discussed the ability to index nested documents along with the parent one. Even though the nested documents are indexed as separate documents in the index, we can't change a single nested document (unless we use the update API). Fortunately, Elasticsearch also allows us to have a real parent-child relationship, and we will look at it in this section.

Index structure and data indexing

Let's use the same example that we used when discussing the nested documents: the hypothetical clothing store. This time, however, we would like to have the ability to update sizes and colors without the need to index the whole document after each change.

Parent mappings

The only field we need to have in our parent document is name. We don't need anything more than that. So, in order to create our cloth type in the shop index, we will run the following commands:

curl -XPOST 'localhost:9200/shop'
curl -XPUT 'localhost:9200/shop/cloth/_mapping' -d '...
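The idea behind the parent-child relationship can be sketched with a small in-memory model in Python. It is a conceptual illustration under our own assumptions and not how Elasticsearch stores or routes documents: the ParentChildStore class, its methods, and the sample identifiers are all hypothetical. What it shows is that children are separate documents pointing to their parent, so changing one child doesn't require writing the parent again.

```python
class ParentChildStore:
    """Toy in-memory model: parents and children are separate documents."""

    def __init__(self):
        self.parents = {}    # parent id -> document
        self.children = {}   # child id -> (parent id, document)
        self.writes = []     # log of every (kind, id) write, in order

    def index_parent(self, doc_id, doc):
        self.parents[doc_id] = doc
        self.writes.append(("parent", doc_id))

    def index_child(self, doc_id, parent_id, doc):
        # The child only records its parent's id; the parent is not rewritten.
        self.children[doc_id] = (parent_id, doc)
        self.writes.append(("child", doc_id))

    def parents_with_child(self, **criteria):
        """Return ids of parents having at least one child matching all criteria."""
        found = set()
        for parent_id, doc in self.children.values():
            if all(doc.get(k) == v for k, v in criteria.items()):
                found.add(parent_id)
        return sorted(found)


store = ParentChildStore()
store.index_parent("1", {"name": "Test shirt"})
store.index_child("10", "1", {"size": "XXL", "color": "red"})
store.index_child("11", "1", {"size": "XL", "color": "black"})

# Change the color of one variation: only that child document is written again.
store.index_child("11", "1", {"size": "XL", "color": "green"})
print(store.writes[-1])
print(store.parents_with_child(size="XL", color="green"))
```

Compare this with the nested approach, where a change to a single size or color means indexing the main document together with all of its nested documents.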

Modifying your index structure with the update API


In the previous chapters, we discussed how to create index mappings and index the data. But what if you already have the mappings created and data indexed, but want to modify the structure of the index? This is possible to some extent. For example, by default, if we index a document with a new field, Elasticsearch will add that field to the index structure. Let's now look at how to modify the index structure manually.
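The default behavior described above, where a previously unseen field is added to the index structure, can be modeled with a few lines of Python. This is a deliberately simplified sketch: the add_dynamic_fields and infer_type helpers are our own, the phone and age fields are made up, and the model ignores things such as date detection, objects, and arrays.

```python
def infer_type(value):
    """Guess a field type from a JSON value (simplified model)."""
    if isinstance(value, bool):  # check bool first, as bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError("unsupported value in this sketch: %r" % (value,))


def add_dynamic_fields(properties, document):
    """Add mappings for fields not seen before; existing fields stay untouched."""
    added = []
    for field, value in document.items():
        if field not in properties:
            properties[field] = {"type": infer_type(value)}
            added.append(field)
    return added


properties = {"name": {"type": "string"}}
added = add_dynamic_fields(properties, {"name": "John", "phone": "123", "age": 30})
print(added)
print(properties)
```

The already mapped name field is left as it is, and only the new fields extend the structure.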

The mappings

Let's assume that we have the following mappings for our users index stored in the user.json file:

{
  "user" : {
    "properties" : {
      "name" : {"type" : "string"}
    }
  }
}

As you can see, it is very simple. It just has a single property that will hold the username. Now, let's create an index called users, and use the previous mappings to create our own type. To do that, we will run the following commands:

curl -XPOST 'localhost:9200/users'
curl -XPUT 'localhost:9200/users/user/_mapping' -d @user.json

If everything...

Summary


In this chapter, we learned how to index tree-like structures using Elasticsearch. In addition to that, we indexed data that is not flat and modified the structure of already-created indices. Finally, we learned how to handle relationships by using nested documents and by using the Elasticsearch parent-child functionality.

In the next chapter, we'll focus on making our search even better. We will see how Apache Lucene scoring works and why it matters so much. We will learn how to use the Elasticsearch function-score query to adjust the importance of our documents using different functions, and we'll leverage the provided scripting capabilities. We will search the content in different languages and discuss when index-time boosting makes sense. We'll use synonyms to match words with the same meaning and we'll learn how to check why a given document was found by a query. Finally, we'll influence queries with boosts, and we will learn how to understand the score calculation done by Elasticsearch...
