You're reading from  Learning Elastic Stack 6.0

Product type: Book
Published in: Dec 2017
Publisher: Packt
ISBN-13: 9781787281868
Edition: 1st Edition
Authors (2):
Pranav Shukla

Pranav Shukla is the founder and CEO of Valens DataLabs, a technologist, husband, and father of two. He is a big data architect and software craftsman who uses JVM-based languages. Pranav has diverse experience of over 14 years in architecting enterprise applications for Fortune 500 companies and start-ups. His core expertise lies in building JVM-based, scalable, reactive, and data-driven applications using Java/Scala, the Hadoop ecosystem, Apache Spark, and NoSQL databases. He is a big data engineering, analytics, and machine learning enthusiast.

Sharath Kumar M N

Sharath Kumar M N did his master's in computer science at the University of Texas, Dallas, USA. He is currently working as a senior principal architect at Broadcom. Prior to this, he worked as an Elasticsearch solutions architect at Oracle. He has given several tech talks at conferences such as Oracle Code events. Sharath is a certified trainer (Elastic Certified Instructor), one of the few technology experts in the world certified by Elastic Inc. to deliver its official "from the creators of Elastic" training. He is also a data science and machine learning enthusiast. In his free time, he likes playing with his lovely niece, Monisha; nephew, Chirayu; and his pet, Milo.

Chapter 3. Searching-What is Relevant

One of the core strengths of Elasticsearch is its search capabilities. In the previous chapter, we gained a good understanding of Elasticsearch's core concepts, its REST API, and its basic operations. With that knowledge in hand, we continue our journey through the Elastic Stack by looking at search. We will cover the following topics in this chapter:

  • Basics of text analysis
  • Searching from structured data
  • Searching from full text
  • Writing compound queries

Basics of text analysis


Analyzing text data is different from analyzing other types of data such as numbers, dates, and times. Numeric and date/time datatypes can be analyzed in a very definitive way. For example, if you are looking for all records with a price greater than or equal to 50, the result is a simple yes or no for each record: the record either qualifies for inclusion in the query's result or it doesn't. Similarly, when querying by date or time, the search criteria are clearly defined: a record either falls into the date/time range or it doesn't.
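
The yes-or-no nature of a numeric criterion can be sketched in a few lines of Python. The records and the `price` field below are made-up sample data, not the book's dataset, and the `range` query body is only built as a dictionary; nothing is sent to a cluster.

```python
import json

# Hypothetical sample records; not taken from the book's dataset.
records = [
    {"id": 1, "price": 35.0},
    {"id": 2, "price": 50.0},
    {"id": 3, "price": 120.5},
]


def price_at_least(record, minimum):
    """Return True if the record qualifies, False otherwise (no score involved)."""
    return record["price"] >= minimum


# Each record is either in or out of the result.
matching_ids = [r["id"] for r in records if price_at_least(r, 50)]

# The same criterion expressed as an Elasticsearch range query body.
range_query = {"query": {"range": {"price": {"gte": 50}}}}

print(matching_ids)
print(json.dumps(range_query, indent=2))
```

Every record either passes the check or it does not; there is no notion of one record matching "better" than another.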

However, the analysis of text/string data can be different. Text data comes in different forms, and it can be used for structured analysis or unstructured analysis.

Some examples of structured types of string fields are as follows: country codes, product codes, non-numeric serial numbers/identifiers, and so on. The datatype of these fields may be a string, but often...

Searching from structured data


In certain situations, we only want to find out whether a given document should be included in the result or not; that is, a simple binary answer. Other types of queries are relevance-based: they also return a score for each document to indicate how well that document fits the query. Most structured queries do not need relevance-based scoring, and the answer is a simple yes/no for any item to be included in or excluded from the result. These structured search queries are also referred to as term-level queries.

Let us understand the flow of a term-level query's execution:

Figure 3.2: Term-level query flow

As you can see, the figure is divided into two parts. The left half of the figure depicts what happens at the time of indexing, and the right half of the figure depicts what happens at query time when a term-level query is executed.

Looking at the left half of the figure, we can see what happens during indexing. Here, specifically...
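
As a rough illustration of the two halves of this flow, the toy Python model below stores each value of a keyword-style field as a single unmodified term at indexing time and looks up the query term unmodified at query time, so only an exact match succeeds. The `index_keyword` and `term_query` helpers, the `country_code` field, and the sample documents are hypothetical; Elasticsearch's real inverted index is far more elaborate.

```python
from collections import defaultdict


def index_keyword(docs, field):
    """Indexing time: store each field value as one unmodified term."""
    inverted = defaultdict(set)
    for doc_id, doc in docs.items():
        inverted[doc[field]].add(doc_id)
    return inverted


def term_query(inverted, term):
    """Query time: look up the term exactly as given, with no analysis."""
    return sorted(inverted.get(term, set()))


# Hypothetical documents with a structured string field.
docs = {
    1: {"country_code": "US"},
    2: {"country_code": "IN"},
    3: {"country_code": "US"},
}
inverted = index_keyword(docs, "country_code")

exact = term_query(inverted, "US")       # same term as indexed
wrong_case = term_query(inverted, "us")  # differs in case, so no match

print(exact, wrong_case)
```

Because neither side is analyzed, the query term has to match the stored term character for character.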

Searching from full text


Full-text queries can work on unstructured text fields. These queries are aware of the analysis process: they apply an analyzer to the search terms before performing the actual search operation. To find the right analyzer, Elasticsearch first checks whether a field-level search_analyzer is defined, and then whether a field-level analyzer is defined. If no analyzer is defined at the field level, it tries the analyzer defined at the index level.
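
The lookup order just described can be sketched as a small Python function. This is a simplified model, not Elasticsearch code: the `resolve_search_analyzer` name and the flat dictionary standing in for a field mapping are assumptions made for illustration, and the final fallback to the `standard` analyzer reflects Elasticsearch's default when nothing else is configured.

```python
def resolve_search_analyzer(field_mapping, index_default=None):
    """Pick the analyzer applied to search terms, in the order described above."""
    # 1. A field-level search_analyzer wins if present.
    if "search_analyzer" in field_mapping:
        return field_mapping["search_analyzer"]
    # 2. Otherwise fall back to the field-level analyzer.
    if "analyzer" in field_mapping:
        return field_mapping["analyzer"]
    # 3. Otherwise use the analyzer defined at the index level, if any.
    if index_default is not None:
        return index_default
    # 4. With nothing configured, the standard analyzer is used.
    return "standard"


# Hypothetical mappings for a text field.
both = {"type": "text", "analyzer": "english", "search_analyzer": "simple"}
field_only = {"type": "text", "analyzer": "english"}
bare = {"type": "text"}

print(resolve_search_analyzer(both))
print(resolve_search_analyzer(field_only))
print(resolve_search_analyzer(bare, index_default="whitespace"))
print(resolve_search_analyzer(bare))
```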

The full-text queries are thus aware of the analysis process on the underlying field and apply the right analysis process before forming the actual search queries. These analysis-aware queries are also called high-level queries. Let us understand how the high-level query flow works.

Here, we can see how one high-level query on the field title will be executed. Remember from our index definition earlier that the title field is of the text type. At indexing time, the value is analyzed using the analyzer...
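
The following toy Python model gives a feel for this flow: the same analysis step runs on the field value at indexing time and on the search text at query time, so terms that differ only in case still meet in the index. The `toy_analyze` function (lowercase, then split on non-alphanumeric characters) is a deliberate simplification rather than the real analyzer, and the helper names and sample titles are hypothetical.

```python
import re
from collections import defaultdict


def toy_analyze(text):
    """Lowercase and split on non-alphanumeric characters (a simplification)."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def index_text(docs, field):
    """Indexing time: analyze the value and store every resulting term."""
    inverted = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in toy_analyze(doc[field]):
            inverted[term].add(doc_id)
    return inverted


def match_query(inverted, text):
    """Query time: analyze the search text, then collect docs with any term."""
    hits = set()
    for term in toy_analyze(text):
        hits |= inverted.get(term, set())
    return sorted(hits)


# Hypothetical documents with a text-type title field.
docs = {
    1: {"title": "Gift Card - Blue"},
    2: {"title": "Wireless Mouse"},
}
inverted = index_text(docs, "title")
hits = match_query(inverted, "GIFT cards")

print(hits)
```

The search text is broken into the terms `gift` and `cards`; the first is present in the index for document 1, which is enough for a match in this any-term model. A real query would also compute a relevance score for each hit, which this sketch leaves out.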

Writing compound queries


This class of queries can be used to combine one or more queries to come up with a more complex query. Some compound queries convert scoring queries into non-scoring queries, and combine multiple scoring and non-scoring queries. We will look at the following compound queries:

  • Constant score query
  • Bool query
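
As a first taste of combining scoring and non-scoring clauses, the sketch below builds a bool query body in Python. The `must` and `filter` clause names are part of the Elasticsearch query DSL; the `bool_query` helper, the `title` and `price` fields, and the search text are placeholders chosen for illustration, and the body is only printed, not sent to a cluster.

```python
import json


def bool_query(scoring_clauses, filter_clauses):
    """Combine scoring clauses (must) with non-scoring clauses (filter)."""
    return {
        "query": {
            "bool": {
                "must": scoring_clauses,   # contribute to the relevance score
                "filter": filter_clauses,  # only include or exclude documents
            }
        }
    }


body = bool_query(
    scoring_clauses=[{"match": {"title": "wireless mouse"}}],
    filter_clauses=[{"range": {"price": {"gte": 50}}}],
)

print(json.dumps(body, indent=2))
```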

Constant score query

Elasticsearch supports querying both structured data and full text. While full-text queries need scoring mechanisms to find the best matching documents, structured searches don't need scoring. The constant score query allows us to convert a scoring query, which normally runs in a query context, to a non-scoring query that runs in a filter context. The constant score query is a very important tool in your toolbox.

For example, the term query is normally run in a query context. That means when Elasticsearch executes a term query, it not only filters the documents but also scores all of them:

GET /amazon_products/products/_search
{
  "query": {
    "term": {
      "manufacturer...
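
For comparison, here is a minimal Python sketch of how a term query can be wrapped in a constant score query so that it runs as a filter. The `constant_score`, `filter`, and `boost` keys belong to the Elasticsearch query DSL; the `constant_score` helper function, the `category_code` field, and its value are hypothetical stand-ins and are not the field used in the listing above.

```python
import json


def constant_score(filter_clause, boost=1.0):
    """Wrap a query so it only filters; every match gets the same score."""
    return {
        "query": {
            "constant_score": {
                "filter": filter_clause,
                "boost": boost,  # the score assigned to every matching document
            }
        }
    }


# Hypothetical field and value, used only for illustration.
body = constant_score({"term": {"category_code": "electronics"}})

print(json.dumps(body, indent=2))
```

With this wrapper, matching documents are not ranked against each other; each one simply receives the score given by `boost`.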

Summary


In this chapter, we took a deep dive into the search capabilities of Elasticsearch. We understood the role of analyzers and the anatomy of an analyzer. We have seen how to use some of the built-in analyzers that come with Elasticsearch, and we have also seen how to create custom analyzers. Along with a solid background in analyzers, we learned about two main types of queries: term-level queries and full-text queries. We also understood how to compose different queries into more complex queries using one of the compound queries.

This chapter provided you with sound knowledge to get a foothold for querying Elasticsearch data. There are many more types of queries supported by Elasticsearch, but we have covered the most essential ones. This should help you get started and help you understand other types of queries from the Elasticsearch reference documentation.

In Chapter 4, Analytics with Elasticsearch, we will learn about the analytics capabilities of Elasticsearch. With that chapter...
