
You're reading from  Learning Elasticsearch

Product type: Book
Published in: Jun 2017
Publisher: Packt
ISBN-13: 9781787128453
Edition: 1st Edition
Author (1)
Abhishek Andhavarapu

Abhishek Andhavarapu is a software engineer at eBay who enjoys working on highly scalable distributed systems. He has a master's degree in Distributed Computing and has worked on multiple enterprise Elasticsearch applications, which are currently serving hundreds of millions of requests per day. He began his journey with Elasticsearch in 2012 to build an analytics engine to power dashboards and quickly realized that Elasticsearch is like nothing out there for search and analytics. He has been a strong advocate since then and wrote this book to share the practical knowledge he gained along the way.

Different types of queries

Elasticsearch queries are executed using the Search API. As with everything else in Elasticsearch, requests and responses are represented in JSON.

At a high level, queries in Elasticsearch are divided as follows:

  • Structured queries: Structured queries are used to query numbers, dates, statuses, and so on. These are similar to the queries supported by a SQL database. For example, checking whether a number or date falls within a range, or finding all the employees with John as the first name
  • Full-text search queries: Full-text search queries are used to search text fields. When you send a full-text query to Elasticsearch, it first finds all the documents that match the query, and then the documents are ranked based on how relevant each document is to the query. We will discuss relevance in detail in the Relevance section

Both structured and full-text search queries...
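To make the two categories concrete, the following sketch builds one request body of each kind as plain Python dictionaries and prints them as JSON. It does not contact a cluster. The field names come from the sample product document used later in this chapter; the price bounds and the choice of a range query and a match query are illustrative assumptions, not necessarily the examples the book goes on to use.

```python
import json

# Structured query: a yes/no test on an exact value, here a numeric range.
# unit_price appears in this chapter's sample document; the bounds are made up.
structured_query = {
    "query": {
        "range": {
            "unit_price": {"gte": 50, "lte": 100}
        }
    }
}

# Full-text query: the query text is analyzed, and the matching documents
# are ranked by how relevant they are to it.
full_text_query = {
    "query": {
        "match": {
            "description": "biking jacket"
        }
    }
}

for name, body in (("structured", structured_query), ("full-text", full_text_query)):
    print(name)
    print(json.dumps(body, indent=2))
```

Either body could be sent to the _search endpoint of an index; only the full-text one is expected to produce meaningfully different scores across the matching documents.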

Sample data

To better explain the various concepts in this chapter, we will use an e-commerce site as an example. We will create an index with a list of products. This will be a very simple index called chapter6 with a type called product. The mapping for the product type is shown here:

# Delete existing index if any
DELETE chapter6

# Mapping
PUT chapter6
{
  "settings": {},
  "mappings": {
    "product": {
      "properties": {
        "product_name": {
          "type": "text",
          "analyzer": "english"
        },
        "description": {
          "type": "text",
          "analyzer": "english"
        }
      }
    }
  }
}

For the product_name and description fields, the English analyzer will be used instead of the default standard...

Querying Elasticsearch

One of the most powerful features of Elasticsearch is the Query DSL (Domain Specific Language), or the query language. The query language is very expressive and can be used to define filters, queries, sorting, pagination, and aggregations in the same query. To execute a search query, an HTTP request should be sent to the _search endpoint. The index and type on which the query should be executed are specified in the URL. Both are optional. If no index or type is specified, Elasticsearch executes the request across all the indexes in the cluster. A search query in Elasticsearch can be executed in two different ways:

  • By passing the search request as query parameters.
  • By passing the search request in the request body.

A simple search query using query parameters is shown here:

GET chapter6/product/_search?q=product_name:jacket

Simple queries can be executed...
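The two ways of sending a search can be compared side by side with a small sketch. The helper names uri_search_path and body_search are hypothetical and exist only for this illustration; nothing is sent over the network. The request-body form assumes that the q parameter is interpreted as a query_string query, which is how the Elasticsearch documentation describes it.

```python
import json
from urllib.parse import urlencode

def uri_search_path(index, doc_type, query):
    """Build the path for a search passed in the q query parameter."""
    # safe=":" keeps the field:value separator readable in the URL.
    return f"{index}/{doc_type}/_search?" + urlencode({"q": query}, safe=":")

def body_search(index, doc_type, query):
    """Build the path and JSON body for the same search sent in the request body."""
    path = f"{index}/{doc_type}/_search"
    body = {"query": {"query_string": {"query": query}}}
    return path, body

# Search passed as a query parameter.
print("GET", uri_search_path("chapter6", "product", "product_name:jacket"))

# The same search passed in the request body.
path, body = body_search("chapter6", "product", "product_name:jacket")
print("GET", path)
print(json.dumps(body, indent=2))
```

The query-parameter form is convenient for quick checks, while the request-body form leaves room for the filters, sorting, pagination, and aggregations mentioned earlier.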

Relevance

A traditional database usually contains structured data. A query on a database limits the data depending on different conditions specified by the user. Each condition in the query is evaluated as true/false, and the rows that don't satisfy the conditions are eliminated. However, full-text search is much more complicated. The data is unstructured, or at least the queries are.

We often need to search for the same text across one or more fields. The documents can be quite large, and the query word might appear multiple times in the same document and across several documents. Displaying all the results of the search will not help as there could be hundreds, if not more, and most documents might not even be relevant to the search.

To solve this problem, all the documents that match the query are assigned a score. The score is assigned based on how relevant each document...
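As a rough illustration of how such a score can be computed, the sketch below ranks three tiny documents with a BM25-style formula, which is the default similarity starting with Elasticsearch 5.0. This is a simplified model under stated assumptions: the documents are pre-tokenized lists of lowercase words invented for this example, the function name bm25_scores is hypothetical, and details such as analysis, per-shard statistics, and Lucene's compact encoding of field lengths are ignored.

```python
import math

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms, BM25 style."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            freq = doc.count(term)  # how often the term occurs in this document
            if freq == 0:
                continue
            doc_freq = sum(1 for d in docs if term in d)  # documents containing the term
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = freq + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * freq * (k1 + 1) / norm
        scores.append(score)
    return scores

# Made-up documents, already split into lowercase tokens.
docs = [
    "water resistant jacket".split(),
    "jacket for biking and hiking with a jacket liner".split(),
    "leather wallet".split(),
]

scores = bm25_scores(["jacket"], docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
for i in ranked:
    print(round(scores[i], 3), " ".join(docs[i]))
```

In this toy data the short first document ranks above the longer second one even though the second mentions jacket twice, because the length normalization outweighs the extra occurrence. The document without the term scores zero.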

Searching for the same value across multiple fields

The multi_match query is used to match the same value across multiple fields. When a user searches for biking jacket, searching just the product_name field might not find any matches. To widen the search, we should probably search the description field along with the product_name field. The document that contains both biking and jacket is shown here:

{
  "product_name": "Men's Water Resistant Jacket",
  "description": "Provides comfort during biking and hiking",
  "unit_price": 69.99,
  "reviews": 5,
  "release_date": "2017-03-02"
}

Scoring based on a single field is pretty straightforward, but scoring based on multiple fields gets tricky. We can't use a match query with an operator as the terms biking and jacket don't exist in...
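The situation described above can be checked with a short sketch. It first shows, using a naive substring test rather than real Elasticsearch analysis, that each search term lives in a different field of the sample document, and then builds a multi_match request body that searches both fields. The helper fields_containing is hypothetical, and the body uses only the query and fields parameters; any type or boosting options the book may go on to discuss are deliberately left out.

```python
import json

# The two text fields of the sample document shown above.
product = {
    "product_name": "Men's Water Resistant Jacket",
    "description": "Provides comfort during biking and hiking",
}

def fields_containing(term, doc):
    """Return the fields whose lowercased text contains the term (naive check)."""
    return [field for field, text in doc.items() if term in text.lower()]

# One query string searched across both fields.
multi_match_query = {
    "query": {
        "multi_match": {
            "query": "biking jacket",
            "fields": ["product_name", "description"]
        }
    }
}

for term in ("biking", "jacket"):
    print(term, "->", fields_containing(term, product))
print(json.dumps(multi_match_query, indent=2))
```

Because no single field holds both terms, a query restricted to one field that requires every term cannot match this document, whereas the multi_match body gives each field a chance to contribute.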

Caching

In Elasticsearch 5.0, a lot of refactoring has been done to support better caching. The different types of cache available are as follows:

  • Node Query cache: Queries that run in filter context are cached here
  • Shard request cache: The results of the entire query are cached here

Node Query cache

Queries that run in the filter context, such as numeric or date ranges, are great candidates for caching. Since they have no scoring phase, their results can be reused. The Node Query cache is a smart cache; you do not have to worry about invalidating the cache. Individual queries that run in the filter context are cached here. This cache is maintained at the node level, defaults to 10% of the heap, and can be configured using the elasticsearch...
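To show what running in the filter context looks like in a request, the sketch below builds a bool query whose full-text clause sits under must (scored) and whose price range sits under filter (not scored, and therefore a candidate for the cache described above). The function name filtered_search and the price bounds are assumptions for illustration, and the code only constructs the JSON body. Whether a particular filter actually ends up cached is decided by Elasticsearch.

```python
import json

def filtered_search(text, min_price, max_price):
    """Build a bool query with a scored match clause and an unscored range filter."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [
                    {"match": {"product_name": text}}
                ],
                # Filter context: a yes/no test with no scoring phase.
                "filter": [
                    {"range": {"unit_price": {"gte": min_price, "lte": max_price}}}
                ]
            }
        }
    }

body = filtered_search("jacket", 50, 100)
print(json.dumps(body, indent=2))
```

Keeping the range clause under filter rather than must leaves the ranking to the text match alone, while the price test simply narrows the set of candidate documents.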

Summary

In this chapter, you learned how to query Elasticsearch. We discussed the differences between structured queries and full-text search. We also discussed how to combine different queries using the bool query. We learned what relevance means and how it is calculated. We used factors such as price and release date to tune the relevance score.

In the next chapter, we will discuss more advanced features, such as location-based filtering, autocomplete, making suggestions based on the user query, and more.
