
You're reading from Vector Search for Practitioners with Elastic

Product type: Book
Published in: Nov 2023
Publisher: Packt
ISBN-13: 9781805121022
Edition: 1st Edition
Authors (2):
Bahaaldine Azarmi

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.

Jeff Vestal

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.

Getting Started with Vector Search in Elastic

Welcome to Getting Started with Vector Search in Elastic. In this chapter, we will examine the fundamental paradigm of search with Elastic and how vector search has emerged as a powerful tool for real-time, context-aware, accurate information retrieval.

In this chapter, we are going to cover the following topics:

  • Search experience in Elastic before the addition of vector search
  • The need for new representations such as vectors and Hierarchical Navigable Small World (HNSW)
  • A new vector data type
  • Different strategies for configuring the mapping, the challenges of storing vectors, and how to optimize the implementation
  • How to build queries including brute force, k-nearest neighbors (kNN), and exact match as a resource

Whether you are a seasoned Elastic user or just getting started, this chapter will provide valuable insights into the power of vector search in Elastic. Let’s...

Search experience in Elastic before vectors

Before the introduction of vector search in Elastic, the primary relevancy model was based on text search and analysis capabilities. Elasticsearch offers various data types (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) and analyzers (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-analyzers.html) to make search efficient. In this part, we will level-set so that we share the same understanding of what the “before state” looks like.
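To make the “before state” concrete, here is a minimal sketch of an index mapping written as a Python dictionary. The field names (`title`, `category`) are hypothetical and chosen only for illustration; the `text` and `keyword` types and the built-in `english` analyzer are standard Elasticsearch features. The sketch only builds the request body and does not send it to a cluster.

```python
import json

# Hypothetical fields for illustration; they are not taken from the book.
mapping = {
    "mappings": {
        "properties": {
            # Analyzed full-text field: the built-in "english" analyzer
            # tokenizes, lowercases, and stems the text before indexing.
            "title": {"type": "text", "analyzer": "english"},
            # Exact-value field: stored as a single unanalyzed term,
            # typically used for filtering, sorting, and aggregations.
            "category": {"type": "keyword"},
        }
    }
}

# Print the body as JSON, the form in which it would be sent to Elasticsearch.
print(json.dumps(mapping, indent=2))
```

The split between an analyzed field and an exact-value field is one simple way to picture how a field's type shapes the way it takes part in search.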

Data type and its impact on relevancy

There are various data types in Elasticsearch, but for the purpose of this part, it wouldn’t be useful to go through them one by one. Instead, we will divide them into two categories: the types that directly drive the relevancy ranking and the types that indirectly influence the ranking. The goal is to understand how they are related to relevancy models.

The first type that...

Evolution of search experience

We are now going to see how users’ demand for a better search experience requires us to consider techniques beyond keyword-based search. In this section, we will examine the limitations of keyword-based search, understand what vector representation entails, and see how the meta representation HNSW emerged to facilitate information retrieval with vectors.

The limits of keyword-based search

If you are comparatively new to the subject, then before we talk about vector representation, we need to understand why keyword-based search has reached its limits and no longer fully meets end-user requirements.

Keyword-based search relies on exact matches between the user query and the terms contained in documents, which could lead to missed relevant results if the search system is not refined enough with synonyms, abbreviations, alternative phrasings, and so on. Therefore, it is important for the search...
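The exact-match limitation can be illustrated with a toy sketch. It is a deliberately naive stand-in for keyword matching, not how Elasticsearch scores documents; the function names and sample documents are hypothetical.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a crude stand-in for an analyzer."""
    return set(text.lower().split())


def keyword_match(query, document):
    """Return True only if every query term appears verbatim in the document."""
    return tokenize(query) <= tokenize(document)


# Two documents that describe the same kind of product with different words.
docs = [
    "cheap laptop for students",
    "affordable notebook computer for college",
]

query = "cheap laptop"
matches = [d for d in docs if keyword_match(query, d)]
print(matches)  # only the first document shares the literal terms
```

The second document is arguably just as relevant, but it shares no literal term with the query, so it is missed unless synonyms or alternative phrasings are configured by hand.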

The new vector data type and the vector search query API

At this point in the chapter, you should have a good understanding of relevancy ranking in Elasticsearch and how vectors extend the capabilities of search into domains where keyword-based search cannot compete. We have also covered how vectors are organized into an HNSW graph, stored in memory in Elasticsearch, and the options to evaluate the distance between vectors. Now, we are going to put this knowledge into action by understanding the dense vector data type available in Elasticsearch, setting up our Elastic Cloud environment, and finally, building and running vector search queries.
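As a refresher on evaluating the distance between vectors, the sketch below implements three common measures (dot product, Euclidean distance, and cosine similarity) in plain Python, then uses cosine similarity for an exhaustive, brute-force nearest-neighbor scan. The document IDs and two-dimensional vectors are made up for illustration. This is not an HNSW implementation: an HNSW graph exists to avoid comparing the query against every stored vector.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; larger means more similar."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def brute_force_knn(query, vectors, k):
    """Score every stored vector against the query and keep the top k."""
    scored = [(cosine_similarity(query, v), doc_id) for doc_id, v in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical two-dimensional vectors keyed by document ID.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [0.9, 0.1],
}

print(brute_force_knn([1.0, 0.0], vectors, k=2))  # ['doc_a', 'doc_c']
```

The scan is exact but its cost grows linearly with the number of stored vectors, which is the trade-off that approximate approaches are designed to address.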

Sparse and dense vectors

Elasticsearch supports a data type in the mapping called dense_vector, which stores arrays of numeric values. These arrays are vector representations of the semantics of text. Dense vectors are what vector search and kNN search operate on.
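As a rough sketch of what this looks like in practice, the Python dictionaries below build a `dense_vector` mapping, an approximate kNN search body, and an exact (brute-force) `script_score` query. The field name `embedding`, the three-dimensional vector, and the `k` and `num_candidates` values are assumptions made for illustration; the parameter names follow the Elasticsearch 8.x reference, so check the documentation for the version you run. Nothing is sent to a cluster here.

```python
import json

# Hypothetical field name and a tiny dimension count, for illustration only.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 3,               # must equal the length of every stored vector
                "index": True,           # index the vectors so kNN search can use them
                "similarity": "cosine",  # measure used to compare vectors
            }
        }
    }
}

# Made-up query vector with the same number of dimensions as the mapping.
query_vector = [0.12, -0.45, 0.91]

# Approximate kNN search body.
knn_search = {
    "knn": {
        "field": "embedding",
        "query_vector": query_vector,
        "k": 5,                # number of nearest neighbors to return
        "num_candidates": 50,  # candidates examined per shard before picking k
    }
}

# Exact brute-force alternative: score every matching document with a script.
exact_search = {
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                # Adding 1.0 keeps the score non-negative.
                "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    }
}

for body in (mapping, knn_search, exact_search):
    print(json.dumps(body, indent=2))
```

The approximate form trades a little accuracy for speed, while the scripted form compares the query against every document that the inner query matches.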

The documentation...

Summary

At this stage of the book, you should have a good understanding of the fundamentals of vector search, including vector representation, how vectors are organized in an HNSW graph, and the methods used to calculate similarity between vectors. In addition, we have seen how to set up your Elastic Cloud environment as well as your Elasticsearch mapping to run vector search queries and leverage the k-nearest neighbors algorithm.

Now, you are equipped with the fundamental knowledge to explore the subsequent chapters, where we will work through code examples that apply vector search in fields such as observability and security.

In the following chapter, we will go a step further – we’ll not only learn how to host a model and generate vectors within Elasticsearch, as opposed to handling it externally, but also explore the intricacies of managing it at different scales and optimizing a deployment from a resource standpoint.

