You're reading from Vector Search for Practitioners with Elastic

Product type: Book

Published in: Nov 2023

Publisher: Packt

ISBN-13: 9781805121022

Edition: 1st Edition

Concepts

Data Analysis

Authors (2):

Bahaaldine Azarmi

Jeff Vestal


Getting Started with Vector Search in Elastic

Welcome to Getting Started with Vector Search in Elastic. This chapter introduces the fundamental paradigm of search with Elastic and shows how vector search has emerged as a powerful tool for real-time, context-aware, and accurate information retrieval.

In this chapter, we are going to cover the following topics:

Search experience in Elastic before the addition of vector search
The need for new representations such as vectors and Hierarchical Navigable Small World (HNSW)
A new vector data type
Different strategies to configure the mapping, the challenges of storing vectors, and how to optimize the implementation
How to build queries including brute force, k-nearest neighbors (kNN), and exact match as a resource

Whether you are a seasoned Elastic user or just getting started, this chapter will provide valuable insights into the power of vector search in Elastic. Let’s...

Search experience in Elastic before vectors

Before the introduction of vector search in Elastic, the primary relevancy model was based on text search and analysis capabilities. Elasticsearch provides various data types (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html) and analyzers (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-analyzers.html) to provide efficient search. In this section, we will level-set so that we all share an understanding of what the “before state” looks like.

Data type and its impact on relevancy

There are various data types in Elasticsearch, but for the purpose of this section, it wouldn’t be useful to go through them one by one. Instead, we will divide them into two categories: the types that directly drive the relevancy ranking and the types that indirectly influence it. The goal is to understand how they relate to relevancy models.

The first type that...

Evolution of search experience

We are now going to see how users’ demand for a better search experience requires us to consider techniques beyond keyword-based search. In this section, we will examine the limitations of keyword-based search, understand what vector representation entails, and see how the HNSW meta representation emerged to facilitate information retrieval with vectors.

The limits of keyword-based search

For those of you who are comparatively new to the subject, before we talk about vector representation, we need to understand why keyword-based search has reached its limits in the industry and no longer fully meets end-user requirements.

Keyword-based search relies on exact matches between the user query and the terms contained in documents, which can cause relevant results to be missed if the search system is not refined enough with synonyms, abbreviations, alternative phrasings, and so on. Therefore, it is important for the search...
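To make the exact-match limitation concrete, here is a minimal sketch in plain Python. It is a toy illustration only: the `tokenize` and `keyword_match` helpers and the sample documents are hypothetical, and the whitespace tokenizer is a stand-in for a real Elasticsearch analyzer rather than a reproduction of one.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return set(text.lower().split())


def keyword_match(query, documents):
    """Return the documents that share at least one exact term with the query."""
    query_terms = tokenize(query)
    return [doc for doc in documents if query_terms & tokenize(doc)]


# Hypothetical sample documents: both describe the same kind of product.
documents = [
    "affordable laptop with long battery life",
    "lightweight notebook computer for travel",
]

hits = keyword_match("laptop", documents)
print(hits)
```

Only the first document is returned. The second one is arguably just as relevant, but it uses "notebook computer" instead of "laptop", so an exact term match cannot find it unless a synonym list is maintained by hand.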

The new vector data type and the vector search query API

At this point in the chapter, you should have a good understanding of relevancy ranking in Elasticsearch and how vectors extend the capabilities of search into domains where keyword-based search cannot compete. We have also covered how vectors are organized into an HNSW graph, stored in memory in Elasticsearch, and the options to evaluate the distance between vectors. Now, we are going to take this knowledge and put it into action by understanding the dense vector data type available in Elasticsearch, setting up our Elastic Cloud environment, and finally, building and running vector search queries.
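As a refresher on what evaluating the distance between vectors means in practice, the following sketch runs a brute-force nearest-neighbor search using cosine similarity in plain Python. The function names, document IDs, and two-dimensional vectors are made up for illustration. This is not how Elasticsearch implements search internally; an HNSW graph exists precisely to avoid scanning every vector as this code does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_knn(query, vectors, k):
    """Score every stored vector against the query and keep the k most similar."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical two-dimensional vectors; real embeddings have hundreds of dimensions.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}

top = brute_force_knn([1.0, 0.1], vectors, k=2)
print([doc_id for doc_id, _ in top])
```

The query points almost exactly along the first axis, so `doc_a` ranks first and `doc_c` second. The cost grows linearly with the number of stored vectors, which is the motivation for approximate approaches.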

Sparse and dense vectors

Elasticsearch supports a new mapping data type called dense_vector. It is used to store arrays of numeric values. These arrays are vector representations of the semantics of text. Ultimately, dense vectors are leveraged in the context of vector search and kNN search.
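As a rough sketch of what this looks like, the Python dictionaries below build a mapping with a dense_vector field and a kNN search body. The field name `embedding`, the three-dimensional vector, and the chosen values are assumptions made for illustration. The parameter names (`dims`, `index`, `similarity`, `query_vector`, `k`, `num_candidates`) follow the Elasticsearch 8.x documentation as we understand it, so verify them against the version you are running.

```python
import json

DIMS = 3  # assumed dimension; it must match the output size of your embedding model

# Mapping body for index creation: one dense_vector field (field name is hypothetical).
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": DIMS,
                "index": True,           # index the field so it can be used for kNN search
                "similarity": "cosine",  # other documented options: l2_norm, dot_product
            }
        }
    }
}

# Search request body using the top-level knn option.
query_vector = [0.12, -0.45, 0.91]  # made-up values standing in for a real embedding
knn_search = {
    "knn": {
        "field": "embedding",
        "query_vector": query_vector,
        "k": 5,                 # number of nearest neighbors to return
        "num_candidates": 50,   # candidates examined per shard before keeping the top k
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(knn_search, indent=2))
```

These bodies would be sent when creating an index and when calling the search API, respectively. The length of `query_vector` has to equal the `dims` value declared in the mapping.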

The documentation...

Summary

At this stage of the book, you should have a good understanding of the fundamentals of vector search, including vector representation, how vectors are organized in an HNSW graph, and the method to calculate similarity between vectors. In addition, we have seen how to set up your Elastic Cloud environment as well as your Elasticsearch mapping to run vector search queries and leverage the k-nearest neighbors algorithm.

Now, you are equipped with the fundamental knowledge to explore all the subsequent chapters. We’ll explore application domains for vector search through various code examples, in fields such as observability and security.

In the following chapter, we will go a step further – we’ll not only learn how to host a model and generate vectors within Elasticsearch, as opposed to handling it externally, but also explore the intricacies of managing it at different scales and optimizing a deployment from a resource standpoint.


Authors (2)

Bahaaldine Azarmi

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.

Jeff Vestal

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.
