Reader small image

You're reading from  Vector Search for Practitioners with Elastic

Product typeBook
Published inNov 2023
PublisherPackt
ISBN-139781805121022
Edition1st Edition
Right arrow
Authors (2):
Bahaaldine Azarmi
Bahaaldine Azarmi
author image
Bahaaldine Azarmi

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.
Read more about Bahaaldine Azarmi

Jeff Vestal
Jeff Vestal
author image
Jeff Vestal

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.
Read more about Jeff Vestal

View More author details
Right arrow

ML node capacity

For embedding models run on ML nodes in Elasticsearch, you will need to plan to ensure your nodes have enough capacity to run the model at inference time. Elastic Cloud allows the auto-scaling of ML nodes based on CPU requirements, which allows them to scale up and out when more compute is required and scale down when those requirements are reduced.

We cover tuning ML nodes for inference in the next chapter in more detail, but minimally, you will need an ML node with enough RAM to load at least one instance of the embedding model. As your performance requirements increase, you can increase the number of allocations of the individual model as well as the number of threads allocated per allocation.

To check the size of a model and the amount of memory (RAM) required to load the model, you can run the get trained models statistics API (for more information on this API, visit the documentation page at https://www.elastic.co/guide/en/elasticsearch/reference/current...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Vector Search for Practitioners with Elastic
Published in: Nov 2023Publisher: PacktISBN-13: 9781805121022

Authors (2)

author image
Bahaaldine Azarmi

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.
Read more about Bahaaldine Azarmi

author image
Jeff Vestal

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.
Read more about Jeff Vestal