You're reading from Machine Learning Engineering on AWS

Product type: Book
Published in: Oct 2022
Publisher: Packt
ISBN-13: 9781803247595
Edition: 1st Edition
Author (1)
Joshua Arvin Lat

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO for three Australian-owned companies and as director of software development and engineering for multiple e-commerce start-ups in the past. Years ago, he and his team won first place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and has shared his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management.

Deploying a pre-trained model to an asynchronous inference endpoint

In addition to real-time and serverless inference endpoints, SageMaker offers a third option for deploying models: asynchronous inference endpoints. Why is it called asynchronous? Instead of returning results immediately, the endpoint queues incoming requests and makes the results available asynchronously. This works for ML requirements that involve one or more of the following:

  • Large input payloads (up to 1 GB)
  • A long prediction processing duration (up to 15 minutes)
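As a quick illustration, the two limits above can be turned into a small pre-flight check. This is only a sketch: the function and constant names are hypothetical, and treating 1 GB as 10^9 bytes is an assumption, since the text does not say whether decimal or binary gigabytes are meant.

```python
# Limits quoted in the list above. Treating 1 GB as 10**9 bytes is an assumption.
MAX_PAYLOAD_BYTES = 1_000_000_000
MAX_PROCESSING_SECONDS = 15 * 60


def fits_async_limits(payload_bytes: int, processing_seconds: float) -> bool:
    """Return True if a request stays within both limits listed above."""
    return (
        0 <= payload_bytes <= MAX_PAYLOAD_BYTES
        and 0 <= processing_seconds <= MAX_PROCESSING_SECONDS
    )


# A 300 MB video that needs about four minutes of processing is within the limits.
print(fits_async_limits(300 * 10**6, 4 * 60))

# A 2 GB payload exceeds the payload limit, however fast the model is.
print(fits_async_limits(2 * 10**9, 30))
```

The estimated processing time would normally come from benchmarking the model on representative inputs; the values here are made up for illustration.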

A good use case for asynchronous inference endpoints is an ML model that detects objects in large video files, where processing a single file may take more than 60 seconds. In this case, an inference may take a few minutes instead of a few seconds.

How do we use asynchronous inference endpoints? To invoke one, we do the following:

  1. The request payload is...
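The numbered steps are cut off in this excerpt, so the following is not the book's procedure. It is a minimal sketch of one common pattern with boto3, assuming the payload has already been uploaded to Amazon S3 and the endpoint writes its result to S3. The helper names, the endpoint name, and the bucket names are hypothetical. `invoke_endpoint_async` (on the `sagemaker-runtime` client) and `get_object` (on the `s3` client) are real boto3 calls. The clients are passed in as arguments so the helpers can be exercised without AWS credentials.

```python
import time


def submit_async_request(runtime_client, endpoint_name, input_s3_uri,
                         content_type="application/octet-stream"):
    """Queue one request and return the S3 URI where the result will be written."""
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,  # the payload is referenced by S3 URI, not sent inline
        ContentType=content_type,
    )
    return response["OutputLocation"]


def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    bucket, _, key = s3_uri[len("s3://"):].partition("/")
    return bucket, key


def wait_for_result(s3_client, output_s3_uri, poll_seconds=5.0, timeout_seconds=900.0):
    """Poll S3 until the output object exists, then return its bytes."""
    bucket, key = split_s3_uri(output_s3_uri)
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            return s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        except s3_client.exceptions.NoSuchKey:
            # The result has not been written yet; retry until the deadline passes.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"no result at {output_s3_uri} after {timeout_seconds}s")
            time.sleep(poll_seconds)


# Usage with real clients (requires boto3, AWS credentials, and a deployed endpoint;
# all names below are placeholders):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   s3 = boto3.client("s3")
#   output_uri = submit_async_request(runtime, "my-async-endpoint",
#                                     "s3://my-bucket/input/video.mp4")
#   result = wait_for_result(s3, output_uri)
```

Polling is the simplest way to collect the result, but it is a design choice made for brevity here. Note also that if the caller lacks permission to list the bucket, S3 may report a missing object as an access error rather than `NoSuchKey`, which this sketch does not handle.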