Hands-On Serverless Deep Learning with TensorFlow and AWS Lambda

By Rustem Feyzkhanov

About this book

One of the main problems with deep learning models is finding the right way to deploy them within the company's IT infrastructure. Serverless architecture changes the rules of the game—instead of thinking about cluster management, scalability, and query processing, it allows us to focus specifically on training the model. This book prepares you to use your own custom-trained models with AWS Lambda to achieve a simplified serverless computing approach without spending much time and money. You will use AWS Services to deploy TensorFlow models without spending hours training and deploying them. You'll learn to deploy with serverless infrastructures, create APIs, process pipelines, and more with the tips included in this book.

By the end of the book, you will have implemented your own project that demonstrates how to use AWS Lambda effectively so as to serve your TensorFlow models in the best possible way.

Publication date:
January 2019


Chapter 1. Beginning with Serverless Computing and AWS Lambda

This book will encourage you to use your own custom-trained models with AWS Lambda and work with a simplified serverless computing approach. Later on, you will implement sample projects that demonstrate the use of AWS Lambda for serving TensorFlow models.

In this chapter, we will discuss serverless deep learning, and you will explore why serverless is so popular and what the advantages of deploying applications with it are. You will also look into the data science process and how serverless provides an easy and convenient way to deploy deep learning applications, and briefly preview the sample projects that we will build in the forthcoming chapters.

You will also learn how the AWS implementation works, including the traditional and serverless ways of deploying deep learning applications.

In particular, we will cover the following topics: 

  • What is serverless computing?
  • Why serverless deep learning? 
  • AWS Lambda function
  • Sample projects

What is serverless computing?

Serverless computing is a type of architecture in which code execution is managed by a cloud provider, which means that developers do not have to worry about provisioning, managing, and maintaining servers when deploying their code.


Let's discuss the possible ways of application deployment:

  • On-premise deployment lets you control the entire infrastructure, including the hardware. In other words, the application runs on your own machines, which you can access physically.
  • Then, you have Infrastructure as a Service (IaaS), which means that you can't access the servers physically, but you control everything that happens on them.
  • Next, you have Platform as a Service (PaaS), where you don't control the operating system or runtime, but you do control your code and container.
  • Finally, you have Function as a Service (FaaS), which is a serverless model: the only thing you control is the code itself. This can make it significantly easier to work on different applications.

Why serverless deep learning?

Let's understand why serverless infrastructure is extremely useful for deploying deep learning models in a data science process.

The usual data science process looks like this:

  • Business understanding: You need to understand the business needs, which includes defining the objectives and the possible data sources.
  • Data acquisition: You need to look into the data you are planning to use, explore it, and try to find correlations and gaps.
  • Modeling: You start by selecting the most promising features, then build the model and train it.
  • Deployment: You need to operationalize the model and deploy it.
  • Customer acceptance: You can provide the result to the customer and receive feedback.

The preceding points are represented in the following diagram:


Based on the feedback received from the customer, you can update the model and change the way you deploy it. The deployment and customer acceptance phases are iterative in nature, which means that you need to get feedback from the user as early as possible. To achieve this, your deployment infrastructure has to be both simple and scalable, and serverless infrastructure provides both for deploying deep learning models.

You also need to be aware that your deep learning infrastructure has to integrate with your existing infrastructure.

Where serverless deep learning works and where it doesn't

Serverless deep learning deployment is very scalable, simple, and cheap to start with. Its downsides are the limits on execution time, CPU, and memory.

Where serverless deep learning works

Let's start by reiterating the advantages of serverless deep learning deployment:

  • It is extremely useful for your own project. If you have trained a model and want to show it to the world, serverless deep learning lets you do so without a complicated deployment or any upfront costs. AWS also provides a free usage allowance each month, which means that a certain number of AWS Lambda invocations are completely free. For example, for the image recognition project that we will discuss in the following chapters, this comes to about 100,000 runs.
  • Serverless deep learning is great for early-stage startups that want to test their business hypotheses as early as possible. Its simplicity and scalability allow small teams to start without expertise in AWS. AWS Lambda also makes it easy to calculate the cost per customer and so to understand your startup's cost per user.
  • Serverless deep learning is extremely convenient if you already have an existing infrastructure and want to optimize your costs. A serverless architecture will be a lot simpler and cheaper than a cluster-based one, and it will be easier to maintain. It also reduces costs significantly, since you don't need to keep unused servers running.
  • Serverless deep learning would be extremely useful for cases where you have extremely high loads. For example, a lot of companies struggle to maintain the system in cases where there are 1 million requests during the first minute and zero requests in the next minute. The cluster will either be too large or it will take a certain time to scale. Serverless, on the other hand, has unmatched scalability, which allows the system to work on high load without rolling.

Where serverless deep learning doesn't work

Serverless deep learning will not work well in the following situations:

  • If one of the main requirements of your system is a real-time response from a very complex model, for example, as part of an interaction between your user and the system, then the serverless architecture may not be enough for you. AWS Lambda has a cold start delay, plus the delay of unloading and loading the model into memory. It does work fast, but a run may take several seconds or more. Speed depends heavily on the size and complexity of the model, so this is something you have to test beforehand.
  • Serverless deep learning may fail if your model uses a lot of data. AWS Lambda has certain limits, such as 3 gigabytes of memory at run time and half a gigabyte of disk space, which means you either have to optimize your code's memory usage or use a cluster.
  • If your model has requirements on CPU power or the number of cores, then it may not be able to start on Lambda. There are no fixed thresholds that predict whether your model will be able to start on AWS Lambda, so this is something you need to test.
  • An extremely complex model may not work well on serverless infrastructure. By complex, we mean larger than 1 or 2 gigabytes. It would take more time to download it from S3, and Lambda may not have enough memory to load it.
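
One common way to soften the model-loading delay mentioned above is to load the model once per container and reuse it on later invocations. The following is a minimal sketch of that idea, not the book's implementation: `load_model` is a hypothetical stand-in for downloading the model from S3 and loading it with TensorFlow, and the `handler(event, context)` signature follows the usual Python convention for Lambda entry points.

```python
import time

# Module-level cache: it survives between invocations for as long as the
# same container stays warm, so the slow load happens once per container.
_MODEL = None


def load_model():
    """Hypothetical stand-in for downloading a model from S3 and loading it."""
    time.sleep(0.05)  # simulates the slow download-and-load step
    return {"name": "demo-model"}


def get_model():
    """Load the model on the first call only, then return the cached copy."""
    global _MODEL
    if _MODEL is None:
        _MODEL = load_model()
    return _MODEL


def handler(event, context):
    """Entry point: only a cold start pays the model-loading cost."""
    model = get_model()
    return {"model": model["name"], "input": event.get("input")}
```

Calling `handler` twice in the same process loads the model only on the first call. A cold start still pays the full cost, so the advice to test your own model beforehand still applies.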

The use cases above map out where serverless deep learning fits and should help you decide whether to use it. In a lot of cases, though, there isn't a definitive answer, and it makes sense to go ahead and test your model on serverless.

Now we'll discuss the Lambda function as a serverless model.



Lambda function – AWS implementation of FaaS

In this section, we will discuss how the Lambda function, the AWS implementation of FaaS, works. The AWS service stores the Lambda configuration, which is basically the code, libraries, and parameters. Once it receives a trigger, it takes a container from the pool and puts the configuration inside it. Then, it runs the code inside the container with the data from the event trigger. Once the container produces a result, the service returns it in the response.

The following diagram is a representation of the workings of the Lambda function:

Lambda scales automatically, up to 10,000 concurrent executions. Lambda is also priced as a pay-per-use service, so you only pay for each invocation of your function and you pay nothing when it doesn't run.

The Lambda configuration consists of the following:

  • Code: This is what you want to run within the function. The code needs to have an explicit declaration of the function that you want the service to run.
  • Libraries: These enable us to run more complicated processes. You will need to keep them inside the same package as the code itself.
  • Configurations: These are various parameters that dictate how Lambda works.
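
For the Python runtime, the "explicit declaration of the function" is conventionally a function that takes an event and a context. The sketch below is only an illustration: the name `handler` and the `name` field in the event are assumptions chosen for this example, not something the text prescribes.

```python
def handler(event, context):
    """Entry point that the Lambda service calls with the trigger's event data."""
    # `event` carries the input from the trigger; `context` holds runtime
    # information and is unused in this sketch.
    name = event.get("name", "world")
    return {"message": "Hello, " + name + "!"}


# Local call with a hand-written event, useful for a quick check before deploying.
result = handler({"name": "Lambda"}, None)
```

The returned dictionary is what the service would pass back as the response.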

The main parameters are as follows:

  • The allocated memory and the timeout
  • The runtime (for example, Python or Node.js)
  • Triggers, which we will describe in the next section
  • IAM role, which gives Lambda access to other AWS services
  • Environment variables, which allow us to pass custom input parameters to our code
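
In Python, the environment parameters from the list above are available through `os.environ`. The sketch below shows one way to read them with defaults. The variable names `MODEL_BUCKET`, `MODEL_KEY`, and `TOP_K` and their default values are hypothetical, chosen only for illustration.

```python
import os


def read_settings(environ=os.environ):
    """Collect settings from environment variables, falling back to defaults."""
    return {
        # All three names and defaults are assumptions for this example.
        "model_bucket": environ.get("MODEL_BUCKET", "example-bucket"),
        "model_key": environ.get("MODEL_KEY", "model.pb"),
        "top_k": int(environ.get("TOP_K", "5")),  # values arrive as strings
    }


# Passing a plain dictionary makes the function easy to try out locally.
settings = read_settings({"TOP_K": "3"})
```

Keeping such values outside the code lets you change them in the Lambda configuration without repackaging the function.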

Lambda triggers

Various AWS services can act as a trigger for AWS Lambda. They include:

  • DynamoDB: This enables us to start the Lambda function on each new entry to the database
  • Amazon S3: This enables us to start the Lambda function on new files in the bucket
  • CloudWatch: This enables us to run Lambda functions according to a schedule (for example, each minute, each day, or only at noon each Thursday)
  • Lex: This enables us to start the Lambda function from a chatbot conversation
  • Kinesis, SQS, and SNS: These enable us to start the Lambda function on each object in the event stream

There are a lot of different triggers, which means you can bind Lambda with a lot of different services.
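
Each trigger delivers its data to the function as an event. As an illustration, the sketch below reads the bucket name and object key from an Amazon S3 event notification. The `Records`, `s3`, `bucket`, and `object` fields follow the S3 notification format, while the bucket and file names in `sample_event` are made up.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return the (bucket, key) pair for every file reported in the event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, so decode them before use.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded


# Hand-written event with hypothetical names, trimmed to the fields used above.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "images/cat+1.jpg"}}}
    ]
}
files = handler(sample_event, None)
```

Other triggers use different event layouts, so a function usually needs to be written with its trigger in mind.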

Why deep learning on AWS Lambda?

Here, you will see the advantages of AWS Lambda:

  • Coding on AWS Lambda is very easy. You just need to package your code and libraries; you don't need Docker containers. This lets you start early and deploy the same code that you would run locally, which makes it perfect for early-stage projects.
  • AWS Lambda is extremely scalable and, more importantly, you don't have to manage the scaling or write separate logic for it. Your data science application will be able to easily process a large number of tasks or work with multiple users.
  • AWS Lambda is priced conveniently. You only need to pay for what you're actually using and the price itself is very affordable. For example, for the image recognition model, the cost will be $1 for 20,000 to 30,000 runs.
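
To make the pricing figure above concrete, the following sketch turns "$1 for 20,000 to 30,000 runs" into a cost range for a given number of runs. The workload of 1,000,000 runs is a hypothetical number chosen for the example; actual costs depend on your model's memory setting and run time.

```python
def cost_range(runs, runs_per_dollar_low=20_000, runs_per_dollar_high=30_000):
    """Return the (cheapest, most expensive) cost in dollars for `runs` runs."""
    cheapest = runs / runs_per_dollar_high  # $1 buys 30,000 runs
    priciest = runs / runs_per_dollar_low   # $1 buys only 20,000 runs
    return cheapest, priciest


# Hypothetical workload: one million image recognition runs.
low, high = cost_range(1_000_000)
summary = f"${low:.2f} to ${high:.2f}"
```

For one million runs, this gives roughly $33 to $50, before subtracting any free usage.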

In the next section, you will learn the difference between traditional and serverless architecture using Lambda.


Traditional versus Serverless architecture using Lambda

Let's look at the difference between traditional and serverless architecture. The following diagram represents the deep learning API through traditional architecture:

In the traditional architecture, you not only have to handle the cluster itself, but you also have to handle all the balancing of API requests. You also have to manage the Docker container with the code and libraries and find a way to deploy it using the container registry.

You need extensive knowledge of AWS to understand this architecture. Although it is not very difficult, it can be a real obstacle when you are starting out. Keep in mind, too, that this AWS architecture for deep learning has static costs.

Let's discuss the serverless implementations of the above application. The following diagram represents the architecture for deep learning using Lambda:

In the preceding diagram, you can see that it looks a lot simpler than the traditional architecture. You don't need to manage load balancing, scalability, or containers; you just need to put in your code and libraries, and Lambda manages everything else. You can also build a number of different prototypes with it, and you will only pay for invocations. This makes Lambda the perfect way to make your deep learning model available to users. In the next section, you will briefly be introduced to the projects that you will develop in this book.


Sample projects

In this section, you will get an overview of the projects that you will develop during the course of this book. You will create three projects:

  • Deep learning API
  • Deep learning batch processing
  • Serverless deep learning workflow

Deep learning API

The deep learning API project provides a great hands-on experience since you are able to see results immediately from your browser. You will start with the deep learning API for image recognition. Image recognition is one of the tasks where deep learning shows incredible results that other approaches cannot match. You will be using a contemporary, publicly available, pre-trained Inception model (version 3). This project will also show you how easy it is to take an open source model and create an API interface on top of it.
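
As a rough preview of what such an API function can look like, the sketch below returns a JSON response in the shape used by API Gateway's Lambda proxy integration (`statusCode`, `headers`, `body`). The use of API Gateway is an assumption here, as is the `url` query parameter, and `classify` is a hypothetical stand-in for running the pre-trained model.

```python
import json


def classify(image_url):
    """Hypothetical stand-in for running the image recognition model."""
    return [{"label": "placeholder", "score": 1.0}]


def handler(event, context):
    """Read the image URL from the query string and return predictions as JSON."""
    # The field is null when the request has no query string.
    params = event.get("queryStringParameters") or {}
    image_url = params.get("url")
    if not image_url:
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing 'url' parameter"})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"url": image_url, "predictions": classify(image_url)}),
    }


# Local call with a hand-written event and a made-up URL.
response = handler({"queryStringParameters": {"url": "https://example.com/cat.jpg"}}, None)
```

The body must be a string, which is why the predictions are serialized with `json.dumps` rather than returned as a dictionary.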

Deep learning batch processing

In the deep learning batch processing project, you will take a closer look at how a lot of companies run deep learning applications nowadays. In this project, you will build deep learning batch processing for image recognition. It will show you how Lambda's high scalability allows you to process thousands of prediction jobs at the same time.
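
The fan-out idea behind batch processing can be tried locally before any AWS setup. In the sketch below, a thread pool stands in for parallel Lambda invocations; `predict` is a hypothetical placeholder for one invocation, and the image names are made up. It illustrates the pattern only and is not the project's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def predict(image_key):
    """Hypothetical stand-in for one Lambda invocation on one image."""
    return {"image": image_key, "label": "placeholder"}


def run_batch(image_keys, max_workers=8):
    """Run all predictions concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(predict, image_keys))


# Made-up batch of twenty image names.
results = run_batch([f"images/{i}.jpg" for i in range(20)])
```

On Lambda, the concurrency comes from the service running many copies of the function at once rather than from a local pool.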

Serverless deep learning workflow

In the serverless deep learning workflow project, you will explore the patterns for using deep learning models on serverless infrastructure. You will build a serverless deep learning workflow for image recognition. This project will show you how to use contemporary deployment techniques with AWS Step Functions. You will also learn how to conduct A/B testing of the model during deployment, handle errors, and build a multistep process. This project will help you to understand the possible applications of serverless deployment and how to apply this knowledge to either your personal project or within your company.
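
To give a feel for such a workflow, the sketch below builds a small state machine definition in the Amazon States Language as a Python dictionary. A Choice state routes each request to model A or model B, and a Catch rule sends failures to an error state. Everything specific is an assumption for illustration: the state names, the `variant` input field, and the placeholder function ARNs.

```python
import json

# Placeholder prefix; a real definition needs the ARNs of deployed functions.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:"


def build_workflow():
    """Return a state machine with an A/B choice, error handling, and two steps."""
    catch = [{"ErrorEquals": ["States.ALL"], "Next": "HandleError"}]
    return {
        "StartAt": "ChooseModel",
        "States": {
            "ChooseModel": {
                "Type": "Choice",
                # `variant` is a hypothetical field in the workflow input.
                "Choices": [{"Variable": "$.variant", "StringEquals": "B",
                             "Next": "PredictWithModelB"}],
                "Default": "PredictWithModelA",
            },
            "PredictWithModelA": {"Type": "Task", "Resource": LAMBDA_ARN + "model-a",
                                  "Catch": catch, "Next": "StoreResult"},
            "PredictWithModelB": {"Type": "Task", "Resource": LAMBDA_ARN + "model-b",
                                  "Catch": catch, "Next": "StoreResult"},
            "StoreResult": {"Type": "Task", "Resource": LAMBDA_ARN + "store-result",
                            "End": True},
            "HandleError": {"Type": "Fail", "Error": "PredictionFailed",
                            "Cause": "A prediction step raised an error"},
        },
    }


def referenced_states(workflow):
    """Collect every state name that some transition points to."""
    targets = {workflow["StartAt"]}
    for state in workflow["States"].values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for rule in state.get("Choices", []) + state.get("Catch", []):
            targets.add(rule["Next"])
    return targets


workflow = build_workflow()
definition = json.dumps(workflow, indent=2)  # JSON text of the definition
```

Checking that `referenced_states(workflow)` is a subset of the defined state names is a cheap way to catch a misspelled transition before deploying.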





Summary

In this chapter, you were introduced to serverless functions, the AWS implementation, and the related services. You looked at the Lambda function, which is the AWS implementation of FaaS, and at how it works. You then saw why serverless infrastructure is extremely useful for deploying deep learning models and the challenges you might face during deployment. You also compared the traditional and serverless ways of deploying deep learning applications and looked into the scenarios where serverless deep learning works and where it doesn't. Finally, you previewed the example projects that you will build during the course of this book.

In the next chapter, you will learn how to work with AWS Lambda and its deployment.

About the Author

  • Rustem Feyzkhanov

    Rustem Feyzkhanov is a machine learning engineer at Instrumental, where he creates analytical models for the manufacturing industry. He is also passionate about serverless infrastructures and using them to deploy AI. He has ported several packages to AWS Lambda, ranging from TensorFlow/Keras/scikit-learn for ML to PhantomJS/Selenium/WRK for web scraping. One of his apps was featured on the AWS serverless repo home page.



