You're reading from  Learn Python by Building Data Science Applications

Product type: Book
Published in: Aug 2019
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789535365
Edition: 1st Edition
Authors (2):
Philipp Kats

Philipp Kats is a researcher at the Urban Complexity Lab, NYU CUSP, a research fellow at Kazan Federal University, and a data scientist at StreetEasy, with many years of experience in software development. His interests include data analysis, urban studies, data journalism, and visualization. Having a bachelor's degree in architectural design and having followed the rocky path (at first) of being a self-taught developer, Philipp knows the pain points of learning programming and is eager to share his experience.

David Katz

David Katz is a researcher and holds a Ph.D. in mathematics. As a mathematician at heart, he sees code as a tool to express his questions. David believes that code literacy is essential as it applies to most disciplines and professions. David is passionate about sharing his knowledge and has 6 years of experience teaching college and high school students.

Serverless API Using Chalice

In the previous chapter, we created a REST API that served our prediction model forecasts by managing our own server and application. While that approach is by far the most popular, there is another that is also very useful for specific tasks—using serverless applications.

In this chapter, we will use the Chalice Python package to build an API that's similar to the one we built in Chapter 18, Serving Models with a RESTful API, but it will run in the cloud as a serverless application. Along the way, we will discuss the pros and cons of this approach.

In this chapter, we will learn about the following:

  • What a serverless application is
  • How to build a simple application using the Chalice package
  • How to mitigate Chalice's limitations
  • Scheduling a serverless process
  • Deploying a larger-than-limit application with Zappa
  • ...

Technical requirements

Understanding serverless

The word "serverless" might be somewhat misleading: serverless applications still run on servers. The major difference is in the zones of responsibility, though. With serverless, we don't rent computers and deploy our own APIs; instead, we send Python (or JavaScript, or Go, or whatever else) functions, along with our requirements, to a provider (which could be Amazon Web Services (AWS), Google Cloud Platform, or something else), and they execute those functions on their servers when triggered to do so. We don't need to think about configuring servers, turning them on and off, or scaling; the functions we trigger will run when needed, at the scale that is needed (the providers will add computers, if required, behind the scenes). The best part? We only pay for actual execution; if a function wasn't...
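To make the idea of "sending a function to a provider" concrete, here is a minimal sketch of what such a function can look like on AWS Lambda. The (event, context) signature is the standard one for Python handlers on Lambda; the function name, the payload field name, and the greeting are made up for illustration.

```python
import json


def handler(event, context):
    """Entry point the provider calls each time the function is triggered."""
    # `event` carries the trigger's payload; `context` holds runtime metadata.
    # The "name" field is a hypothetical payload key, used only for this sketch.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Locally, it can be called like any other function.
response = handler({"name": "311"}, None)
print(response["body"])
```

Nothing in this code starts or configures a server; deciding when and where to run it is left entirely to the provider.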

Getting started with Chalice

Let's try replicating the 311 API endpoint we built in the previous chapter as a serverless application. For that, we'll use a framework called Chalice.

Chalice is a Python package for serverless applications on AWS, and is itself developed by Amazon. It can take care of an application, from its template all the way to deployment. It's also great for testing, as it emulates deployment with no fee, authentication, or even internet connection required.

Before we start working on our serverless application, let's ask Chalice to generate a template. In your terminal, type this:

chalice new-project

After this, type the name of the project: 311estimate. This will generate a new folder with a few files:

311estimate/
│
├── .chalice/
│   └── config.json
│
├── .gitignore...

Setting up a simple model

Similar to how we built a REST API in Chapter 18, Serving Models with a RESTful API, let's start by serving median values from a JSON file. This will help us to set the model for working with Chalice:

  1. First of all, we need to load the JSON object:
import json

with open('./model.json', 'r') as f:
    model = json.load(f)
  2. Now we will rename the route and define the last resource to map to the complaint type (in the same way we would for FastAPI, again!). We will also have to import a Response object:
from chalice import Response

@app.route('/predict/{complaint_type}', methods=['GET'])
def predict(complaint_type: str) -> Response:

  3. Finally, complete the function by adding simple lookup logic; here, we decided to be nice and let our user know if they pass an unknown complaint type:
@app.route('/predict...
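The lookup logic described in the last step can be sketched independently of Chalice as a plain function. This is only an illustration under stated assumptions: the MODEL dictionary is a hypothetical stand-in for the contents of model.json, and the status codes and field names are our own choices, not necessarily those used in the book.

```python
import json

# Hypothetical stand-in for the contents of model.json
# (complaint type -> median value); the numbers are invented.
MODEL = {"noise": 2.5, "rodent": 48.0}


def lookup(model, complaint_type):
    """Return a (status_code, body) pair for the requested complaint type."""
    if complaint_type in model:
        return 200, {"complaint_type": complaint_type, "median": model[complaint_type]}
    # Be nice: say what went wrong and list the values that are accepted.
    return 400, {
        "error": f"Unknown complaint type: {complaint_type!r}",
        "valid_types": sorted(model),
    }


for requested in ("noise", "dragons"):
    status, body = lookup(MODEL, requested)
    print(status, json.dumps(body))
```

Inside a Chalice route, such a pair could be passed on as Response(body=body, status_code=status). Keeping the lookup in a separate function makes it easy to test without running the application.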

Building a serverless API for an ML model

Getting public access to data in 10 lines of code is useful. But let's now do something more complex than that—say, serving an actual ML model.

Let's create one more app: 311predictions. As before, we need to call chalice new-project and type our new project's name.

Now, for the previous application, we didn't need any dependencies; in order to serve the ML model we used in the previous chapter, we need pandas and sklearn. The problem is that they do not fit within the 50 MB limit. In fact, until recently, there was no easy way to fit either of them there, because a normal pip install requires all the source code to be downloaded and compiled on the machine. Luckily, a pre-compiled version can now be installed, and Chalice will explicitly look for a pre-compiled binary, generated for...
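Before deploying, it can help to estimate how close a folder of dependencies is to the size limit. The following is a rough sketch, not part of Chalice: the helper names are ours, and it assumes the 50 MB limit mentioned above applies to the zipped package. It zips a folder in memory and reports the archive size.

```python
import io
import os
import zipfile

LIMIT_MB = 50  # the limit mentioned above, assumed to apply to the zipped package


def zipped_size_mb(folder):
    """Zip `folder` in memory and return the archive size in megabytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                path = os.path.join(root, name)
                # Store paths relative to the folder, as a package would.
                archive.write(path, os.path.relpath(path, folder))
    # The archive is finalized once the `with` block exits.
    return buffer.getbuffer().nbytes / (1024 * 1024)


def fits(folder, limit_mb=LIMIT_MB):
    """Return True if the zipped folder is within the size limit."""
    return zipped_size_mb(folder) <= limit_mb
```

Calling fits on a folder that holds the installed requirements gives a quick yes or no. Because the archive is built in memory, this is only practical for folders of moderate size.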

Building a serverless function as a data pipeline

So far, we have only used serverless functions as API endpoints, but they can serve in many other ways as well. For example, they can be triggered to run for each new file uploaded to a specific folder on S3, or scheduled to run at a specific time.
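As an illustration of the file-upload trigger, here is a minimal sketch of a function that reacts to an S3 event notification. The nested Records / s3 / bucket / object layout follows the S3 notification format; the bucket name, the key, and the function names are invented for the example.

```python
from urllib.parse import unquote_plus


def uploaded_files(event):
    """Return (bucket, key) pairs found in an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in the notification, so decode them.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context):
    """Hypothetical entry point: report every newly uploaded file."""
    for bucket, key in uploaded_files(event):
        print(f"New file: s3://{bucket}/{key}")


# A hand-written sample event with made-up names, for local experimentation.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "raw/2019-08-01+data.json"}}}
    ]
}
handler(sample_event, None)
```

A scheduled function looks the same from the inside; only the trigger and the contents of the event differ.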

Let's create one more application for data collection. We can specify that we need the requests library in requirements.txt. We can also copy and paste the _get_data function from Chapter 15, Packaging and Testing with Poetry and PyTest, along with the resource and time columns. The one piece of code we are still missing is the upload of data to S3. Here is the code:

import boto3

def _upload_json(obj, filename, bucket, key):
    # Object() is provided by the boto3 resource interface, not by the low-level client
    S3 = boto3.resource('s3', region_name='us-east-1')
    key += ('/' + filename)

    S3.Object(bucket, key).put(Body=json...
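One way to make an upload helper like this easy to test is to pass the S3 client in as an argument. The sketch below is an alternative formulation, not the book's code: it relies on the put_object call of the boto3 S3 client (with Bucket, Key, and Body arguments), and FakeS3Client is a hypothetical stand-in that lets the example run without AWS credentials.

```python
import json


def upload_json(client, obj, filename, bucket, key):
    """Serialize `obj` to JSON and store it under `<key>/<filename>`."""
    full_key = f"{key}/{filename}"
    body = json.dumps(obj).encode("utf-8")
    client.put_object(Bucket=bucket, Key=full_key, Body=body)
    return full_key


class FakeS3Client:
    """Records calls instead of talking to AWS; for local testing only."""

    def __init__(self):
        self.calls = []

    def put_object(self, **kwargs):
        self.calls.append(kwargs)


# Bucket and key names below are made up for the example.
fake = FakeS3Client()
upload_json(fake, {"rows": 3}, "data.json", "my-bucket", "311/raw")
print(fake.calls[0]["Key"])
```

In a deployed function, the first argument would be a real client, for example boto3.client('s3', region_name='us-east-1').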

Summary

In this chapter, we introduced you to serverless functions, a different approach to APIs and computation in general. Serverless functions don't need maintenance, scale automatically, are secure, and are simple to write. They may be a great option for operations that don't receive a huge number of requests, or for when demand spikes unpredictably. In addition to serving as APIs, lambdas can be scheduled with one line of code or triggered by an external event, such as a new file landing in an S3 bucket. The downside of serverless applications is that they have strict memory limitations that could be a serious barrier for certain tasks. The response time could also be longer for the first call after a long break, but there are ways to mitigate that issue to some extent.

As a practice exercise, we were able to recreate our 311 API endpoints as serverless applications...

Questions

  1. What does a serverless application mean?
  2. What are the limitations of the serverless approach?
  3. What are the benefits of serverless APIs?
  4. What role does Chalice play in the development of a serverless application?

Further reading
