Reader small image

You're reading from  Building Data Science Applications with FastAPI - Second Edition

Product typeBook
Published inJul 2023
Reading LevelIntermediate
PublisherPackt
ISBN-139781837632749
Edition2nd Edition
Languages
Tools
Concepts
Right arrow
Author (1)
François Voron
François Voron
author image
François Voron

François Voron graduated from the University of Saint-Étienne (France) and the University of Alicante (Spain) with a master's degree in machine learning and data mining. A full stack web developer and a data scientist, François has a proven track record working in the SaaS industry, with a special focus on Python backends and REST APIs. He is also the creator and maintainer of FastAPI Users, the #1 authentication library for FastAPI, and is one of the top experts in the FastAPI community.
Read more about François Voron

Right arrow

Creating a Distributed Text-to-Image AI System Using the Stable Diffusion Model

Until now, in this book, we’ve built APIs where all the operations were computed inside the request handling. Said another way, before they could get their response, the user had to wait for the server to do everything we had defined: request validation, database queries, ML predictions, and so on. However, this behavior is not always desired or possible.

A typical example is email notifications. It happens quite often in a web application that we need to send an email to the user because they just registered or they performed a specific action. To do this, the server needs to send a request to an email server so the email can be sent. This operation could take a few milliseconds. If we do this inside the request handling, the response will be delayed until we send the email. This is not a very good experience since the user doesn’t really care how and when the email is sent. This example...

Technical requirements

For this chapter, you’ll require a Python virtual environment, just as we set up in Chapter 1, Python Development Environment Setup.

To run the Stable Diffusion model correctly, we recommend you have a recent computer equipped with at least 16 GB of RAM and, ideally, a dedicated GPU with 8 GB of VRAM. For Mac users, recent models equipped with the M1 Pro or M2 Pro chips are also a good fit. If you don’t have that kind of machine, don’t worry: we’ll show you ways to run the system anyway – the only drawback is that image generation will be slow and show poor results.

For running the worker, you’ll need a running Redis server on your local computer. The easiest way is to run it as a Docker container. If you’ve never used Docker before, we recommend you read the Getting started tutorial in the official documentation at https://docs.docker.com/get-started/. Once done, you’ll be able to run a Redis server...

Generating images from text prompts with Stable Diffusion

Recently, a new generation of AI tools has emerged and fascinated the whole world: image-generation models, such as DALL-E or Midjourney. Those models are trained on huge amounts of image data and are able to generate completely new images from a simple text prompt. These AI models are very good use cases for background workers: they take seconds or even minutes to process, and they need lots of resources in the CPU, RAM, and even the GPU.

To build our system, we’ll rely on Stable Diffusion, a very popular image-generation model that was released in 2022. This model is available publicly and can be run on a modern gaming computer. As we did in the previous chapter, we’ll rely on Hugging Face tools for both downloading the model and running it.

Let’s first install the required tools:

(venv) $ pip install accelerate diffusers

We’re now ready to use diffuser models thanks to Hugging Face.

...

Creating a Dramatiq worker and defining an image-generation task

As we mentioned in the introduction of this chapter, it’s not conceivable to run our image-generation model directly on our REST API server. As we saw in the previous section, the operation can take several minutes and consumes a massive amount of memory. To solve this, we’ll define another process, apart from the server process, that’ll take care of this image-generation task: the worker. In essence, a worker can be any program whose role is to compute a task in the background.

In web development, this concept usually implies a bit more than this. A worker is a process running continuously in the background, waiting for incoming tasks. The tasks are usually sent by the web server, which asks for specific operations given the user actions.

Therefore, we see that we need a communication channel between the web server and the worker. That’s the role of the queue. It’ll accept and...

Storing results in a database and object storage

In the previous section, we showed how to implement a background worker to do the heavy computation and an API to schedule tasks on this worker. However, we are still missing two important aspects: the user doesn’t have any way to know the progress of the task nor to retrieve the final result. Let’s fix this!

Sharing data between the worker and the API

As we’ve seen, the worker is a program running in the background executing the computations the API has asked it to do. However, the worker doesn’t have any way to talk with the API server. That’s expected: since there could be any number of server processes, and since they could even run on different physical servers, processes cannot communicate directly. It’s always the same problem of having a central data source on which processes can write and read data.

Actually, the first approach to solve the lack of communication between the...

Summary

Awesome! You may not have realized it yet, but in this chapter, you learned how to architect and implement a very complex machine learning system that could rival existing image-generation services you see out there. The concepts we showed here are essential and are at the heart of all the distributed systems you could imagine, whether they are designed to run machine learning models, extraction pipelines, or math computations. By using modern tools such as FastAPI and Dramatiq, you’ll be able to implement this kind of architecture in a short time with a minimum amount of code, leading to a very quick and robust result.

We’re near the end of our journey. Before letting you live your own adventures with FastAPI, we’ll study one last important aspect when building data science applications: logging and monitoring.

lock icon
The rest of the chapter is locked
You have been reading a chapter from
Building Data Science Applications with FastAPI - Second Edition
Published in: Jul 2023Publisher: PacktISBN-13: 9781837632749
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
François Voron

François Voron graduated from the University of Saint-Étienne (France) and the University of Alicante (Spain) with a master's degree in machine learning and data mining. A full stack web developer and a data scientist, François has a proven track record working in the SaaS industry, with a special focus on Python backends and REST APIs. He is also the creator and maintainer of FastAPI Users, the #1 authentication library for FastAPI, and is one of the top experts in the FastAPI community.
Read more about François Voron