You're reading from Building Data Science Applications with FastAPI - Second Edition

Product type: Book

Published in: Jul 2023

Publisher: Packt

ISBN-13: 9781837632749

Pages: 422

Edition: 2nd Edition

Languages: Python

Concepts: Data Science

Author: François Voron

Table of Contents (21) Chapters

Preface

Part 1: Introduction to Python and FastAPI

Chapter 1: Python Development Environment Setup

Chapter 2: Python Programming Specificities

Chapter 3: Developing a RESTful API with FastAPI

Chapter 4: Managing Pydantic Data Models in FastAPI

Chapter 5: Dependency Injection in FastAPI

Part 2: Building and Deploying a Complete Web Backend with FastAPI

Chapter 6: Databases and Asynchronous ORMs

Chapter 7: Managing Authentication and Security in FastAPI

Chapter 8: Defining WebSockets for Two-Way Interactive Communication in FastAPI

Chapter 9: Testing an API Asynchronously with pytest and HTTPX

Chapter 10: Deploying a FastAPI Project

Part 3: Building Resilient and Distributed Data Science Systems with FastAPI

Chapter 11: Introduction to Data Science in Python

Chapter 12: Creating an Efficient Prediction API Endpoint with FastAPI

Chapter 13: Implementing a Real-Time Object Detection System Using WebSockets with FastAPI

Chapter 14: Creating a Distributed Text-to-Image AI System Using the Stable Diffusion Model

Chapter 15: Monitoring the Health and Performance of a Data Science System

Index

Introduction to Data Science in Python

In recent years, Python has become one of the most popular languages in data science. Its concise, readable syntax makes it a good fit for scientific research while remaining suitable for production workloads, so research projects can be turned into real applications that bring value to users with relatively little friction. This growing interest has produced many specialized Python libraries that are now industry standards. In this chapter, we’ll introduce the fundamental concepts of machine learning before diving into the Python libraries that data scientists use daily.

In this chapter, we’re going to cover the following main topics:

Understanding the basic concepts of machine learning
Creating and manipulating NumPy arrays and pandas datasets
Training and evaluating machine learning models with scikit-learn

Technical requirements

For this chapter, you’ll need a Python virtual environment, like the one we set up in Chapter 1, Python Development Environment Setup.

You’ll find all the code examples for this chapter in the dedicated GitHub repository at https://github.com/PacktPublishing/Building-Data-Science-Applications-with-FastAPI-Second-Edition/tree/main/chapter11.

What is machine learning?

Machine learning (ML) is often seen as a subfield of artificial intelligence. While this categorization is the subject of debate, ML has had a lot of exposure in recent years due to its vast and visible field of applications, such as spam filters, natural language processing, and image generation.

ML is a field where we build mathematical models from existing data so that the machine can learn the patterns in this data by itself. The machine is “learning” in the sense that the developer doesn’t have to program a step-by-step algorithm to solve the problem, which would be impossible for complex tasks. Once a model has been “trained” on existing data, it can be used to make predictions about new data or to interpret new observations.
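To make the train-then-predict idea concrete, here is a minimal sketch in plain Python. It is not taken from the book: the data is invented, the helper names fit_line and predict are hypothetical, and the "model" is simply a straight line whose two parameters are estimated by ordinary least squares.

```python
# Toy illustration of "training" then "predicting" (all data is made up).
# The model is a straight line y = slope * x + intercept whose two
# parameters are estimated from examples by ordinary least squares.


def fit_line(xs, ys):
    """Estimate (slope, intercept) from paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to a value the model has never seen."""
    slope, intercept = model
    return slope * x + intercept


# "Existing data": hypothetical hours studied and the scores obtained.
hours_studied = [1, 2, 3, 4]
exam_scores = [3, 5, 7, 9]

# Training: the parameters come from the data, not from hand-written rules.
model = fit_line(hours_studied, exam_scores)

# Prediction on a new observation.
prediction = predict(model, 10)
print(model, prediction)
```

Nothing in fit_line encodes the relationship between hours and scores; it is recovered from the examples, which is the sense in which the machine "learns".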

Consider the spam filter example: if we have a sufficiently large collection of emails manually labeled “spam” or “not spam,” we can use ML techniques to build a model that can tell us whether...

Manipulating arrays with NumPy and pandas

As we said in the introduction, numerous Python libraries have been developed to help with common data science tasks. The most fundamental ones are probably NumPy and pandas. Their goal is to provide tools for manipulating large sets of data far more efficiently than we could with standard Python alone; we’ll show how and why in this section. NumPy and pandas are at the heart of most data science applications in Python, so knowing them is the first step on your journey into Python for data science.
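As a brief preview, the sketch below contrasts an element-by-element conversion in plain Python with the same arithmetic applied to a whole NumPy array, then wraps the data in a pandas DataFrame. The city names and temperatures are invented for illustration, and the snippet assumes numpy and pandas are installed in your virtual environment (for example, with pip install numpy pandas).

```python
import numpy as np
import pandas as pd

# Hypothetical measurements, invented for this sketch.
temperatures_celsius = [18.5, 21.0, 19.5, 23.0]

# Plain Python: convert element by element in an explicit loop.
fahrenheit_list = [c * 9 / 5 + 32 for c in temperatures_celsius]

# NumPy: the same arithmetic is applied to the whole array at once.
temperatures = np.array(temperatures_celsius)
fahrenheit_array = temperatures * 9 / 5 + 32

# pandas: labeled columns built on top of array-like storage.
df = pd.DataFrame(
    {
        "city": ["Paris", "Lyon", "Lille", "Nice"],
        "celsius": temperatures,
    }
)
df["fahrenheit"] = df["celsius"] * 9 / 5 + 32

# Boolean filtering: keep only the rows above 20 degrees Celsius.
warm = df[df["celsius"] > 20]

print(fahrenheit_array)
print(warm)
```

The NumPy and pandas versions express the computation on the whole collection, which is what lets these libraries run it in optimized compiled code instead of the Python interpreter loop.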

Before starting to use them, let’s explain why such libraries are needed. In Chapter 2, Python Programming Specificities, we stated that Python is a dynamically typed language. This means that the interpreter automatically detects the type of a variable at runtime, and this type can even change throughout the program. For example, you can do something like this in Python:

$ python...
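A minimal sketch of the dynamic typing behavior just described, written as a script. The variable names are illustrative and are not taken from the book’s listing.

```python
# A name can be rebound to values of different types during execution.
x = 1
first_type = type(x).__name__

x = "hello"
second_type = type(x).__name__

# A list can mix types as well, so the interpreter has to check the type
# of each element every time it operates on it.
mixed = [1, "two", 3.0]
element_types = [type(item).__name__ for item in mixed]

print(first_type, second_type, element_types)
```

This flexibility is convenient, but the per-element type checks it implies are part of what makes plain Python lists slow for large numerical workloads.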

Training models with scikit-learn

scikit-learn is one of the most widely used Python libraries for data science. It implements dozens of classic ML models, along with numerous tools to help you train them, such as preprocessing methods and cross-validation. Nowadays, you’ll probably hear more about newer libraries, such as PyTorch, but scikit-learn remains a solid tool for many use cases.

The first thing you must do to get started is to install it in your Python environment:

(venv) $ pip install scikit-learn

We can now start our scikit-learn journey!

Training models and predicting

In scikit-learn, ML models and algorithms are called estimators. Each one is a Python class that implements the same methods. In particular, fit is used to train a model, and predict is used to run the trained model on new data.

To try this, we’ll load a sample dataset. scikit-learn comes with a few toy datasets that are very useful for performing...
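The fit and predict workflow can be sketched as follows. The choice of the digits toy dataset, the GaussianNB estimator, and the random_state value are assumptions made for this illustration, not necessarily what the chapter uses.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load a toy dataset bundled with scikit-learn (8x8 images of digits).
digits = load_digits()

# Hold out part of the data so the model is evaluated on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0
)

# Every estimator follows the same pattern: instantiate, fit, predict.
model = GaussianNB()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Compare the predictions with the true labels of the held-out samples.
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
```

Because all estimators share this interface, swapping GaussianNB for another classifier only requires changing the import and the line that instantiates the model.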

Summary

Congratulations! You’ve discovered the basic concepts of ML and run your first experiments with the data scientist’s fundamental toolkits. You should now be able to explore your first data science problems in Python. Of course, this was by no means a complete lesson on ML: the field is vast, and there are many algorithms and techniques to explore. However, I hope that this has sparked your curiosity and that you’ll deepen your knowledge of the subject.

Now, it’s time to get back to FastAPI! With our new ML tools at hand, we’ll be able to leverage the power of FastAPI to serve our estimators and offer our users a reliable and efficient prediction API.