You're reading from Exploring Deepfakes (1st edition), published by Packt in March 2023. Reading level: Beginner. ISBN-13: 9781801810692.
Authors (2):

Bryan Lyon, a developer for Faceswap.

Matt Tora, a developer for Faceswap.

The Deepfake Workflow

Creating a deepfake is an involved process. The tools within the various software applications help to significantly reduce the amount of manual work required; however, they do not eliminate this requirement entirely. Most of this manual work involves collecting and curating source material, as well as cleaning up data for the final swap.

Whilst there are various applications available for creating deepfakes, this chapter will use the open source software Faceswap (http://www.Faceswap.dev). The general workflow for creating a deepfake is broadly the same from application to application, but you will find that the nuances and available options vary between packages.

It is also worth noting that Faceswap, at its core, is a command-line application. However, it also comes with a GUI that acts as a wrapper to launch the various processes. Within this chapter, the GUI will be used to illustrate the workflow; however, most of the tasks performed here can also be run from the...

Technical requirements

As with most machine learning techniques, deepfakes can technically be created on any PC with a minimum of 4 GB of RAM. However, a machine with 8 GB of RAM or more and a GPU (a graphics card) is strongly recommended. Training a model on a CPU alone is likely to take months to complete, which makes it an unrealistic endeavor. Graphics cards are built specifically to perform matrix calculations in parallel, which makes them ideal for machine learning tasks.
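To see why matrix calculations matter, consider that the forward pass of a neural network layer is essentially one large matrix multiplication. The naive sketch below (illustrative only; real frameworks dispatch this operation to highly parallel GPU kernels) shows the multiply-add pattern that a GPU accelerates by running many of these products at once:

```python
def matmul(a, b):
    """Naive matrix multiply: the core operation of neural network training.
    Each output cell is a sum of multiply-adds; a GPU computes millions of
    these independently and in parallel, which a CPU cannot match."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

A deepfake model performs billions of such multiply-adds per training session, which is why CPU-only training stretches into months while a GPU completes the same work in hours or days.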

Faceswap will run on Linux, Windows, and Intel-based macOS systems. At a minimum, Faceswap should be run on a system with 4 GB of VRAM (GPU memory). Ideally, an NVIDIA GPU should be used, as AMD GPUs are not as fully featured as their NVIDIA counterparts and run considerably slower. Some features that are available to NVIDIA users are not available to AMD users, due to NVIDIA’s proprietary CUDA library having become the de facto industry standard for machine learning. GPUs with more VRAM will be able to run more of the larger...

Identifying suitable candidates for a swap

While it is technically possible to swap any face with another, creating a convincing deepfake requires paying some attention to the attributes of your source and destination faces. Depending on what you hope to achieve from your deepfake, this may be more or less important to you, but assuming that you wish to create a convincing swap, you should pay attention to the following attributes.

  • Face/head shape: Are the shapes of the faces similar to one another? If one face is quite narrow and the other quite round, then even though the facial features will be swapped correctly, the result is unlikely to be convincing if it contains a head shape that differs significantly from the individual you are attempting to target.
  • Hairline/hairstyles: While it is possible to do full head swaps, these are generally harder to pull off, as hair is complex, and hairstyles can change significantly. You will generally be swapping...

Preparing the training images

In this section, we will be collecting, extracting, and curating the images to train our model. Far and away the best sources for collecting face data are video files. A video is just a series of still images, but as you can obtain 25 still images for every second of footage in a standard 25 FPS file, videos are a valuable and plentiful resource. Video is also likely to contain far more natural and varied poses than photographs, which tend to be posed and contain limited expressions.
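The arithmetic above is worth making concrete. A short sketch (the function name is illustrative, not part of Faceswap) shows how quickly even a modest clip yields a large pool of candidate face images:

```python
def frames_in_clip(duration_seconds, fps=25):
    """Number of still images obtainable from a video clip.
    A standard 25 FPS file yields 25 candidate face images per second."""
    return int(duration_seconds * fps)

# A single 10-minute interview clip already provides thousands of frames:
print(frames_in_clip(10 * 60))  # 15000
```

Not every frame will contain a usable face, but even after heavy curation a few minutes of varied footage typically leaves far more training material than a photo collection would.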

Video sources should be of high quality. The absolute best source of data is HD content encoded at a high bitrate. Be wary of video content acquired from online streaming platforms, as it tends to have a low bitrate even when the resolution is high. For similar reasons, JPEG images can also be problematic. The neural network will learn to recreate what it sees, and this will include learning compression artifacts from low-bitrate/highly compressed sources....

Training a model

This part of the process requires the least amount of manual intervention but will take the longest in terms of compute time. Depending on the model chosen and the hardware in use, this can take anywhere from 12 hours to several weeks to complete.

It is advised to use a relatively lightweight model when creating a deepfake for the first time. Creating swaps is fairly nuanced, and understanding what works and what doesn’t comes with experience. Whilst Faceswap offers several models, starting with the Original or Lightweight model will allow you to gauge the performance of the swap relatively quickly, while not necessarily giving you the best possible final result.

Faceswap comes with numerous configurable settings for models and training. These are available within the Settings menu of the application. To cover all of these settings is well outside of the scope of this walk-through, so default settings will be used unless otherwise stated.

Setting up...

Applying a trained model to perform a swap

Once the model has completed training, it can be used to swap the faces in any video that contains the individual who is to be swapped out. Three items are required to successfully perform a swap: a video or series of images, a trained model, and an alignments file for the media that is to be converted. The first two items are self-explanatory; the alignments file is the one item we still need to create.
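Since all three inputs must be in place before conversion can start, a pre-flight check is a sensible habit. The sketch below is a hypothetical helper (Faceswap performs its own validation internally; the function and file names here are made up for illustration):

```python
from pathlib import Path

def swap_inputs_ready(media, model_dir, alignments):
    """Hypothetical pre-flight check for a conversion run: a media source,
    a trained model directory, and an alignments file must all exist.
    Returns the list of missing paths; an empty list means we can proceed."""
    paths = (Path(media), Path(model_dir), Path(alignments))
    return [str(p) for p in paths if not p.exists()]

# Example call with illustrative filenames:
missing = swap_inputs_ready("interview.mp4", "trained_model", "interview.fsa")
if missing:
    print("Cannot convert yet, missing:", missing)
```

Checking up front is cheaper than discovering a missing alignments file partway through a long conversion job.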

The alignments file

The alignments file is a file bespoke to Faceswap, with a .fsa extension. This file should exist for every media source that is to be converted. It contains information about the location of faces within a video file, the alignment information (how the faces are orientated within each frame), as well as any associated masks for each frame.
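Conceptually, the alignments file maps each frame to the faces found in it, together with how each face is orientated and any masks attached to it. The dataclass below is purely an illustrative sketch of that information; the real .fsa file is a bespoke binary format whose internal layout is not documented here:

```python
from dataclasses import dataclass, field

@dataclass
class FrameAlignment:
    """Illustrative sketch (NOT the actual .fsa binary layout) of the kind
    of per-face information an alignments file records."""
    frame_name: str
    face_box: tuple            # (x, y, width, height) of the face in the frame
    landmarks: list            # landmark points describing the face orientation
    masks: dict = field(default_factory=dict)  # named masks for this face

# One detected face in one frame of the source video:
aligned = FrameAlignment(
    frame_name="frame_0001.png",
    face_box=(120, 80, 256, 256),
    landmarks=[(140, 150), (210, 152)],
)
print(aligned.frame_name)
```

Thinking of the file this way makes it clear why one must exist for every media source to be converted: without the per-frame locations and orientations, the converter would have no idea where to place the swapped faces.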

Generating an alignments file is fairly trivial. In fact, at least one has been generated already when we built a training set. The process for generating training data and...

Summary

In this chapter, we learned the workflow required to create a deepfake using the open source Faceswap software. We discussed the importance of data variety and demonstrated the steps required to acquire, curate, and generate face sets. We learned how to train a model within Faceswap, how to gauge when a model has been fully trained, and some tricks to improve the quality of the model. Finally, we learned how to take our trained model and apply it to a source video to swap the faces within it.

In the next chapter, we will begin to take a hands-on look at the neural networks available to build a deepfake pipeline from scratch using the PyTorch ML toolkit, starting with the models available for detecting and extracting faces from source images.

