You're reading from 10 Machine Learning Blueprints You Should Know for Cybersecurity

Product type Book

Published in May 2023

Publisher Packt

ISBN-13 9781804619476

Pages 330 pages

Edition 1st Edition

Concepts

Machine Learning

Author (1):

Rajvardhan Oak

Table of Contents (15) Chapters

Preface

Chapter 1: On Cybersecurity and Machine Learning

Chapter 2: Detecting Suspicious Activity

Chapter 3: Malware Detection Using Transformers and BERT

Chapter 4: Detecting Fake Reviews

Chapter 5: Detecting Deepfakes

Chapter 6: Detecting Machine-Generated Text

Chapter 7: Attributing Authorship and How to Evade It

Chapter 8: Detecting Fake News with Graph Neural Networks

Chapter 9: Attacking Models with Adversarial Machine Learning

Chapter 10: Protecting User Privacy with Differential Privacy

Chapter 11: Protecting User Privacy with Federated Machine Learning

Chapter 12: Breaking into the Sec-ML Industry

Index

Detecting Deepfakes

In recent times, deepfakes have become prevalent on the internet. Easily accessible technology based on deep neural networks allows attackers to create images of people who have never existed. These images can be attached to fraudulent or bot accounts to give the illusion of a real person. Deepfake videos are just as easy to create. They allow attackers either to morph someone’s face onto a different person in an existing video or to craft a clip in which a person appears to say something they never said. Deepfakes are a hot research topic and have far-reaching impacts. Abuse of deepfake technology can result in misinformation, identity theft, sexual harassment, and even political crises.

This chapter will focus on machine learning methods to detect deepfakes. First, we will understand the theory behind deepfakes, how they are created, and what their impact can be. We will then cover two approaches...

Technical requirements

You can find the code files for this chapter on GitHub at https://github.com/PacktPublishing/10-Machine-Learning-Blueprints-You-Should-Know-for-Cybersecurity/tree/main/Chapter%205.

All about deepfakes

The word deepfake is a combination of two words – deep learning and fake. Put simply, deepfakes are fake media created using deep learning technology. In the past decade, there have been significant advances in machine learning and generative models – models that create content instead of merely classifying it. These models (such as Generative Adversarial Networks (GANs)) can synthesize images that look real – even of human faces!

Deepfake technology is readily accessible to attackers and malicious actors today, and using it requires little sophistication or technical skill. As an experiment, head over to the website thispersondoesnotexist.com. This website allows you to generate images of people – people who have never existed!

For example, the people in the following figure are not real. They are deepfakes that have been generated by thispersondoesnotexist.com, and it only took a few seconds!

Figure 5.1 –...

Detecting fake images

In the previous section, we looked at how deepfake images and videos can be generated. As the technology to do so is accessible to everyone, we also discussed the impact that this can have at multiple levels. Now, we will look at how fake images can be detected. This is an important problem to solve and has far-reaching impacts on social media and the internet in general.

A naive model to detect fake images

Machine learning has driven significant progress in image processing. Convolutional neural networks (CNNs) have surpassed earlier image classifiers and, on some benchmark tasks, match or exceed human accuracy. As a first step toward detecting deepfake images, we will treat the task as a simple binary classification problem (real versus fake) and use standard deep learning image classification approaches.
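To make the idea of CNN-based binary classification concrete, here is a minimal pure-Python sketch of a single forward pass: one convolution filter, a ReLU, global average pooling, and a sigmoid that yields a probability of "fake". This is not the chapter's model. The function names, the single-filter layout, and the weights are all illustrative assumptions; a real detector would stack many learned filters using a deep learning framework.

```python
import math


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_average(feature_map):
    """Collapse a feature map into one number by averaging all positions."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def sigmoid(x):
    """Squash a real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fake_probability(image, kernel, weight, bias):
    """Hypothetical one-filter 'CNN': conv -> ReLU -> pool -> sigmoid."""
    pooled = global_average(relu(conv2d_valid(image, kernel)))
    return sigmoid(weight * pooled + bias)
```

In training, the kernel values, `weight`, and `bias` would be learned from labeled real and fake images; an image would then be flagged as fake when the returned probability exceeds a chosen threshold such as 0.5.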

The dataset

There are several publicly available datasets for deepfake detection. We will use the 140k Real and Fake Faces Dataset. This dataset is freely...

Detecting deepfake videos

As if deepfake images were not enough, deepfake videos are now spreading rapidly across the internet. Their uses range from the benign, such as comedy and entertainment, to the malicious, such as pornography and the incitement of political unrest. Because deepfakes appear so realistic, watching a video with the naked eye often provides no reliable clue as to whether it is real or fake. If you are a machine learning practitioner working in the security field, you need to know how to develop models and techniques to identify deepfake videos.

A video can be thought of as an extension of an image. A video is multiple images arranged one after the other and viewed in quick succession. Each such image is known as a frame. By viewing the frames at a high speed (multiple frames per second), we see images moving.
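Because a video is just an ordered list of frames, a detector rarely needs every one of them. The sketch below picks evenly spaced frame indices and converts an index into a timestamp. The helper names and the even-spacing strategy are assumptions made for illustration, not necessarily what the chapter's code does; reading the actual pixels would require a video library, which is left out here.

```python
def sample_frame_indices(total_frames, num_samples):
    """Return up to `num_samples` frame indices spread evenly across the video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # The video is short: keep every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def frame_timestamp(index, fps):
    """Time in seconds at which frame `index` is shown, given frames per second."""
    return index / fps
```

For example, a 100-frame clip sampled 5 times gives indices 0, 20, 40, 60, and 80; at 30 frames per second, frame 60 appears 2 seconds into the video.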

Neural networks cannot directly process videos – there does not exist an appropriate method to encode images and convert them into a...

Summary

In this chapter, we studied deepfakes, which are synthetic media (images and videos) created using deep neural networks. These media often show people in situations they have never been in and can be used for several nefarious purposes, including misinformation, fraud, and pornography. The impact can be catastrophic: deepfakes can trigger political crises and wars, spread panic among the public, facilitate identity theft, and lead to defamation and loss of life. After understanding how deepfakes are created, we focused on detecting them. First, we used CNNs to detect deepfake images. Then, we developed a model that split videos into frames, used transfer learning to convert each frame into a vector, and classified the resulting sequence of vectors as fake or real.

Deepfakes are a growing challenge and have tremendous potential for cybercrime. There is a strong demand in the industry for professionals who understand deepfakes, their generation, the social...