Attacking image models
In this section, we will look at two popular attacks on image classification systems: the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). We will first cover the theory behind each attack and then implement it in Python.
FGSM
FGSM is one of the earliest methods for crafting adversarial examples against image classification models. Proposed by Goodfellow et al. in 2014, it is a simple yet powerful attack on neural network (NN)-based image classifiers.
How FGSM works
Recall that NNs consist of layers of neurons placed one after the other, with connections from the neurons in one layer to those in the next. Each connection has an associated weight, and the weights represent the model parameters. The final layer produces an output that is compared with the ground truth to calculate the loss, which measures how far the prediction is from the ground truth. The loss is backpropagated...
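To make the forward pass and loss concrete, here is a minimal sketch in plain Python. The layer sizes, the weight values, the ReLU and softmax activations, and the cross-entropy loss are all assumptions chosen for illustration. The text above does not prescribe them, and a real image classifier would be far larger and built with a deep learning framework.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: each output neuron is a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Loss is small when the true class gets high probability."""
    return -math.log(probs[true_class])

# Hypothetical toy parameters: 3 inputs -> 2 hidden neurons -> 2 classes.
W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [-0.5, 0.7]]
b2 = [0.0, 0.0]

def forward(x):
    """Pass the input through both layers and return class probabilities."""
    hidden = relu(dense(x, W1, b1))
    return softmax(dense(hidden, W2, b2))

x = [1.0, 0.5, -1.0]                         # stand-in for a flattened image
probs = forward(x)                           # model prediction
loss = cross_entropy(probs, true_class=0)    # distance from the ground truth
print(probs, loss)
```

Note that the loss depends on the input `x` as well as on the weights. Training changes the weights to reduce this value while the input stays fixed.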