Neural Style Transfer Using TensorFlow 2

Neural style transfer is a technique whereby the artistic style of one image is imposed on the content of another image using a neural network, so that what you end up with is a hybrid of the two images. The image you start with is called the content image. The image whose style you impose on the content image is known as the style reference image. Google refers to the transformed image as the input image, which seems confusing (input in the sense that it takes input from two different sources); let's instead refer to it as the hybrid image. So, the hybrid image is the content image with the style of the style reference image imposed on it.

Neural style transfer works by defining two loss functions—one that describes the difference between the content of two images and another that describes the difference in style between two...

Setting up the imports

To use this implementation with your own images, you need to save those images in the ./tmp/nst directory in your downloaded repository, and then edit the content_path and style_path variables, shown in the following code.
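That snippet is not reproduced in this excerpt; a minimal example of what it might look like follows (the filenames are hypothetical, so substitute your own):

content_path = './tmp/nst/content.jpg'  # hypothetical filename; use your own image
style_path = './tmp/nst/style.jpg'      # hypothetical filename; use your own image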

As usual, the first thing we need to do is to import (and configure) the required modules:

import numpy as np
from PIL import Image
import time
import functools

import matplotlib.pyplot as plt
import matplotlib as mpl
# set things up for image display
mpl.rcParams['figure.figsize'] = (10,10)
mpl.rcParams['axes.grid'] = False

You may need to pip install pillow, which is a fork of PIL. Next come the TensorFlow modules:

import tensorflow as tf

from tensorflow.keras.preprocessing import image as kp_image
from tensorflow.keras import models
from tensorflow.keras import losses
from tensorflow.keras import layers
from tensorflow.keras import...

Preprocessing the images

The next function loads an image, with a little preprocessing. Image.open() is what's known as a lazy operation: the function finds the file and opens it for reading, but the image data isn't actually read from the file until you try to process or load the data. The next group of three lines resizes the image so that its longest side is 512 (max_dimension) pixels. For example, if the image were 1,024 x 768, scale would be 0.5 (512/1,024), and this would be applied to both dimensions, giving a resized image of 512 x 384. The Image.ANTIALIAS argument selects a high-quality downsampling filter. Next, the PIL image is converted into a NumPy array using the img_to_array() call (a function from tensorflow.keras.preprocessing.image, imported here as kp_image).

Finally, to be compatible with later usage, the image needs a batch dimension along...
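Although the excerpt breaks off here, a minimal sketch of load_image() consistent with the description above might look like this (the function body is not shown in the excerpt, so treat it as an assumption):

def load_image(path_to_image):
    max_dimension = 512
    image = Image.open(path_to_image)  # lazy: pixel data is not read yet
    longest_side = max(image.size)
    scale = max_dimension / longest_side
    # resize so that the longest side becomes max_dimension pixels
    image = image.resize((round(image.size[0] * scale),
                          round(image.size[1] * scale)), Image.ANTIALIAS)
    image = kp_image.img_to_array(image)  # PIL image to NumPy array
    image = np.expand_dims(image, axis=0)  # add a batch dimension
    return image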

Viewing the original images

Next, we use calls to the two preceding functions to display our content and style images, remembering that the image pixels need to be of type unsigned 8-bit integer. The plt.subplot(1,2,1) call selects position one in a grid of one row and two columns; plt.subplot(1,2,2) selects position two:

channel_means = [103.939, 116.779, 123.68] # means of the BGR channels, for VGG processing

plt.figure(figsize=(10,10))

content_image = load_image(content_path).astype('uint8')
style_image = load_image(style_path).astype('uint8')

plt.subplot(1, 2, 1)
show_image(content_image, 'Content Image')

plt.subplot(1, 2, 2)
show_image(style_image, 'Style Image')

plt.show()

The output is shown in the following screenshot:

There follows a function to load the image. As we are going to use this, as...
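The show_image() helper called in the preceding snippet isn't reproduced in this excerpt; a plausible sketch, given how it is used here, is:

def show_image(image, title=None):
    # drop the batch dimension if present, so imshow gets (height, width, channels)
    if image.ndim == 4:
        image = np.squeeze(image, axis=0)
    plt.imshow(image)
    if title is not None:
        plt.title(title)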

Using the VGG19 architecture

The best way to understand the next snippet is to have a look at the VGG19 architecture. A good place is https://github.com/fchollet/deep-learning-models/blob/master/vgg19.py (about halfway down the page).

Here, you will see that VGG19 is a fairly straightforward architecture, consisting of blocks of convolutional layers with a max pooling layer at the end of each block.

For the content layer, we use the second convolutional layer in block5. This highest block is used because the earlier blocks produce feature maps that stay closer to individual pixel values, whereas higher layers in the network capture the high-level content in terms of objects and their arrangement in the input image, without constraining the exact pixel values of the reconstruction (see Gatys et al., 2015, https://arxiv.org/abs/1508.06576, cited previously).

For the style layers...
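The excerpt is cut off here, but the layer selections described would be recorded along these lines; the style list (the first convolutional layer of each block) is the standard choice from Gatys et al. and is an assumption here:

content_layers = ['block5_conv2']  # the second convolutional layer in block5

style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']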

Creating the model

There now follows a series of functions leading up to run_style_transfer(), the main function that performs the style transfer.

The first function in this sequence, get_model(), creates the model we are going to use.

It first loads the trained VGG19 model, vgg_model (which has been trained on ImageNet), without its classification layers (include_top=False). Next, it freezes the loaded model (vgg_model.trainable = False).

The style and content layer output values are then acquired using list comprehensions, which iterate over the names of the layers that we specified in the previous section.

These output values are then used, together with the VGG input, to create our new model with access to the VGG layers; that is, get_model() returns a Keras model that outputs the style and content intermediate layers of the trained VGG19 model. It is unnecessary to use the top layer...
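A sketch of get_model() matching this description (assuming the content_layers and style_layers lists given earlier) could be:

def get_model():
    # load VGG19 trained on ImageNet, without its classification layers
    vgg_model = tf.keras.applications.vgg19.VGG19(include_top=False, weights='imagenet')
    vgg_model.trainable = False  # freeze the loaded model
    # list comprehensions over the layer names gather the chosen outputs
    style_outputs = [vgg_model.get_layer(name).output for name in style_layers]
    content_outputs = [vgg_model.get_layer(name).output for name in content_layers]
    model_outputs = style_outputs + content_outputs
    # a Keras model mapping the VGG input to those intermediate outputs
    return models.Model(vgg_model.input, model_outputs)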

Calculating the losses

We now need the losses between the contents and styles of the two images. We will be using the mean squared loss as follows. Notice here that the subtraction in image1 - image2 is element-wise between the two image arrays. This subtraction works because the images have been resized to the same size in load_image:

def rms_loss(image1, image2):
    loss = tf.reduce_mean(input_tensor=tf.square(image1 - image2))
    return loss

Next, we define our content_loss function. This is just the mean squared difference between what is named content and target in the function signature:

def content_loss(content, target):
    return rms_loss(content, target)

The style loss is defined in terms of a quantity called a Gram matrix. A Gram matrix, also known as the metric, is the dot product of the style matrix with its own transpose. Since this means that each column of the image...
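The text is cut off here, but a standard implementation of the Gram matrix, and of a style loss built on it, looks like the following sketch (normalizing by the number of spatial positions is an assumption, mirroring common implementations):

def gram_matrix(input_tensor):
    # flatten the spatial dimensions, keeping the channel dimension
    channels = int(input_tensor.shape[-1])
    a = tf.reshape(input_tensor, [-1, channels])
    n = tf.shape(a)[0]
    # dot product of the flattened feature matrix with its own transpose
    gram = tf.matmul(a, a, transpose_a=True)
    return gram / tf.cast(n, tf.float32)

def style_loss(base_style, gram_target):
    # mean squared difference between the two Gram matrices
    return rms_loss(gram_matrix(base_style), gram_target)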

Performing the style transfer

The function that performs the style transfer is quite long, so we will present it in sections. Its signature is as follows:

def run_style_transfer(content_path,
                       style_path,
                       number_of_iterations=1000,
                       content_weight=1e3,
                       style_weight=1e-2):

Since we don't want to actually train any layers in our model, but just use the output values from the layers as described previously, we set their trainable properties accordingly:

model = get_model()
for layer in model.layers:
    layer.trainable = False

Next, we get the style_features and content_features representations from the layers of our model, using the function previously defined:

style_features, content_features = get_feature_representations(model, content_path, style_path)

And gram_style_features, using a loop over style_features...
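The sentence is cut off in this excerpt; a plausible completion of that loop (an assumption, using the gram_matrix() sketched earlier) is:

gram_style_features = [gram_matrix(style_feature) for style_feature in style_features]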

Final displays

Finally, we have a function that displays the content and style images together with best_image:

def show_results(best_image, content_path, style_path, show_large_final=True):
    plt.figure(figsize=(10, 5))
    content = load_image(content_path)
    style = load_image(style_path)

    plt.subplot(1, 2, 1)
    show_image(content, 'Content Image')

    plt.subplot(1, 2, 2)
    show_image(style, 'Style Image')

    if show_large_final:
        plt.figure(figsize=(10, 10))
        plt.imshow(best_image)
        plt.title('Output Image')
        plt.show()

This is followed by a call to that function, as follows:

show_results(best_image, content_path, style_path)

Summary

That concludes our look at neural style transfer. We saw how to take a content image and a style image and produce a hybrid image. We used layers from the trained VGG19 model to accomplish this.

In the next chapter, we will examine recurrent neural networks: networks that can process sequential input values, where the input values, the output values, or both may be of variable length.
