Implementation of Neural Style Algorithm with PyTorch

This implementation is based on Alexis-jacq’s tutorial

Original Paper

A Neural Algorithm of Artistic Style

PyTorch Implementation

Packages

The typical PyTorch packages such as torch.nn, torch.autograd.Variable, and torch.optim are needed, and torchvision.transforms and torchvision.models are used to manipulate the images.

from __future__ import print_function

import numpy as np

import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.optim as optim

import PIL
from PIL import Image
import matplotlib.pyplot as plt

import torchvision.transforms as transforms
import torchvision.models as models

Load images

Use Image.open() to load the image and pass it to the Variable() function of torch.autograd.
Don't forget to unsqueeze the image to satisfy the network's requirement of a 4D input (batch dimension first).

Content loss

class ContentLoss(nn.Module):

    def __init__(self, target, weight):
        super(ContentLoss, self).__init__()
        # we 'detach' the target content from the tree used
        # to dynamically compute the gradient: this is a stated value,
        # not a variable. Otherwise the forward method of the criterion
        # will throw an error.
        self.target = target.detach() * weight
        self.weight = weight
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.loss = self.criterion.forward(input * self.weight, self.target)
        self.output = input
        return self.output

    def backward(self, retain_variables=True):
        self.loss.backward(retain_variables=retain_variables)
        return self.loss

Note: self.loss = self.criterion.forward(input * self.weight, self.target) computes the loss.

Important detail: although this module is named ContentLoss, it is not a true PyTorch loss function. If you want to define your content loss as a PyTorch loss, you have to create a PyTorch autograd Function and recompute/implement the gradient by hand in the backward method.
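To check the "transparent layer" behaviour, we can pass a dummy feature map through the module (a minimal sketch; the class is repeated here so the snippet runs on its own, the tensor sizes are arbitrary, and the criterion is called directly as is idiomatic in recent PyTorch):

```python
import torch
import torch.nn as nn

class ContentLoss(nn.Module):  # same idea as the class above
    def __init__(self, target, weight):
        super(ContentLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.loss = self.criterion(input * self.weight, self.target)
        self.output = input
        return self.output

target = torch.rand(1, 64, 8, 8)   # a stand-in feature map
layer = ContentLoss(target, weight=1)
x = torch.rand(1, 64, 8, 8)
out = layer(x)
# the input passes through unchanged; the loss is stored on the module
print(torch.equal(out, x))         # True
```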

Style loss

class GramMatrix(nn.Module):

    def forward(self, input):
        a, b, c, d = input.size()  # a=batch size(=1)
        # b=number of feature maps
        # (c,d)=dimensions of a f. map (N=c*d)

        features = input.view(a * b, c * d)  # resize F_XL into \hat F_XL

        G = torch.mm(features, features.t())  # compute the gram product

        # we 'normalize' the values of the gram matrix
        # by dividing by the number of elements in each feature map.
        return G.div(a * b * c * d)

The larger the feature map dimension $N$, the larger the values of the Gram matrix. Therefore, if we don't normalize by $N$, the loss computed at the first layers (before pooling layers) would have much more importance during the gradient descent. We don't want that, since the most interesting style features are in the deepest layers!
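To see the shape and the normalization at work, we can run a random feature map through the module (a minimal check; the sizes are arbitrary):

```python
import torch
import torch.nn as nn

class GramMatrix(nn.Module):  # same module as above
    def forward(self, input):
        a, b, c, d = input.size()
        features = input.view(a * b, c * d)
        G = torch.mm(features, features.t())
        return G.div(a * b * c * d)

gram = GramMatrix()
fmap = torch.rand(1, 64, 32, 32)  # batch of 1, 64 feature maps of 32x32
G = gram(fmap)
print(G.size())  # one correlation value per pair of feature maps: 64x64
# the Gram matrix is symmetric by construction: G[i, j] == G[j, i]
```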

Then the style loss module is implemented in exactly the same way as the content loss module, but we have to add the GramMatrix as a parameter:

class StyleLoss(nn.Module):

    def __init__(self, target, weight):
        super(StyleLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.gram = GramMatrix()
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.output = input.clone()
        self.G = self.gram.forward(input)
        self.G.mul_(self.weight)
        self.loss = self.criterion.forward(self.G, self.target)
        return self.output

    def backward(self, retain_variables=True):
        self.loss.backward(retain_variables=retain_variables)
        return self.loss

Load the neural network

Now, we have to import a pre-trained neural network. As in the paper, we are going to use a pretrained VGG network with 19 layers (VGG19).
PyTorch's implementation of VGG is a module divided into two child Sequential modules: features (containing convolution and pooling layers) and classifier (containing fully connected layers). We are just interested in features:

cnn = models.vgg19(pretrained=True).features

# move it to the GPU if possible:
if use_cuda:
    cnn = cnn.cuda()

A Sequential module contains an ordered list of child modules. For instance, vgg19.features contains a sequence (Conv2d, ReLU, MaxPool2d, Conv2d, ReLU…) aligned in the right order of depth. As we said in the Content loss section, we want to add our style and content loss modules as additive 'transparent' layers in our network, at the desired depths. For that, we construct a new Sequential module, in which we are going to add modules from vgg19 and our loss modules in the right order:

# desired depth layers to compute style/content losses:
content_layers = ['conv_4']
style_layers = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']

# just in order to have an iterable access to our lists of content/style losses
content_losses = []
style_losses = []

model = nn.Sequential()  # the new Sequential module network
gram = GramMatrix()  # we need a gram module in order to compute style targets

# move these modules to the GPU if possible:
if use_cuda:
    model = model.cuda()
    gram = gram.cuda()

# weight associated with content and style losses
content_weight = 1
style_weight = 1000

i = 1
for layer in list(cnn):
    if isinstance(layer, nn.Conv2d):
        name = "conv_" + str(i)
        model.add_module(name, layer)

        if name in content_layers:
            # add content loss:
            target = model.forward(content).clone()
            content_loss = ContentLoss(target, content_weight)
            model.add_module("content_loss_" + str(i), content_loss)
            content_losses.append(content_loss)

        if name in style_layers:
            # add style loss:
            target_feature = model.forward(style).clone()
            target_feature_gram = gram.forward(target_feature)
            style_loss = StyleLoss(target_feature_gram, style_weight)
            model.add_module("style_loss_" + str(i), style_loss)
            style_losses.append(style_loss)

    if isinstance(layer, nn.ReLU):
        name = "relu_" + str(i)
        model.add_module(name, layer)

        if name in content_layers:
            # add content loss:
            target = model.forward(content).clone()
            content_loss = ContentLoss(target, content_weight)
            model.add_module("content_loss_" + str(i), content_loss)
            content_losses.append(content_loss)

        if name in style_layers:
            # add style loss:
            target_feature = model.forward(style).clone()
            target_feature_gram = gram.forward(target_feature)
            style_loss = StyleLoss(target_feature_gram, style_weight)
            model.add_module("style_loss_" + str(i), style_loss)
            style_losses.append(style_loss)

        i += 1

    if isinstance(layer, nn.MaxPool2d):
        name = "pool_" + str(i)
        model.add_module(name, layer)
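From here the tutorial goes on to optimize the input image itself, typically with the L-BFGS optimizer, which requires wrapping the forward/backward passes in a closure. The full loop needs the VGG model built above; the closure pattern alone can be sketched on a toy objective (all names below are stand-ins, not the tutorial's code):

```python
import torch
import torch.optim as optim

# a stand-in 'input image' we optimize, and a target to match;
# in the real loop the target is implicit in the loss modules
input_img = torch.rand(1, 3, 16, 16).requires_grad_(True)
target = torch.zeros(1, 3, 16, 16)

optimizer = optim.LBFGS([input_img])

for step in range(10):
    def closure():
        optimizer.zero_grad()
        # in the real loop this is model(input_img) followed by
        # summing loss.backward() over the style/content loss modules
        loss = torch.nn.functional.mse_loss(input_img, target)
        loss.backward()
        return loss
    optimizer.step(closure)

# after a few L-BFGS steps the image should be close to the target
print(torch.nn.functional.mse_loss(input_img, target).item())
```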