Neural artistic style transfer experiments with Keras

Artistic style transfer using neural networks is a technique proposed by Gatys, Ecker and Bethge in the paper arXiv:1508.06576 [cs.CV], which exploits a trained convolutional network in order to reconstruct the elements of a picture while adopting the artistic style of a particular painting.

I’ve written a Python program (available in the GitHub repository: https://github.com/giuseppebonaccorso/Neural_Artistic_Style_Transfer), based on Keras and the VGG16/19 convolutional networks, that can be used to perform some experiments. Considering the huge number of variables and parameters, this kind of problem is very sensitive to the initial conditions, and a different starting state can lead to a different minimum whose content doesn’t meet our requirements. In the script, it’s possible to choose among six initial canvas types (a minimal sampling sketch follows the list):

  • Random: RGB random pixels from a uniform distribution
  • Random from style: random pixels sampled from the painting
  • Random from picture: random pixels sampled from the picture
  • Style: the full original content of the painting
  • Picture: the full original content of the picture
  • Custom: picture/canvas loaded from disk
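
As a minimal sketch of the sampling idea (mine, not the repository code), assuming a NumPy array with shape (height, width, 3), the “random from style/picture” canvases can be generated like this:

import numpy as np

def random_from_image(source, rng=np.random):
    # Canvas with the same shape as source (H x W x 3), filled with pixels
    # drawn uniformly at random from source itself
    h, w = source.shape[0], source.shape[1]
    ys = rng.randint(0, h, size=(h, w))
    xs = rng.randint(0, w, size=(h, w))
    return source[ys, xs, :]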

These different canvases can be used in (at least) three different “templates” (the naming is mine), whose parameter values are also defined in the script (a preset summary follows the list):

  • Style and picture over random canvas: this approach uses a “random-from-style” canvas in order to rebuild all the picture elements using a specific artistic style (shapes, traits and colors)
  • Style over picture: starting from the picture, this method can be used to rebuild the original content, keeping most of the shapes but adopting the artistic traits and colors of a particular painting
  • Picture over style: starting from the painting, this approach tries to merge both contents in an oneiric fashion where the artistic style is predominant and the original picture drives a complex rearrangement of many painting shapes so that they accommodate its own main elements
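
As a compact reference (a sketch of mine, using the parameter values adopted in the experiments below), the three templates correspond to these presets of the script’s parameters:

# Presets for the three "templates" (values as in the experiments below)
TEMPLATES = {
    'style_and_picture_over_random': dict(canvas='random_from_style',
                                          alpha_style=1.0, alpha_picture=0.25,
                                          picture_layer='block4_conv1'),
    'style_over_picture': dict(canvas='picture',
                               alpha_style=0.0025, alpha_picture=1.0,
                               picture_layer='block4_conv1'),
    'picture_over_style': dict(canvas='style',
                               alpha_style=0.001, alpha_picture=1.0,
                               picture_layer='block5_conv1'),
}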

All the experiments have been performed using both a standard stochastic gradient descent approach (with momentum or RMSProp) and two different minimization algorithms: limited-memory BFGS (L-BFGS) and conjugate gradient (CG). L-BFGS is the fastest but often leads to a premature loss of precision (in particular with 32-bit floats on GPU). My favorite choice is CG, but it’s possible to try the other algorithms already implemented in SciPy.
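
Since the script delegates the minimization to scipy.optimize.minimize, switching algorithms only requires changing the method name. A self-contained toy example (a simple quadratic loss, not the style-transfer one):

import numpy as np
from scipy.optimize import minimize

def f(x):
    return np.sum((x - 3.0) ** 2)   # toy loss with minimum at x = 3

def g(x):
    return 2.0 * (x - 3.0)          # analytic gradient passed as jac

for method in ('CG', 'L-BFGS-B'):
    result = minimize(fun=f, x0=np.zeros(4), jac=g, method=method)
    print(method, result.x)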

The convolutional network is VGG16, and the picture is always the same Tübingen panorama used in the paper and also shown on the Wikipedia page. “Alpha-style” and “Alpha-content” are the weights of the painting and picture contents in the global loss function; please read the paper for further and more detailed information. The VGG16 layers used for the style loss evaluation are block1_conv1, block2_conv1, block3_conv1, block4_conv1 and block5_conv1. Even though there are no dimension constraints, all the images are 512 x 512 px, in order to speed up the process.
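
For reference, the style loss compares Gram matrices of the convolutional feature maps (see the gramian method in the code below); a minimal NumPy equivalent for a single (H, W, C) feature map could be:

import numpy as np

def gram_matrix(features):
    # features: (H, W, C) activations of one convolutional layer; flatten
    # the spatial dimensions and correlate the C filter responses
    h, w, c = features.shape
    f = features.reshape(h * w, c)
    return np.dot(f.T, f)           # (C x C) Gram matrix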

First experiment: V. van Gogh, The Starry Night

Style and picture over random canvas (Alpha-style = 1.0, Alpha-content = 0.25, Picture layer = ‘block4_conv1’):

[Image: Starry night]

9 CG iterations with a total number of 4500 steps

Style over picture (Alpha-style = 0.0025, Alpha-content = 1.0, Picture layer = ‘block4_conv1’):

[Image: Starry night 2]

28 CG iterations with a total number of 13000 steps

Picture over style (Alpha-style = 0.001, Alpha-content = 1.0, Picture layer = ‘block5_conv1’):

19 CG iterations with a total number of 3120 steps

Second experiment: G. Braque, Port en Normandie

Style and picture over random canvas (Alpha-style = 1.0, Alpha-content = 0.25, Picture layer = ‘block4_conv1’):

[Image: Port 1]

20 CG iterations with a total number of 6400 steps

Style over picture (Alpha-style = 0.0025, Alpha-content = 1.0, Picture layer = ‘block4_conv1’):

[Image: Port 2]

2 CG iterations with a total number of 320 steps

Picture over style (Alpha-style = 0.001, Alpha-content = 1.0, Picture layer = ‘block5_conv1’):

[Image: Port 3]

42 CG iterations with a total number of 2300 steps

Third experiment: S. Dalí, Ship with Butterfly Sails

Style and picture over random canvas (Alpha-style = 1.0, Alpha-content = 0.25, Picture layer = ‘block4_conv1’):

[Image: Ship 1]

15 CG iterations with a total number of 2315 steps

Style over picture (Alpha-style = 0.0025, Alpha-content = 1.0, Picture layer = ‘block4_conv1’):

[Image: Dalí 2]

3 CG iterations with a total number of 1000 steps

Picture over style (Alpha-style = 0.001, Alpha-content = 1.0, Picture layer = ‘block5_conv1’):

[Image: Dalí 3]

15 CG iterations with a total number of 3600 steps

Source code (Python 2.7, Keras, SciPy):

'''
Neural artistic styler
Based on: Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, "A Neural Algorithm of Artistic Style", arXiv:1508.06576
Examples: https://www.bonaccorso.eu
See also: https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py
Giuseppe Bonaccorso (https://www.bonaccorso.eu)
'''

from __future__ import print_function

import numpy as np
import math
import os
import keras.backend as K

from keras.applications import vgg16, vgg19
from keras.applications.imagenet_utils import preprocess_input

from scipy.optimize import minimize
from scipy.misc import imread, imsave, imresize

# Set random seed (for reproducibility)
np.random.seed(1000)


class NeuralStyler(object):
    def __init__(self, picture_image_filepath, style_image_filepath, destination_folder,
                 weights_filepath=None,
                 alpha_picture=1.0, alpha_style=0.00001,
                 save_every_n_steps=20,
                 verbose=False,
                 convnet='VGG16',
                 picture_layer='block5_conv1',
                 style_layers=('block1_conv1',
                               'block2_conv1',
                               'block3_conv1',
                               'block4_conv1',
                               'block5_conv1')):
        '''
        Artistic neural styler based on VGG16/19 convolutional network
        Based on: Leon A. Gatys, Alexander S. Ecker, Matthias Bethge,
                  "A Neural Algorithm of Artistic Style", arXiv:1508.06576
                  
        Parameters
        ----------
        :param picture_image_filepath: Content file path
        :param style_image_filepath: Style file path
        :param destination_folder: Result destination folder
        :param weights_filepath: Optional VGG19 weights filepath (HDF5)
        :param alpha_picture: Content loss function weight
        :param alpha_style: Style loss function weight
        :param save_every_n_steps: Save a picture every n optimization steps
        :param verbose: Print loss function values
        :param convnet: One of: VGG16 or VGG19
        :param picture_layer: Convnet layer used in the content loss function
        :param style_layers: Convnet layers used in the style loss function
        
        Usage examples
        ----------
        Picture and style over random:
            canvas='random_from_style'
            alpha_style=1.0
            alpha_picture=0.25
            picture_layer='block4_conv1' (both VGG16 and VGG19)
        Style over picture:
            canvas='picture'
            alpha_style=0.0025
            alpha_picture=1.0
            picture_layer='block4_conv1' (both VGG16 and VGG19)
        Picture over style:
            canvas='style'
            alpha_style=0.001
            alpha_picture=1.0
            picture_layer='block5_conv1' (both VGG16 and VGG19)
        '''

        if picture_image_filepath is None or style_image_filepath is None or destination_folder is None:
            raise ValueError('Picture, style image or destination filepath is/are missing')

        if convnet not in ('VGG16', 'VGG19'):
            raise ValueError('Convnet must be one of: VGG16 or VGG19')

        self.picture_image_filepath = picture_image_filepath
        self.style_image_filepath = style_image_filepath
        self.destination_folder = destination_folder

        self.alpha_picture = alpha_picture
        self.alpha_style = alpha_style
        self.save_every_n_steps = save_every_n_steps
        self.verbose = verbose

        self.layers = style_layers if picture_layer in style_layers else style_layers + (picture_layer,)

        self.iteration = 0
        self.step = 0
        self.styled_image = None

        # Create convnet
        print('Creating convolutional network')
        if convnet == 'VGG16':
            convnet = vgg16.VGG16(include_top=False, weights='imagenet' if weights_filepath is None else None)
        else:
            convnet = vgg19.VGG19(include_top=False, weights='imagenet' if weights_filepath is None else None)

        if weights_filepath is not None:
            print('Loading model weights from: %s' % weights_filepath)
            convnet.load_weights(filepath=weights_filepath)

        # Convnet output function
        self.get_convnet_output = K.function(inputs=[convnet.layers[0].input],
                                             outputs=[convnet.get_layer(t).output for t in self.layers])

        # Load picture image
        original_picture_image = imread(picture_image_filepath)

        self.image_shape = (original_picture_image.shape[0], original_picture_image.shape[1], 3)
        self.e_image_shape = (1,) + self.image_shape
        self.picture_image = self.pre_process_image(original_picture_image.reshape(self.e_image_shape).astype(K.floatx()))

        print('Loading picture: %s (%dx%d)' % (self.picture_image_filepath,
                                               self.picture_image.shape[2],
                                               self.picture_image.shape[1]))

        picture_tensor = K.variable(value=self.get_convnet_output([self.picture_image])[self.layers.index(picture_layer)])

        # Load style image
        original_style_image = imread(self.style_image_filepath)

        print('Loading style image: %s (%dx%d)' % (self.style_image_filepath,
                                                   original_style_image.shape[1],
                                                   original_style_image.shape[0]))

        # Check for style image size
        if (original_style_image.shape[0] != self.picture_image.shape[1]) or \
                (original_style_image.shape[1] != self.picture_image.shape[2]):
            # Resize image
            print('Resizing style image to match picture size: (%dx%d)' %
                  (self.picture_image.shape[2], self.picture_image.shape[1]))

            original_style_image = imresize(original_style_image,
                                            size=(self.picture_image.shape[1], self.picture_image.shape[2]),
                                            interp='lanczos')

        self.style_image = self.pre_process_image(original_style_image.reshape(self.e_image_shape).astype(K.floatx()))

        # Create style tensors
        style_outputs = self.get_convnet_output([self.style_image])
        style_tensors = [self.gramian(o) for o in style_outputs]

        # Compute loss function(s)
        print('Compiling loss and gradient functions')

        # Picture loss function
        picture_loss_function = 0.5 * K.sum(K.square(picture_tensor - convnet.get_layer(picture_layer).output))

        # Style loss function
        style_loss_function = 0.0
        style_loss_function_weight = 1.0 / float(len(style_layers))

        for i, style_layer in enumerate(style_layers):
            style_loss_function += \
                (style_loss_function_weight *
                 (1.0 / (4.0 * (style_outputs[i].shape[1] ** 2.0) * (style_outputs[i].shape[3] ** 2.0))) *
                 K.sum(K.square(style_tensors[i] - self.gramian(convnet.get_layer(style_layer).output))))

        # Composite loss function
        composite_loss_function = (self.alpha_picture * picture_loss_function) + \
                                  (self.alpha_style * style_loss_function)

        loss_function_inputs = [convnet.get_layer(l).output for l in self.layers]
        loss_function_inputs.append(convnet.layers[0].input)

        self.loss_function = K.function(inputs=loss_function_inputs,
                                        outputs=[composite_loss_function])

        # Composite loss function gradient
        loss_gradient = K.gradients(loss=composite_loss_function, variables=[convnet.layers[0].input])

        self.loss_function_gradient = K.function(inputs=[convnet.layers[0].input],
                                                 outputs=loss_gradient)

    def fit(self, iterations=100, canvas='random', canvas_image_filepath=None, optimization_method='CG'):
        '''
        Create styled image
        :param iterations: Number of optimization iterations
        :param canvas: One of:
            'random': RGB random image
            'random_from_style': random image generated from style image pixels
            'random_from_picture': random image generated from picture pixels
            'style': Style image
            'picture': Picture
            'custom': Custom image specified by the canvas_image_filepath parameter
        :param canvas_image_filepath: Canvas initial image file path
        :param optimization_method: Optimization method passed to SciPy optimize
            (See https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.optimize.minimize.html
            for further information)
            Allowed options are:
                - Nelder-Mead
                - Powell
                - CG (Default)
                - BFGS
                - Newton-CG
                - L-BFGS-B
                - TNC
                - COBYLA
                - SLSQP
                - dogleg
                - trust-ncg
        '''

        if canvas not in ('random', 'random_from_style', 'random_from_picture', 'style', 'picture', 'custom'):
            raise ValueError('Canvas must be one of: random, random_from_style, '
                             'random_from_picture, style, picture, custom')

        # Generate random image
        if canvas == 'random':
            self.styled_image = self.pre_process_image(np.random.uniform(0, 256,
                                                                         size=self.e_image_shape).astype(K.floatx()))
        elif canvas == 'style':
            self.styled_image = self.style_image.copy()
        elif canvas == 'picture':
            self.styled_image = self.picture_image.copy()
        elif canvas == 'custom':
            self.styled_image = self.pre_process_image(imread(canvas_image_filepath).
                                                       reshape(self.e_image_shape).astype(K.floatx()))
        else:
            self.styled_image = np.ndarray(shape=self.e_image_shape)

            for x in range(self.picture_image.shape[2]):
                for y in range(self.picture_image.shape[1]):
                    x_p = np.random.randint(0, self.picture_image.shape[2] - 1)
                    y_p = np.random.randint(0, self.picture_image.shape[1] - 1)
                    self.styled_image[0, y, x, :] = \
                        self.style_image[0, y_p, x_p, :] if canvas == 'random_from_style' \
                        else self.picture_image[0, y_p, x_p, :]

        bounds = None

        # Set bounds if the optimization method supports them
        if optimization_method in ('L-BFGS-B', 'TNC', 'SLSQP'):
            bounds = np.ndarray(shape=(self.styled_image.flatten().shape[0], 2))
            bounds[:, 0] = -128.0
            bounds[:, 1] = 128.0

        print('Starting optimization with method: %r' % optimization_method)

        for _ in range(iterations):
            self.iteration += 1

            if self.verbose:
                print('Starting iteration: %d' % self.iteration)

            minimize(fun=self.loss, x0=self.styled_image.flatten(), jac=self.loss_gradient,
                     callback=self.callback, bounds=bounds, method=optimization_method)

            self.save_image(self.styled_image)

    def loss(self, image):
        outputs = self.get_convnet_output([image.reshape(self.e_image_shape).astype(K.floatx())])
        outputs.append(image.reshape(self.e_image_shape).astype(K.floatx()))

        v_loss = self.loss_function(outputs)[0]

        if self.verbose:
            print('\tLoss: %.2f' % v_loss)

        # Check whether loss has become NaN
        if math.isnan(v_loss):
            print('NaN Loss function value')

        return v_loss

    def loss_gradient(self, image):
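        # SciPy expects a flat float64 gradient with the same size as x0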
        gradient = self.loss_function_gradient([image.reshape(self.e_image_shape).astype(K.floatx())])
        return np.array(gradient).astype('float64').flatten()

    def callback(self, image):
        self.step += 1
        self.styled_image = image.copy()

        if self.verbose:
            print('Optimization step: %d (iteration: %d)' % (self.step, self.iteration))

        if self.step == 1 or self.step % self.save_every_n_steps == 0:
            self.save_image(image)

    def save_image(self, image):
        imsave(os.path.join(self.destination_folder, 'img_' + str(self.step) + '_' + str(self.iteration) + '.jpg'),
               self.post_process_image(image.reshape(self.e_image_shape).copy()))

    @staticmethod
    def gramian(filters):
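        # Flatten each channel of the (1, H, W, C) feature tensor into a row
        # and return the (C x C) matrix of inner products between filter maps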
        c_filters = K.batch_flatten(K.permute_dimensions(K.squeeze(filters, axis=0), pattern=(2, 0, 1)))
        return K.dot(c_filters, K.transpose(c_filters))

    @staticmethod
    def pre_process_image(image):
        return preprocess_input(image)

    @staticmethod
    def post_process_image(image):
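        # Undo the BGR mean subtraction applied by preprocess_input, restore
        # the RGB channel order and clip to the valid [0, 255] pixel range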
        image[:, :, :, 0] += 103.939
        image[:, :, :, 1] += 116.779
        image[:, :, :, 2] += 123.68
        return np.clip(image[:, :, :, ::-1], 0, 255).astype('uint8')[0]

Code snippets:

neural_styler = NeuralStyler(picture_image_filepath='imgGB.jpg',
                             style_image_filepath='imgMagritte.jpg',
                             destination_folder='destination_folder',
                             alpha_picture=0.4,
                             alpha_style=0.6,
                             verbose=True)

neural_styler.fit(canvas='picture', optimization_method='L-BFGS-B')

# --------

neural_styler = NeuralStyler(picture_image_filepath='imgGB.jpg',
                             style_image_filepath='imgMagritte.jpg',
                             destination_folder='destination_folder',
                             alpha_picture=0.25,
                             alpha_style=1.0,
                             picture_layer='block4_conv1',
                             style_layers=('block1_conv1',
                                           'block2_conv1',
                                           'block3_conv1',
                                           'block4_conv1',
                                           'block5_conv1'))
                                               
neural_styler.fit(canvas='random_from_style', optimization_method='CG')

See also:

Keras-based Deepdream experiment based on VGG19: a repository (https://github.com/giuseppebonaccorso/keras_deepdream) with a Jupyter notebook containing a Deepdream (https://github.com/google/deepdream) experiment created with Keras and a pre-trained VGG19 convolutional network. The experiment (which is a work in progress) is based on some suggestions provided by the Deepdream team in their blog post, but works in a slightly different way.