Theano GPU vs pure Numpy (CPU)

For this benchmark, I used a Windows 10 Pro 64-bit computer with an Intel Core i7-6700HQ (2.60 GHz), 32 GB of RAM, and an NVIDIA GeForce GTX 960M. The programming environment was Python 2.7 (Anaconda distribution) with Jupyter.

The task is very simple (but effective): numerically integrating sin(x) between 0 and π, whose exact value is 2.
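
Written out, the integral and the sum used to approximate it look like this. This is a reconstruction from the constants in the code (a = 0, b = π); the symbols N and Δ are my own shorthand for `precision` and `delta`, and x_i stands for the sample points in `xs`:

```latex
\int_{0}^{\pi} \sin x \, dx = \bigl[ -\cos x \bigr]_{0}^{\pi} = 2,
\qquad
\int_{0}^{\pi} \sin x \, dx \approx \sum_{i=1}^{N} \sin(x_i)\, \Delta,
\quad \Delta = \frac{\pi - 0}{N}
```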

The code I’ve written is the following (the matplotlib plotting calls are omitted, and the data are stored as float32 so that the computation can run on the GPU):

import math
from datetime import datetime

import numpy as np
import matplotlib.pyplot as plt

import theano
import theano.tensor as T
from theano import function, shared

# Define constants
a = 0
b = math.pi
precision = 10000000  # number of sample points (np.linspace needs an integer)
delta = (b - a) / float(precision)

# Define x linear space
xs = np.linspace(a, b, num=precision).astype(np.float32)

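# Define Theano function (no inputs: the data live in shared variables)
xss = shared(xs, 'xss')
deltas = shared(np.float32(delta), 'delta')  # float32, so the product is not upcast to float64
sinvx = T.sum(T.sin(xss) * deltas)
sf = function([], sinvx)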

# Number of iterations
num_executions = 500
execution_times = []

# Theano test
for i in range(num_executions):
    t0 = datetime.now()
    res = sf()
    t1 = datetime.now()

    # Full duration in microseconds (.microseconds alone drops whole seconds)
    execution_times.append((t1 - t0).total_seconds() * 1e6)

et = np.array(execution_times)

print('Theano:')
print('Result: %f' % res)
print('Average execution time: %d (us)' % np.average(et))

execution_times = []
 
# Numpy test
for i in range(num_executions):
    t0 = datetime.now()
    res = (np.sin(xs) * delta).sum()
    t1 = datetime.now()

    # Full duration in microseconds (.microseconds alone drops whole seconds)
    execution_times.append((t1 - t0).total_seconds() * 1e6)

et = np.array(execution_times)

print('Numpy:')
print('Result: %f' % res)
print('Average execution time: %d (us)' % np.average(et))
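
The two timing loops above share the same structure, so they could be factored into one helper. The following is only a sketch of that idea, not the code used for the results in this post: `time_calls` is a hypothetical name, and the pure-Python workload is a small stand-in for `sf()` or the NumPy expression. It uses `timeit.default_timer`, which is available in both Python 2.7 and Python 3.

```python
import math
from timeit import default_timer


def time_calls(fn, repeats):
    """Call fn() `repeats` times; return (last result, durations in microseconds)."""
    durations_us = []
    result = None
    for _ in range(repeats):
        t0 = default_timer()
        result = fn()
        t1 = default_timer()
        durations_us.append((t1 - t0) * 1e6)
    return result, durations_us


# Stand-in workload (an assumption for illustration): a small Riemann sum
# of sin on [0, pi] in pure Python, in place of the Theano or NumPy call.
n = 10000
step = math.pi / n


def workload():
    return sum(math.sin(i * step) * step for i in range(n))


res, times = time_calls(workload, 5)
print('Result: %f' % res)
print('Average execution time: %d (us)' % (sum(times) / len(times)))
```

With this helper, the Theano test would become `time_calls(sf, num_executions)` and the NumPy test `time_calls(lambda: (np.sin(xs) * delta).sum(), num_executions)`.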

The final results are:

Using gpu device 0: GeForce GTX 960M (CNMeM is enabled with initial size: 20.0% of memory, CuDNN 3007)

Theano:
Result: 2.000000
Average execution time: 39690 (us)

Numpy:
Result: 2.000001
Average execution time: 158240 (us)
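
As a quick check of how the two averages compare, the ratio can be computed directly from the numbers printed above. The variable names are mine; the values are copied from the output.

```python
theano_us = 39690.0   # average reported above for Theano
numpy_us = 158240.0   # average reported above for NumPy

ratio = numpy_us / theano_us
extra_time_pct = (numpy_us - theano_us) / theano_us * 100.0

print('NumPy / Theano time ratio: %.2fx' % ratio)                # 3.99x
print('NumPy takes %.0f%% more time than Theano' % extra_time_pct)  # 299%
```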

So, on average, NumPy takes about four times as long as Theano with GPU support (roughly 300% more time). Here are the diagrams:

[Plot: Theano execution times]

[Plot: NumPy execution times]

The spikes are probably due to CPU overload, multitasking, or memory swapping. However, the benchmark clearly favors Theano (I’m going to test TensorFlow as well) as a choice for implementing deep learning algorithms, in particular if you have a good GPU.
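
Because occasional spikes pull the average upward, it can help to look at the median alongside the mean. The sketch below shows one way to do that; `summarize` is a hypothetical helper, and the sample timings are made up for illustration (they are not measurements from this benchmark).

```python
def summarize(times_us):
    """Return (mean, median, max) for a list of timings."""
    ordered = sorted(times_us)
    n = len(ordered)
    mean = sum(ordered) / float(n)
    mid = n // 2
    if n % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    return mean, median, ordered[-1]


# Made-up timings (us): nine ordinary runs and one spike
sample = [100, 102, 98, 101, 99, 100, 103, 97, 100, 1000]
mean, median, worst = summarize(sample)
print('Mean: %.1f us, median: %.1f us, max: %d us' % (mean, median, worst))
# Mean: 190.0 us, median: 100.0 us, max: 1000 us
```

In this example a single spike nearly doubles the mean, while the median stays at the typical value.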
