SciPy / Numpy的Pooling / Convolution比Tensorflow的Convolution / Pooling更快?

时间:2018-06-05 22:17:03

标签: python-2.7 tensorflow scipy

我正在尝试使用GPU来加速我的神经网络应用程序(Spiking网络)中的卷积和池化操作。我写了一个小脚本,看看使用Tensorflow可以获得多少加速。令人惊讶的是,SciPy / Numpy做得更好。在我的应用程序中,所有输入(图像)都存储在磁盘上但是作为一个例子,我创建了一个大小为27x27的随机初始化图像并且加权了大小为5x5x30的内核,我确保我是没有将任何东西从CPU传输到GPU,我也将输入图像大小增加到270x270,权重内核增加到7x7x30,我仍然没有看到任何改进。我确保所有TF方法实际上都是通过设置

在我的GPU上执行的

sess =tf.Session(config=tf.ConfigProto(log_device_placement=True))

我可以在群集上访问2个GPU(Tesla K20m)。

这是我的代码:

import tensorflow as tf
import numpy as np
from scipy import signal
import time
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

image_size = 27
kernel_size = 5
nofMaps = 30

def convolution(Image, weights):
    in_channels = 1 # 1 because our image has 1 units in the -z direction. 
    out_channels = weights.shape[-1]
    strides_1d = [1, 1, 1, 1]

    #in_2d = tf.constant(Image, dtype=tf.float32)
    in_2d = Image
    #filter_3d = tf.constant(weights, dtype=tf.float32)
    filter_3d =weights

    in_width = int(in_2d.shape[0])
    in_height = int(in_2d.shape[1])

    filter_width = int(filter_3d.shape[0])
    filter_height = int(filter_3d.shape[1])

    input_4d   = tf.reshape(in_2d, [1, in_height, in_width, in_channels])
    kernel_4d = tf.reshape(filter_3d, [filter_height, filter_width, in_channels, out_channels])
    inter = tf.nn.conv2d(input_4d, kernel_4d, strides=strides_1d, padding='VALID')
    output_3d = tf.squeeze(inter)
    output_3d= sess.run(output_3d)
    return output_3d


def pooling(Image):
    in_channels = Image.shape[-1]
    Image_3d = tf.constant(Image, dtype = tf.float32)
    in_width = int(Image.shape[0])
    in_height = int(Image.shape[1])
    Image_4d = tf.reshape(Image_3d,[1,in_width,in_height,in_channels])
    pooled_pots4d = tf.layers.max_pooling2d(inputs=Image_4d, pool_size=[2, 2], strides=2)
    pooled_pots3d = tf.squeeze(pooled_pots4d)
    return sess.run(pooled_pots3d)


t1 = time.time()
#with tf.device('/device:GPU:1'):
Image = tf.random_uniform([image_size, image_size], name='Image')
weights = tf.random_uniform([kernel_size,kernel_size,nofMaps], name='Weights')
conv_result = convolution(Image,weights)
pool_result = pooling(conv_result)

print('Time taken:{}'.format(time.time()-t1))
#with tf.device('/device:CPU:0'):
print('Pool_result shape:{}'.format(pool_result.shape))
#print('first map of pool result:\n',pool_result[:,:,0])


def scipy_convolution(Image,weights):
    instant_conv1_pots = np.zeros((image_size-kernel_size+1,image_size-kernel_size+1,nofMaps))
    for i in range(weights.shape[-1]):
        instant_conv1_pots[:,:,i]=signal.correlate(Image,weights[:,:,i],mode='valid',method='fft')
    return instant_conv1_pots

def scipy_pooling(conv1_spikes):
    '''
       Reshape splitting each of the two axes into two each such that the
       latter of the split axes is of the same length as the block size.
       This would give us a 4D array. Then, perform maximum finding along those
       latter axes, which would be the second and fourth axes in that 4D array.
       https://stackoverflow.com/questions/41813722/numpy-array-reshaped-but-how-to-change-axis-for-pooling
    '''
    if(conv1_spikes.shape[0]%2!=0): #if array is odd size then omit the last row and col
        conv1_spikes = conv1_spikes[0:-1,0:-1,:]
    else:
        conv1_spikes = conv1_spikes
    m,n = conv1_spikes[:,:,0].shape
    o   = conv1_spikes.shape[-1]
    pool1_spikes = np.zeros((m/2,n/2,o))
    for i in range(o):
        pool1_spikes[:,:,i]=conv1_spikes[:,:,i].reshape(m/2,2,n/2,2).max(axis=(1,3))
    return pool1_spikes
t1 = time.time()
Image = np.random.rand(image_size,image_size)
weights = np.random.rand(kernel_size,kernel_size,nofMaps)
conv_result = scipy_convolution(Image,weights)
pool_result = scipy_pooling(conv_result)
print('Time taken:{}'.format(time.time()-t1))
print('Pool_result shape:{}'.format(pool_result.shape))
#print('first map of pool result:\n',pool_result[:,:,0])
~   

结果如下:

Time taken:0.746644973755
Pool_result shape:(11, 11, 30)
Time taken:0.0127348899841
Pool_result shape:(11, 11, 30)

1 个答案:

答案 0 :(得分:-1)

根据评论者的建议,我设置image_size=270并将两个convolution and pool函数括在for循环中,现在,TF的效果优于SciPy我正在使用的注意事项tf.nn.conv2d而不是tf.layers.conv2d。我还在use_cudnn_on_gpu=True中设置参数tf.nn.conv2d,但这并没有伤害或帮助。

以下是代码:

import tensorflow as tf
import numpy as np
from scipy import signal
import time
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

image_size = 270
kernel_size = 5
nofMaps = 30

def convolution(Image, weights):
    in_channels = 1 # 1 because our image has 1 units in the -z direction. 
    out_channels = weights.shape[-1]
    strides_1d = [1, 1, 1, 1]

    #in_2d = tf.constant(Image, dtype=tf.float32)
    in_2d = Image
    #filter_3d = tf.constant(weights, dtype=tf.float32)
    filter_3d =weights

    in_width = int(in_2d.shape[0])
    in_height = int(in_2d.shape[1])

    filter_width = int(filter_3d.shape[0])
    filter_height = int(filter_3d.shape[1])

    input_4d   = tf.reshape(in_2d, [1, in_height, in_width, in_channels])
    kernel_4d = tf.reshape(filter_3d, [filter_height, filter_width, in_channels, out_channels])
    inter = tf.nn.conv2d(input_4d, kernel_4d, strides=strides_1d, padding='VALID',use_cudnn_on_gpu=True)
    output_3d = tf.squeeze(inter)
    #t1 = time.time()
    output_3d= sess.run(output_3d)
    #print('TF Time for Conv:{}'.format(time.time()-t1))
    return output_3d


def pooling(Image):
    in_channels = Image.shape[-1]
    Image_3d = tf.constant(Image, dtype = tf.float32)
    in_width = int(Image.shape[0])
    in_height = int(Image.shape[1])
    Image_4d = tf.reshape(Image_3d,[1,in_width,in_height,in_channels])
    pooled_pots4d = tf.layers.max_pooling2d(inputs=Image_4d, pool_size=[2, 2], strides=2)
    pooled_pots3d = tf.squeeze(pooled_pots4d)
    #t1 = time.time()
    pool_res = sess.run(pooled_pots3d)
    #print('TF Time for Pool:{}'.format(time.time()-t1))
    return pool_res


#with tf.device('/device:GPU:1'):
Image = tf.random_uniform([image_size, image_size], name='Image')
weights = tf.random_uniform([kernel_size,kernel_size,nofMaps], name='Weights')
#init = tf.global_variables_initializer
#sess.run(init)
t1 = time.time()
for i in range(150):
    #t1 = time.time()
    conv_result = convolution(Image,weights)
    pool_result = pooling(conv_result)
    #print('TF Time taken:{}'.format(time.time()-t1))
print('TF Time taken:{}'.format(time.time()-t1))
#with tf.device('/device:CPU:0'):
print('TF Pool_result shape:{}'.format(pool_result.shape))
#print('first map of pool result:\n',pool_result[:,:,0])


def scipy_convolution(Image,weights):
    instant_conv1_pots = np.zeros((image_size-kernel_size+1,image_size-kernel_size+1,nofMaps))
    for i in range(weights.shape[-1]):
        instant_conv1_pots[:,:,i]=signal.correlate(Image,weights[:,:,i],mode='valid',method='fft')
    return instant_conv1_pots

def scipy_pooling(conv1_spikes):
    '''
       Reshape splitting each of the two axes into two each such that the
       latter of the split axes is of the same length as the block size.
       This would give us a 4D array. Then, perform maximum finding along those
       latter axes, which would be the second and fourth axes in that 4D array.
       https://stackoverflow.com/questions/41813722/numpy-array-reshaped-but-how-to-change-axis-for-pooling
    '''
    if(conv1_spikes.shape[0]%2!=0): #if array is odd size then omit the last row and col
        conv1_spikes = conv1_spikes[0:-1,0:-1,:]
    else:
        conv1_spikes = conv1_spikes
    m,n = conv1_spikes[:,:,0].shape
    o   = conv1_spikes.shape[-1]
    pool1_spikes = np.zeros((m/2,n/2,o))
    for i in range(o):
        pool1_spikes[:,:,i]=conv1_spikes[:,:,i].reshape(m/2,2,n/2,2).max(axis=(1,3))
    return pool1_spikes
Image = np.random.rand(image_size,image_size)
weights = np.random.rand(kernel_size,kernel_size,nofMaps)
t1 = time.time()
for i in range(150):
    conv_result = scipy_convolution(Image,weights)
    pool_result = scipy_pooling(conv_result)
print('Scipy Time taken:{}'.format(time.time()-t1))
print('Scipy Pool_result shape:{}'.format(pool_result.shape))
#print('first map of pool result:\n',pool_result[:,:,0])

以下是结果:

image_size = 27x27
kernel_size = 5x5x30
iterations = 150
TF Time taken:11.0800771713
TF Pool_result shape:(11, 11, 30)
Scipy Time taken:1.4141368866
Scipy Pool_result shape:(11, 11, 30)

image_size = 270x270
kernel_size = 5x5x30
iterations = 150

TF Time taken:26.2359631062
TF Pool_result shape:(133, 133, 30)
Scipy Time taken:31.6651778221
Scipy Pool_result shape:(11, 11, 30)


image_size = 500x500
kernel_size = 5x5x30
iterations = 150

TF Time taken:89.7967050076
TF Pool_result shape:(248, 248, 30)
Scipy Time taken:143.391746044
Scipy Pool_result shape:(248, 248, 30)

在第二种情况下,您可以看到我减少了大约18%的时间。 在第三种情况下,你可以看到我减少了约38%的时间。