Keras with TensorFlow backend allocates GPU memory but does not use the GPU

Asked: 2017-03-31 10:36:04

Tags: python tensorflow keras

I am using Keras with the TensorFlow backend. When I run nvidia-smi, I can see that the Python process allocates memory on the GPU, but it does not seem to actually use the GPU. On top of that, the computation runs very slowly (~300 s instead of the expected ~15 s). I am using a GTX 980.

My Python 3 code:

# coding: utf-8

# ## Set up Libraries

import keras as K
import numpy as np

from keras.layers import Activation, Dense, Flatten, Lambda
from keras.models import Sequential
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator


# ## Code

def one_hot(x):
    matrix = np.zeros([x.size, np.max(x) + 1])
    matrix[np.arange(x.size), x] = 1
    return matrix


# ## Prepare Data

from keras.datasets import mnist
(X_train, Y_train_raw), (X_test, Y_test_raw) = mnist.load_data()


# Add a channel dimension to the images; the labels stay one-dimensional
# so that one_hot receives a flat integer array.
X_train = np.expand_dims(X_train, 3)
X_test = np.expand_dims(X_test, 3)


mnist_mean = X_train.mean().astype(np.float32)
mnist_stddev = X_train.std().astype(np.float32)
def normalize_mnist_input(x):
    return (x - mnist_mean) / mnist_stddev


Y_train = one_hot(Y_train_raw)
Y_test = one_hot(Y_test_raw)


X_valid = X_train[50000:]
Y_valid = Y_train[50000:]
X_train = X_train[0:50000]
Y_train = Y_train[0:50000]


# ## Fit Simple Model

def linear_model():
    model = Sequential([
        Lambda(normalize_mnist_input, input_shape=(28, 28, 1)),
        Flatten(),
        Dense(10, activation="softmax")
    ])
    model.compile(Adam(), loss="categorical_crossentropy", metrics=['accuracy'])
    return model

linear_model = linear_model()


image_generator = ImageDataGenerator()
train_batches = image_generator.flow(X_train, Y_train, batch_size=64)
test_batches = image_generator.flow(X_test, Y_test, batch_size=64)


linear_model.fit_generator(train_batches, train_batches.n, 
                           validation_data=test_batches, validation_steps=test_batches.n, 
                           epochs=1)
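One thing I am unsure about (an assumption on my part): in Keras 2 the second positional argument of fit_generator is steps_per_epoch, counted in batches, whereas Keras 1 took samples_per_epoch, counted in samples. If pip installed Keras 2, passing train_batches.n (the sample count) would make each epoch run roughly 64 times more batches than intended, which could by itself account for a large slowdown. The step counts I believe are intended, computed in isolation:

```python
# Sketch: steps counted in batches (Keras 2 semantics, assumed), derived
# from the sample counts and the batch_size of 64 used above.
n_train = 50000    # training samples after the validation split
n_test = 10000     # MNIST test samples
batch_size = 64

# Ceiling division so the final partial batch is still drawn.
steps_per_epoch = (n_train + batch_size - 1) // batch_size
validation_steps = (n_test + batch_size - 1) // batch_size
print(steps_per_epoch, validation_steps)  # 782 157
```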

When I run this test script, it does use the GPU:

import tensorflow as tf

# Creates a graph.
with tf.device('/gpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

I am also using nvidia-docker, and it works when I run the test Docker image.

My Dockerfile looks essentially like this:

FROM nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04

### basic utilities
RUN apt-get update && \
    apt-get --assume-yes upgrade && \
    apt-get --assume-yes install binutils build-essential curl gcc git g++ \
        libfreetype6-dev libpng12-dev libzmq3-dev pkg-config make nano rsync \
        software-properties-common unzip wget && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*



### anaconda & tensorflow
RUN cd /tmp && \
    wget "https://repo.continuum.io/archive/Anaconda3-4.3.1-Linux-x86_64.sh" -O "Anaconda.sh" && \
    bash "Anaconda.sh" -b && \
    echo "export PATH=\"$HOME/anaconda3/bin:\$PATH\"" >> ~/.bashrc && \
    export PATH="$HOME/anaconda3/bin:$PATH" && \
    conda install -y bcolz && \
    conda upgrade -y --all && \
    pip install Pillow && \
    pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp36-cp36m-linux_x86_64.whl && \
    pip install keras

Why doesn't my Python script at the top use the GPU properly? I can see that it allocates about 3 gigabytes of GPU memory, but it does not seem to use it for any processing.
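One way to double-check whether TensorFlow registered the GPU at all, independent of Keras (a sketch; device_lib is a non-public module, but it is present in TensorFlow 1.0):

```python
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; a healthy GPU setup shows the CPU
# plus a '/gpu:0' (or '/device:GPU:0') entry for the GTX 980.
devices = device_lib.list_local_devices()
for d in devices:
    print(d.name, d.device_type)
```

If only a CPU entry appears here, the problem is in the TensorFlow/CUDA installation rather than in the Keras code.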

0 answers:

No answers yet.