Keras training is slower on the GPU than on the CPU

Asked: 2019-04-24 17:27:48

Tags: python tensorflow keras deep-learning

I only started learning deep learning a few days ago. I am trying to train a CNN on a dataset of cat and dog pictures.

The problem is that training takes far too long on the CPU, and I can see that my CPU usage barely increases while training runs, so the whole process is very slow. I assumed the CPU was the bottleneck, so I went ahead and installed CUDA and tensorflow-gpu, and tensorflow-gpu does seem to detect my graphics card; it prints this to the console:

Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4620 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:1c:00.0, compute capability: 7.5)
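For reference, this is a minimal sketch of how I can check the GPU from Python (assuming the TensorFlow 1.x API that standalone Keras runs on here):

# Sketch: check that TensorFlow (1.x) can actually see and use the GPU
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.test.is_gpu_available())                          # True if a usable CUDA GPU is found
print([d.name for d in device_lib.list_local_devices()])   # e.g. ['/device:CPU:0', '/device:GPU:0']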

But training on the GPU seems to be even slower than on the CPU.

While the model is training it takes up all of the GPU memory, yet GPU utilization stays very low. Could you check what I am doing wrong, and how can I speed this up?
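As far as I know, TensorFlow 1.x reserves almost all GPU memory by default as soon as it starts, so the full memory usage by itself does not mean the GPU is actually busy. A sketch of how that pre-allocation can be turned off (assuming standalone Keras on the TensorFlow 1.x backend):

# Sketch: let TensorFlow allocate GPU memory on demand instead of reserving it all up front
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))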

System specs:

CPU: Ryzen 2700X

GPU: ZOTAC RTX 2060

RAM: 16 GB DDR4 3000 MHz

The code I am using:

import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))

# Compiling the CNN
classifier.compile(optimizer = 'adadelta', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Part 2 - Fitting the CNN to the images

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 35,
                         validation_data = test_set,
                         validation_steps = 20)

print("================== Saving Model ==========================")
classifier.save('./model.h5')
print("=================  Model Saved  ==========================")

Speed:

  20/8000 [..............................] - ETA: 10:47 - loss: 0.0269 - acc: 0.9906

0 Answers