Why is multi_gpu_model in tf.keras much slower than multi_gpu_model in keras?

Time: 2019-07-12 13:32:19

Tags: python tensorflow machine-learning keras multi-gpu

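multi_gpu_model in tf.keras appears to be much slower than multi_gpu_model in keras. For the example given here, training is about 12 times slower when the imports come from tensorflow.keras instead of keras.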

import tensorflow as tf
from tensorflow.keras.applications import Xception
from tensorflow.keras.utils import multi_gpu_model
import numpy as np

num_samples = 1000
height = 224
width = 224
num_classes = 1000

# Instantiate the base model (or "template" model).
# We recommend doing this under a CPU device scope,
# so that the model's weights are hosted on CPU memory.
# Otherwise they may end up hosted on a GPU, which would
# complicate weight sharing.
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)

# Replicates the model on 4 GPUs.
# This assumes that your machine has 4 available GPUs.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# Generate dummy data.
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

# This `fit` call will be distributed on 4 GPUs.
# Since the batch size is 256, each GPU will process 64 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)

# Save model via the template model (which shares the same weights):
model.save('my_model.h5')

The only change is using

from tensorflow.keras.applications import Xception
from tensorflow.keras.utils import multi_gpu_model

instead of

from keras.applications import Xception
from keras.utils import multi_gpu_model
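
Because the two import paths can resolve to different Keras code bases, it may help to record which versions are actually installed before comparing timings. The sketch below is only an illustration: `report_versions` is a hypothetical helper name, and it assumes each package exposes a `__version__` attribute. Packages that cannot be imported are reported as `None`.

```python
def report_versions():
    """Return the installed versions of keras, tensorflow and tf.keras.

    A value of None means the package could not be imported.
    (Hypothetical helper, not part of either library.)
    """
    versions = {"keras": None, "tensorflow": None, "tf.keras": None}

    try:
        import keras  # standalone Keras package
        versions["keras"] = getattr(keras, "__version__", "unknown")
    except ImportError:
        pass

    try:
        import tensorflow as tf
        versions["tensorflow"] = getattr(tf, "__version__", "unknown")
        # tf.keras is the Keras copy bundled with TensorFlow; its version
        # string may differ from the standalone package's.
        bundled = getattr(tf, "keras", None)
        versions["tf.keras"] = getattr(bundled, "__version__", "unknown")
    except ImportError:
        pass

    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

If the standalone and bundled version strings differ, the two runs are not exercising the same multi_gpu_model implementation, which is worth stating alongside any timing comparison.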

With tf.keras:

Epoch 1/20
1000/1000 [==============================] - 78s 78ms/step - loss: 3487.2197
Epoch 2/20
1000/1000 [==============================] - 37s 37ms/step - loss: 3454.2403
Epoch 3/20
1000/1000 [==============================] - 37s 37ms/step - loss: 3453.6264
Epoch 4/20
1000/1000 [==============================] - 37s 37ms/step - loss: 3452.7994
Epoch 5/20
1000/1000 [==============================] - 37s 37ms/step - loss: 3452.3592

Importing directly from keras:

Epoch 1/20
1000/1000 [==============================] - 52s 52ms/step - loss: 3486.8955
Epoch 2/20
1000/1000 [==============================] - 3s 3ms/step - loss: 3454.1935
Epoch 3/20
1000/1000 [==============================] - 3s 3ms/step - loss: 3453.5585
Epoch 4/20
1000/1000 [==============================] - 3s 3ms/step - loss: 3452.8249
Epoch 5/20
1000/1000 [==============================] - 3s 3ms/step - loss: 3452.1542

From the second epoch onward, that is roughly a 12x speed difference. I am using keras 2.2.4 (the latest release) and tensorflow-gpu 1.10.
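
As a rough check of the 12x figure, the arithmetic can be sketched from the epoch times in the two logs. The lists below are copied from epochs 2 to 5 of each log; the variable names are made up for illustration. Since the logged times are rounded to whole seconds, the result is approximate.

```python
# Steady-state epoch times in seconds, taken from epochs 2-5 of each log.
tf_keras_epoch_s = [37, 37, 37, 37]
keras_epoch_s = [3, 3, 3, 3]
num_samples = 1000  # samples per epoch, as in the script


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Ratio of average epoch times: how much slower tf.keras is.
slowdown = mean(tf_keras_epoch_s) / mean(keras_epoch_s)

# Samples processed per second in each configuration.
tf_keras_throughput = num_samples / mean(tf_keras_epoch_s)
keras_throughput = num_samples / mean(keras_epoch_s)

print(f"slowdown: {slowdown:.1f}x")                      # slowdown: 12.3x
print(f"tf.keras: {tf_keras_throughput:.0f} samples/s")  # tf.keras: 27 samples/s
print(f"keras:    {keras_throughput:.0f} samples/s")     # keras:    333 samples/s
```

The first epoch is left out because it includes one-time setup cost in both runs (78s and 52s).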

0 answers:

No answers