Model hangs when using MirroredStrategy()

Asked: 2020-09-06 17:18:07

Tags: tensorflow keras deep-learning cudnn multiple-gpu

I am trying to train a Keras model on multiple GPUs (2) using TensorFlow's MirroredStrategy(). However, it results in the following error:

Epoch 1/5
WARNING:tensorflow:From /home/user/conda36/lib/python3.6/site-packages/tensorflow/python/data/ops/multi_device_iterator_ops.py:601: get_next_as_optional (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Iterator.get_next_as_optional()` instead.
2020-09-06 18:50:54.766930: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-06 18:50:55.069925: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-06 18:50:56.049400: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at depthwise_conv_op.cc:386 : Invalid argument: Computed output size would be negative: -1 [input_size: 3, effective_filter_size: 5, stride: 1]

At this point the model freezes: the process is still running, but it makes no further progress.
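For reference, the -1 in the last log line matches the usual output-size formula for a convolution without padding. The sketch below only reproduces that arithmetic. The helper name conv_output_size is hypothetical, and treating the failing op as a 'VALID'-padded convolution is an assumption based on the numbers in the log, not something confirmed by the model code. It does not explain why the op only fails under MirroredStrategy().

```python
def conv_output_size(input_size, filter_size, stride=1, dilation=1, padding="VALID"):
    """Output length along one spatial dimension of a convolution.

    Hypothetical helper for illustration; it follows the commonly used
    convention for 'VALID' and 'SAME' padding and is not a TensorFlow API.
    """
    # A dilated filter covers (filter_size - 1) * dilation + 1 input positions.
    effective_filter_size = (filter_size - 1) * dilation + 1
    if padding == "SAME":
        # The input is padded so that the output is ceil(input_size / stride).
        return (input_size + stride - 1) // stride
    # 'VALID': no padding, so the filter must fit entirely inside the input.
    return (input_size - effective_filter_size + stride) // stride


# Values taken from the log: input_size 3, effective_filter_size 5, stride 1.
print(conv_output_size(3, 5, stride=1))                  # -1
print(conv_output_size(3, 5, stride=1, padding="SAME"))  # 3
```

In other words, somewhere a filter spanning 5 positions is applied without padding to a dimension of size 3, so checking the shape of the tensor that reaches the depthwise convolution may be a reasonable first step.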

If I run it without MirroredStrategy(), it runs fine, but of course it then uses only 1 GPU.

This is how I use MirroredStrategy():

def get_model():
    # ... (layers omitted in the original post)
    decoded = Conv2D(1, (3, 3), activation='linear', padding='same')(d)

    autoencoder = Model(input_img, decoded)
    autoencoder.summary()
    autoencoder.compile(optimizer='Adagrad', loss='mean_squared_error')
    return autoencoder

if __name__ == '__main__':
    # Model
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        autoencoder = get_model()

What could be causing this?

0 answers:

No answers yet