"OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution." error when using MirroredStrategy

Date: 2020-05-12 09:36:02

Tags: tensorflow machine-learning parallel-processing tensorflow2.0

I am trying to train a model on multiple GPUs using MirroredStrategy. I have been following this tutorial: https://www.tensorflow.org/tutorials/distribute/custom_training

I followed the procedure almost exactly, but I still get the following error:

INFO:tensorflow:Error reported to Coordinator: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

I create the dataset like this:

dataset = tf.data.Dataset.from_tensor_slices(list_ds)
train_dataset = dataset.batch(GLOBAL_BATCH_SIZE, drop_remainder=True)
train_dist_dataset = strategy.experimental_distribute_dataset(train_dataset)

list_ds is a list of lists of strings.

This is what my training loop looks like:

@tf.function
def distributed_train_step(dataset_inputs):
    strategy.experimental_run_v2(train_step, args=dataset_inputs)

for epoch in range(EPOCHS):       
    for batch in train_dist_dataset:
        distributed_train_step(batch)

I also ran the exact code provided in the link and it ran successfully, but when I try the same code with my own model, I get the error.
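One possible cause, which the question does not confirm, is the `args=dataset_inputs` argument. The linked tutorial passes a one-element tuple, `args=(dataset_inputs,)`, because the strategy calls the step function as `train_step(*args)`. When `args` is the batch tensor itself, that unpacking iterates over a `tf.Tensor` inside the graph, which matches the reported error. Below is a minimal sketch under that assumption. The toy `list_ds`, the `GLOBAL_BATCH_SIZE` value, and the placeholder `train_step` are hypothetical stand-ins, and `strategy.run` is used on the assumption of TensorFlow 2.2 or later, where it replaces `experimental_run_v2`.

```python
import tensorflow as tf

# Uses all visible GPUs; with none available it runs on a single CPU replica.
strategy = tf.distribute.MirroredStrategy()

GLOBAL_BATCH_SIZE = 4  # illustrative value, not taken from the question

# Hypothetical stand-in for list_ds: a rectangular list of lists of strings.
list_ds = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]

dataset = tf.data.Dataset.from_tensor_slices(list_ds)
train_dataset = dataset.batch(GLOBAL_BATCH_SIZE, drop_remainder=True)
train_dist_dataset = strategy.experimental_distribute_dataset(train_dataset)


def train_step(inputs):
    # Placeholder for the real training step: counts the strings it receives.
    # `inputs` is a single string tensor here, not an (x, y) tuple.
    return tf.size(inputs)


@tf.function
def distributed_train_step(dataset_inputs):
    # Wrap the batch in a one-element tuple so the strategy calls
    # train_step(dataset_inputs) instead of unpacking the tensor itself.
    per_replica = strategy.run(train_step, args=(dataset_inputs,))
    # Sum the per-replica results into one scalar.
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None)


results = [int(distributed_train_step(batch)) for batch in train_dist_dataset]
```

If the error persists after this change, the other place worth checking is `train_step` itself. Each element of this dataset is a single tensor, so a line such as `x, y = inputs` copied from the tutorial would also iterate over a `tf.Tensor`.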

0 answers:

No answers yet.