Question

When I try to train an image detector on my own images using a TPU on Google Colab, I get this error:

From /job:worker/replica:0/task:0: Compilation failure: Asked to propagate a dynamic dimension from hlo %convert.283 = f32[1,80,80,32]{3,2,1,0} convert(f32[1,80,80,32]{3,2,1,0} %add.1), metadata={op_type="FusedBatchNorm" op_name="bn_Conv1_3/FusedBatchNorm"}@{}@0 to hlo %clamp.288 = f32[1,80,80,32]{3,2,1,0} clamp(f32[1,80,80,32]{3,2,1,0} %broadcast.286, f32[1,80,80,32]{3,2,1,0} %convert.283, f32[1,80,80,32]{3,2,1,0} %broadcast.287), metadata={op_type="Relu6" op_name="Conv1_relu_3/Relu6"}, which is not implemented. TPU compilation failed [[node TPUReplicateMetadata_1 (defined at :24)]]

Here is a link to the code:

https://drive.google.com/open?id=1mPiod1At85RgNwHx4vYFxH38Ck16Ep1m

Do you have any idea what is going on?

From what I have read, it should not be an image-size or batch-size problem.

Thanks.

Answer 1

I think the problem is with your labels. Try the following code:

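import tensorflow as tf

# One-hot encode the integer labels (to_categorical returns float32 arrays by default).
y_train = tf.keras.utils.to_categorical(labels, NUM_CLASSES)
y_test = tf.keras.utils.to_categorical(labelstest, NUM_CLASSES)
# Add a zero tensor of fixed shape [NUM_CLASSES]; its dtype must match the labels.
zeros = tf.zeros([NUM_CLASSES], y_train.dtype)
y_train = tf.math.add(y_train, zeros)
y_test = tf.math.add(y_test, zeros)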
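
To see what the one-hot step produces, here is a minimal pure-Python sketch of the same idea. It is not the Keras implementation: the one_hot helper and the sample labels are made up for illustration. It only shows that every encoded row has exactly NUM_CLASSES entries, and that a label outside the valid range is rejected early.

```python
def one_hot(labels, num_classes):
    """Return one row of length num_classes per label, with 1.0 at the label's index."""
    rows = []
    for label in labels:
        # A label outside [0, num_classes) cannot be encoded; fail loudly.
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside [0, {num_classes})")
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Hypothetical values, for illustration only.
NUM_CLASSES = 3
labels = [0, 2, 1, 2]

encoded = one_hot(labels, NUM_CLASSES)
for row in encoded:
    print(row)
```

If a check like this raises on your real labels, that would point to the labels rather than the image size or batch size.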

How do I resolve the "failed to propagate dynamic dimension" error in tf.keras when using a TPU?

1 answer: