Question

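I am trying to reproduce the multi-GPU version of the code with a slightly modified ResNet model architecture (everything else is unchanged), as shown at https://github.com/FlyEgle/keras-yolo3, in train_height_point.py. Direct link: https://github.com/FlyEgle/keras-yolo3/blob/master/train_height_point.py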

The error appears to come from the yolo_loss function.

I tried modifying the while_loop, as well as other tricks suggested in related Stack Overflow questions ("Gradients error using TensorArray Tensorflow" and "TensorArray TensorArray_1_0: Could not read from TensorArray index 0 because it has not yet been written to") and in https://github.com/tensorflow/tensorflow/issues/3663

When I run the code, I get the following error during the first epoch:

Train on 62880 samples, val on 6976 samples, with batch size 1.
Epoch 1/400
2019-06-28 18:39:30.247036: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_3: Could not read from TensorArray index 0.  Furthermore, the element shape is not fully defined: [?,?,3].  It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written.  If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-06-28 18:39:30.251868: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_1_4: Could not read from TensorArray index 0.  Furthermore, the element shape is not fully defined: [?,?,3].  It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written.  If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-06-28 18:39:30.251942: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at tensor_array_ops.cc:661 : Invalid argument: TensorArray replica_0/model_3/yolo_loss/TensorArray_2_5: Could not read from TensorArray index 0.  Furthermore, the element shape is not fully defined: [?,?,3].  It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written.  If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
2019-06-28 18:39:31.368047: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
Traceback (most recent call last):
  File "train.py", line 517, in <module>
    _main()
  File "train.py", line 177, in _main
    callbacks=[logging, lr_schedule, checkpoint]
  File "/opt/conda/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/opt/conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/opt/conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))

tensorflow.python.framework.errors_impl.InvalidArgumentError: TensorArray replica_0/model_3/yolo_loss/TensorArray_3: Could not read from TensorArray index 0.  Furthermore, the element shape is not fully defined: [?,?,3].  It is possible you are working with a resizeable TensorArray and stop_gradients is not allowing the gradients to be written.  If you set the full element_shape property on the forward TensorArray, the proper all-zeros tensor will be returned instead of incurring this error.
     [[{{node replica_0/model_3/yolo_loss/TensorArrayStack/TensorArrayGatherV3}}]]
     [[{{node loss/add_20}}]]

Answer 1

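According to the stack trace above, you need to pass an argument named element_shape that is fully defined, for example element_shape=(10, 10, 10), rather than leaving it as None or using a partially defined shape such as element_shape=(None, 10, 10). It seems that unknown dimensions are not allowed.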
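
As a rough illustration of the suggestion above, here is a minimal sketch of a per-sample loop that writes into a tf.TensorArray created with a fully defined element_shape. It is not the repository's yolo_loss: the function name stack_ignore_masks, the scores input (a stand-in for the best-IoU tensor), the 13x13 grid, and the threshold are all hypothetical, and the sketch assumes TensorFlow 2.x eager execution rather than the TF 1.x graph mode shown in the log.

```python
import tensorflow as tf


def stack_ignore_masks(scores, thresh=0.5, grid=(13, 13), num_anchors=3):
    """Build one mask per sample and stack them.

    scores: float32 tensor of shape [batch, grid_h, grid_w, num_anchors].
    The grid size and anchor count are assumed values for illustration.
    """
    batch = tf.shape(scores)[0]

    # Every dimension of element_shape is a concrete integer (no None),
    # which is what the error message asks for.
    masks = tf.TensorArray(
        dtype=tf.float32,
        size=0,
        dynamic_size=True,
        element_shape=(grid[0], grid[1], num_anchors),
    )

    def body(b, masks):
        # Write the mask for sample b at index b.
        mask = tf.cast(scores[b] < thresh, tf.float32)
        return b + 1, masks.write(b, mask)

    _, masks = tf.while_loop(
        lambda b, masks: b < batch, body, [tf.constant(0), masks]
    )
    return masks.stack()


out = stack_ignore_masks(tf.zeros([2, 13, 13, 3]))
print(out.shape)
```

This only works when the grid size is known when the loss is built, so a model with a variable input size would need a different approach. Separately, the log shows batch size 1 and a replica_0 scope. If the batch is split across replicas so that one replica receives zero samples, the loop body never runs and index 0 is never written, which would be consistent with this error. That is a guess, not something confirmed in this thread.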

I have run into this problem as well and am trying to find a better way to solve it.
