Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [585,1024,3], [batch]: [600,799,3]

Time: 2019-11-23 10:33:43

Tags: python tensorflow

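I am trying to train a model. Initially my dataset had 5,000 images and training worked fine. I have since added more images, and the dataset now contains 6,423 images. I am using Python 3.6.1 on Ubuntu 18.04, with TensorFlow 1.15 and NumPy 1.16 (the same versions as before, when everything worked). Now, when I run: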

python model_main.py --logtostderr --pipeline_config_path=training/faster_rcnn_resnet50_coco.config --model_dir=training

setup starts, and a few minutes later, after printing the following lines:

INFO:tensorflow:Saving checkpoints for 0 into training/model.ckpt. 
I1123 10:26:21.548237 140482563244160 basic_session_run_hooks.py:606] Saving checkpoints for 0 into training/model.ckpt. 
2019-11-23 10:28:30.801453: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0 

I get the following error message:

2019-11-23 10:08:38.843259: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_3_hash_table_2/N10tensorflow6lookup15LookupInterfaceE does not exist.               
2019-11-23 10:08:38.843323: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_1_hash_table_1/N10tensorflow6lookup15LookupInterfaceE does not exist.               
2019-11-23 10:08:38.843345: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_2_hash_table/N10tensorflow6lookup15LookupInterfaceE does not exist.                 
2019-11-23 10:08:38.851405: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_3_hash_table_2/N10tensorflow6lookup15LookupInterfaceE does not exist.               
2019-11-23 10:08:38.851488: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_1_hash_table_1/N10tensorflow6lookup15LookupInterfaceE does not exist.               
2019-11-23 10:08:38.851512: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_2_hash_table/N10tensorflow6lookup15LookupInterfaceE does not exist.                 
2019-11-23 10:08:38.851807: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_1_hash_table_1/N10tensorflow6lookup15LookupInterfaceE does not exist.               
2019-11-23 10:08:38.851848: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_2_hash_table/N10tensorflow6lookup15LookupInterfaceE does not exist.                 
2019-11-23 10:08:38.851899: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:788 : Not found: Resource localhost/_3_hash_table_2/N10tensorflow6lookup15LookupInterfaceE does not exist.               
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [585,1024,3], [batch]: [600,799,3]
     [[{{node IteratorGetNext}}]]
     [[ToAbsoluteCoordinates_118/Assert/AssertGuard/Assert/data_0/_5709]]
  (1) Invalid argument: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [585,1024,3], [batch]: [600,799,3]
     [[{{node IteratorGetNext}}]]
0 successful operations.
0 derived errors ignored.

and training stops.

3 answers:

Answer 0: (score: 3)

The new images you added appear to have a resolution of 585x1024, which differs from the size the model expects (600x799).

If so, the solution is to resize those new images accordingly.
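One possible explanation for how two such shapes can end up in the same batch, sketched under an assumption the question does not confirm: that the pipeline config uses a keep_aspect_ratio_resizer with min_dimension: 600 and max_dimension: 1024. In that case images with different aspect ratios come out of the resizer with different shapes. The function below is a simplified approximation of that rule, not the library's code, and the source image sizes are hypothetical.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return (new_height, new_width) after an aspect-preserving resize.

    Simplified sketch: the default limits are assumptions, not values
    taken from the question's config file.
    """
    # First try scaling so that the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    new_size = (int(round(height * scale)), int(round(width * scale)))
    # If the longer side now exceeds max_dimension, scale the longer
    # side down to max_dimension instead.
    if max(new_size) > max_dimension:
        scale = max_dimension / max(height, width)
        new_size = (int(round(height * scale)), int(round(width * scale)))
    return new_size


# Hypothetical source sizes (height, width); the real ones are unknown.
print(keep_aspect_ratio_size(1200, 1598))  # (600, 799)
print(keep_aspect_ratio_size(600, 1050))   # (585, 1024)
```

If this is what is happening, the new images are probably wider relative to their height than the original ones, so they hit the max_dimension limit rather than the min_dimension one.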

Answer 1: (score: 1)

Changing batch_size to 1 solved this problem for me.
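To illustrate why a batch size of 1 sidesteps the error, here is a toy model of the shape check that batching performs. It is only a sketch: check_batches is a hypothetical helper, not a TensorFlow function, and the two shapes are the ones from the error message.

```python
def check_batches(shapes, batch_size):
    """Raise ValueError if any batch would mix elements of different shapes."""
    for start in range(0, len(shapes), batch_size):
        batch = shapes[start:start + batch_size]
        for shape in batch[1:]:
            # Every element stacked into one batch must match the first one.
            if shape != batch[0]:
                raise ValueError(
                    "Cannot add tensor to the batch: shapes are "
                    f"[tensor]: {list(shape)}, [batch]: {list(batch[0])}"
                )
    return True


shapes = [(600, 799, 3), (585, 1024, 3)]

# With one image per batch there is nothing to compare against.
print(check_batches(shapes, batch_size=1))

# With two images per batch the mismatch is detected.
try:
    check_batches(shapes, batch_size=2)
except ValueError as err:
    print(err)
```

The trade-off is that training with a single image per step is usually slower, so this works best as a quick workaround.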

Answer 2: (score: 0)

If you need a batch size > 1, you can use the right image_resizer in your config (one of those defined in the image_resizer protobuf file) to resize the images to a uniform size.

For example (borrowed from elsewhere):

image_resizer {
  fixed_shape_resizer {
    height: 600
    width: 800
  }
}

This seemed to solve the problem for me.
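One thing to keep in mind with a fixed-shape resize is that it does not preserve aspect ratio, so images whose proportions differ from the target get stretched or squeezed. A rough way to see how much, assuming the 600x800 target from the config above and a hypothetical 600x1050 source image:

```python
def fixed_shape_scales(height, width, target_height=600, target_width=800):
    """Return (vertical_scale, horizontal_scale) for a fixed-shape resize."""
    return target_height / height, target_width / width


# Hypothetical source size (height, width); not taken from the question.
vertical, horizontal = fixed_shape_scales(600, 1050)
print(round(vertical, 3), round(horizontal, 3))  # 1.0 0.762
```

When the two factors are far apart, objects in the image are noticeably distorted, which may or may not matter for a given dataset.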