I am trying to train SSD Lite + MobileNetV2 using model_main.py from tensorflow/models/research/object_detection, but I get the following error:
Assign requires shapes of both tensors to match. lhs shape= [1,1,256,256] rhs shape= [1,1,1280,256] [[Node: save/Assign_348 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights, save/RestoreV2:348)]]
The full log is below:
python E:\Documents\Projects\tensorflow\models\research\object_detection\model_main.py --alsologtostderr --pipeline_config_path=experiments/training_/ssdlite_mobilenet_v2_coco.config --model_dir=experiments/training_/ --num_train_steps=50000 --NUM_EVAL_STEPS=2000
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W1123 09:11:18.686478 7432 tf_logging.py:125] Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
W1123 09:11:18.687448 7432 tf_logging.py:125] Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x000001E641D31268>) includes params argument, but params are not passed to Estimator.
W1123 09:11:18.688472 7432 tf_logging.py:125] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x000001E641D31268>) includes params argument, but params are not passed to Estimator.
2018-11-23 09:11:23.084879: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-11-23 09:11:23.365711: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1392] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.835
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-11-23 09:11:23.372771: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1471] Adding visible gpu devices: 0
2018-11-23 09:11:24.001841: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-23 09:11:24.004967: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:958] 0
2018-11-23 09:11:24.007058: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: N
2018-11-23 09:11:24.009175: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4741 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
return fn(*args)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,256] rhs shape= [1,1,1280,256]
[[Node: save/Assign_348 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights, save/RestoreV2:348)]]
[[Node: save/RestoreV2/_599 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_728_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\Documents\Projects\tensorflow\models\research\object_detection\model_main.py", line 109, in <module>
tf.app.run()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "E:\Documents\Projects\tensorflow\models\research\object_detection\model_main.py", line 105, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 447, in train_and_evaluate
return executor.run()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 531, in run
return self.run_local()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 681, in run_local
eval_result, export_results = evaluator.evaluate_and_export()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 886, in evaluate_and_export
hooks=self._eval_spec.hooks)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\estimator.py", line 460, in evaluate
output_dir=self.eval_dir(name))
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1386, in _evaluate_run
config=self._session_config)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\evaluation.py", line 209, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 826, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 549, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1012, in __init__
_WrappedSession.__init__(self, self._create_session())
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1017, in _create_session
return self._sess_creator.create_session()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 706, in create_session
self.tf_sess = self._session_creator.create_session()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 477, in create_session
init_fn=self._scaffold.init_fn)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\session_manager.py", line 281, in prepare_session
config=config)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\session_manager.py", line 195, in _restore_checkpoint
saver.restore(sess, checkpoint_filename_with_path)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 1752, in restore
{self.saver_def.filename_tensor_name: save_path})
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
run_metadata_ptr)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
run_metadata)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,256] rhs shape= [1,1,1280,256]
[[Node: save/Assign_348 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights, save/RestoreV2:348)]]
[[Node: save/RestoreV2/_599 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_728_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
Caused by op 'save/Assign_348', defined at:
File "E:\Documents\Projects\tensorflow\models\research\object_detection\model_main.py", line 109, in <module>
tf.app.run()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "E:\Documents\Projects\tensorflow\models\research\object_detection\model_main.py", line 105, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 447, in train_and_evaluate
return executor.run()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 531, in run
return self.run_local()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 681, in run_local
eval_result, export_results = evaluator.evaluate_and_export()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\training.py", line 886, in evaluate_and_export
hooks=self._eval_spec.hooks)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\estimator.py", line 460, in evaluate
output_dir=self.eval_dir(name))
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\estimator\estimator.py", line 1386, in _evaluate_run
config=self._session_config)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\evaluation.py", line 209, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 826, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 549, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1012, in __init__
_WrappedSession.__init__(self, self._create_session())
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1017, in _create_session
return self._sess_creator.create_session()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 706, in create_session
self.tf_sess = self._session_creator.create_session()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 468, in create_session
self._scaffold.finalize()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\monitored_session.py", line 212, in finalize
self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 856, in _get_saver_or_default
saver = Saver(sharded=True, allow_empty=True)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 1284, in __init__
self.build()
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 1296, in build
self._build(self._filename, build_save=True, build_restore=True)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 1333, in _build
build_save=build_save, build_restore=build_restore)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 775, in _build_internal
restore_sequentially, reshape)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 453, in _AddShardedRestoreOps
name="restore_shard"))
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 422, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\training\saver.py", line 113, in restore
self.op.get_shape().is_fully_defined())
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\ops\state_ops.py", line 219, in assign
validate_shape=validate_shape)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 63, in assign
use_locking=use_locking, name=name)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\framework\ops.py", line 3414, in create_op
op_def=op_def)
File "D:\ProgramData\Anaconda3\envs\tfod\lib\site-packages\tensorflow\python\framework\ops.py", line 1740, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,256,256] rhs shape= [1,1,1280,256]
[[Node: save/Assign_348 = Assign[T=DT_FLOAT, _class=["loc:@FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights, save/RestoreV2:348)]]
[[Node: save/RestoreV2/_599 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_728_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
My config can be found here.
Also note that I originally posted this on Tensorflow/models/issues, but the team suggested that I post it here instead.
Answer 0 (score: 0)
After I upgraded to TF 1.12 and cuDNN 7.3 for CUDA v9, the error went away.
Answer 1 (score: 0)
Your error is caused by the image dimensions. You should make them equal to the dimensions used for training.
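To see which tensors actually disagree, it can help to compare the variable shapes the graph expects against those stored in the checkpoint. The following is only a minimal sketch in plain Python: the helper name `find_shape_mismatches` is hypothetical, and the two dictionaries are filled by hand with the shapes reported in the error message (lhs taken as the graph variable, rhs as the restored checkpoint value). In a real session the checkpoint side could presumably be built from `tf.train.list_variables(checkpoint_path)`, which returns `(name, shape)` pairs.

```python
from typing import Dict, List, Sequence, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    graph_shapes: Dict[str, Sequence[int]],
    checkpoint_shapes: Dict[str, Sequence[int]],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, graph_shape, checkpoint_shape) for variables whose shapes differ.

    Only variables present on both sides are compared.
    """
    mismatches = []
    for name in sorted(graph_shapes.keys() & checkpoint_shapes.keys()):
        graph_shape = tuple(graph_shapes[name])
        ckpt_shape = tuple(checkpoint_shapes[name])
        if graph_shape != ckpt_shape:
            mismatches.append((name, graph_shape, ckpt_shape))
    return mismatches


# Shapes copied by hand from the error message above (illustrative input only).
VAR_NAME = "FeatureExtractor/MobilenetV2/layer_19_1_Conv2d_2_1x1_256/weights"
graph_shapes = {VAR_NAME: (1, 1, 256, 256)}        # lhs shape in the error
checkpoint_shapes = {VAR_NAME: (1, 1, 1280, 256)}  # rhs shape in the error

for name, graph_shape, ckpt_shape in find_shape_mismatches(graph_shapes, checkpoint_shapes):
    print(f"{name}: graph expects {graph_shape}, checkpoint has {ckpt_shape}")
```

Every name this prints is a variable that the restore step would reject with the same "Assign requires shapes of both tensors to match" error.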
Answer 2 (score: 0)
Here is the command. Customize it according to the parameters you used in training.
python deeplab/export_model.py --checkpoint_path=/code/models/research/deeplab/weights_input_level_17/model.ckpt-22000 --export_path=/code/models/research/deeplab/frozen_weights_level_17/frozen_inference_graph.pb --model_variant="xception_65" --atrous_rates=6 --atrous_rates=12 --atrous_rates=18 --output_stride=16 --crop_size=2048 --crop_size=2048 --num_classes=3
I trained my model to segment bridges and followed the PASCAL dataset format. Ideally I would have only one class, but because there are two default classes, my total number of classes becomes 3: 1. background, 2. the ignore class, and 3. bridge.
Hey guys, please use this configuration to export your deeplabv3plus model. It worked for me.