Tensorflow对象检测API培训错误“ TypeError:'Mul'Op的输入'y'具有类型float32

时间:2018-08-21 19:46:54

标签: python tensorflow object-detection-api nvidia-jetson

EDIT2

好,到目前为止,我已经尝试使用python3.5 -tf 1.10和python 2.7 tf 1.10

我仍然遇到此错误

Traceback (most recent call last):
  File "object_detection/model_main.py", line 101, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "object_detection/model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 455, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 594, in run
    return self.run_local()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 695, in run_local
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1179, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1209, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1167, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/nvidia/tensorflow/models/research/object_detection/model_lib.py", line 287, in model_fn
    prediction_dict, features[fields.InputDataFields.true_image_shape])
  File "/home/nvidia/tensorflow/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 686, in loss
    keypoints, weights)
  File "/home/nvidia/tensorflow/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 859, in _assign_targets
    groundtruth_weights_list)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/target_assigner.py", line 481, in batch_assign_targets
    anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/target_assigner.py", line 180, in assign
    match = self._matcher.match(match_quality_matrix, **params)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/matcher.py", line 239, in match
    return Match(self._match(similarity_matrix, **params),
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 190, in _match
    _match_when_rows_are_non_empty, _match_when_rows_are_empty)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2074, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1920, in BuildCondBranch
    original_result = fn()
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 153, in _match_when_rows_are_non_empty
    -1)
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 203, in _set_values_using_indicator
    indicator = tf.cast(1-indicator, x.dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 878, in r_binary_op_wrapper
    x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1028, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1124, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 228, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 207, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 442, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 353, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected bool, got 1 of type 'int' instead.

有人尝试过在TX2上进行培训吗?还是仅针对我的情况,我做错了什么?

原始

尝试在Jetson TX2上的mobilenet ssd上进行训练(我知道这不是为了保留,但我没有更好的选择) 遵循了这些指南

https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md

培训可以在笔记本电脑(CPU)上正常运行,但是我在TX2上收到以下错误消息

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 510, in _apply_op_helper
    preferred_dtype=default_dtype)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1040, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 883, in _TensorTensorConversionFunction
    (dtype.name, t.dtype.name, str(t)))
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("Loss/Loss/huber_loss/Sub_1:0", shape=(24, 1917, 4), dtype=float32)'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "object_detection/model_main.py", line 101, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "object_detection/model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 425, in train_and_evaluate
    executor.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 504, in run
    self.run_local()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 636, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 355, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 824, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 805, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/nvidia/tensorflow/models/research/object_detection/model_lib.py", line 287, in model_fn
    prediction_dict, features[fields.InputDataFields.true_image_shape])
  File "/home/nvidia/tensorflow/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 708, in loss
    weights=batch_reg_weights)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/losses.py", line 74, in __call__
    return self._compute_loss(prediction_tensor, target_tensor, **params)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/losses.py", line 157, in _compute_loss
    reduction=tf.losses.Reduction.NONE
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/losses_impl.py", line 444, in huber_loss
    math_ops.multiply(delta, linear))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 326, in multiply
    return gen_math_ops.mul(x, y, name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 4689, in mul
    "Mul", x=x, y=y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 546, in _apply_op_helper
    inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.

注意: 使用预编译的轮子安装Tensorflow protobuf编译器发生错误,已解决 删除这条线 保留6; (第104行) 在object_detection / protos文件夹中的ssd.proto中 我在这里找到了解决方案,但找不到链接

这是开始训练的脚本

PIPELINE_CONFIG_PATH=/home/nvidia/testtraining/models/model/ssd_mobilenet_v1_pets.config
MODEL_DIR=/home/nvidia/testtraining/models/model/
NUM_TRAIN_STEPS=50000
NUM_EVAL_STEPS=2000
python3 object_detection/model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --num_train_steps=${NUM_TRAIN_STEPS} \
    --num_eval_steps=${NUM_EVAL_STEPS} \
    --alsologtostderr

笔记本电脑TF版本1.10.0

Jetson TX2 tf版本1.6.0-rc1

我是Ubuntu和Tensorflow的新手,所以对我轻松一点:)

谢谢

编辑: 似乎_apply_op_helper中的546行是某种错误处理行。 我尝试通过以下编辑修复此错误。添加了这些。在/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py中将它们添加到第236行中,在define语句之后

import tensorflow as tf
y = tf.cast(y, x.dtype)   

这创建了其他错误消息,可以通过将/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py第203-204行编辑为这些错误消息

indicator = tf.cast(1-indicator, x.dtype)
return tf.add(tf.multiply(x, indicator), val * indicator)

但是我仍然遇到错误

Traceback (most recent call last):
  File "object_detection/model_main.py", line 101, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "object_detection/model_main.py", line 97, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 425, in train_and_evaluate
    executor.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 504, in run
    self.run_local()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/training.py", line 636, in run_local
    hooks=train_hooks)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 355, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 824, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 805, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/nvidia/tensorflow/models/research/object_detection/model_lib.py", line 287, in model_fn
    prediction_dict, features[fields.InputDataFields.true_image_shape])
  File "/home/nvidia/tensorflow/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 686, in loss
    keypoints, weights)
  File "/home/nvidia/tensorflow/models/research/object_detection/meta_architectures/ssd_meta_arch.py", line 859, in _assign_targets
    groundtruth_weights_list)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/target_assigner.py", line 481, in batch_assign_targets
    anchors, gt_boxes, gt_class_targets, unmatched_class_label, gt_weights)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/target_assigner.py", line 180, in assign
    match = self._matcher.match(match_quality_matrix, **params)
  File "/home/nvidia/tensorflow/models/research/object_detection/core/matcher.py", line 239, in match
    return Match(self._match(similarity_matrix, **params),
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 190, in _match
    _match_when_rows_are_non_empty, _match_when_rows_are_empty)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2047, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1897, in BuildCondBranch
    original_result = fn()
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 153, in _match_when_rows_are_non_empty
    -1)
  File "/home/nvidia/tensorflow/models/research/object_detection/matchers/argmax_matcher.py", line 203, in _set_values_using_indicator
    indicator = tf.cast(1-indicator, x.dtype)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 983, in r_binary_op_wrapper
    x = ops.convert_to_tensor(x, dtype=y.dtype.base_dtype, name="x")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 950, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1040, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 235, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 433, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 344, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected bool, got 1 of type 'int' instead.

这是我的胆识

我认为存在巨大的兼容性问题,而我只会安装tf 1.1 我很乐意接受新想法

1 个答案:

答案 0 :(得分:0)

您的问题的关键在这里(追溯的底部):

   TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.

y的类型为float32,但是参数-x(将y传递到的位置)必须为int32。

尝试使用tf.cast(y,tf.int32)或类似的东西。

有时tf会有所更改/您使用的是旧型号。因此,这可能会不时发生。

所以打开  文件“ /usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py”

找到第546行,然后进行强制转换。 (我想使用sudo vim)