Error when running TensorBox train.py

Asked: 2017-03-30 15:00:15

Tags: python linux machine-learning tensorflow

I have set up TensorBox on a 64-bit GPU machine running Ubuntu 14.04. TensorFlow is installed on the machine and fully functional.

When I run

python train.py --hypes hypes/lstm_rezoom.json --gpu 0 --logdir output

in the TensorBox directory, which should start retraining the network on the specified data directory, I get the following error:

Traceback (most recent call last):
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 85, in __call__
    ret = func(*args)
  File "train.py", line 390, in log_image
    rnn_len=H['rnn_len'])[0]
  File "~/tensorbox/utils/train_utils.py", line 127, in add_rectangles
    from stitch_wrapper import stitch_rects
ImportError: ~/tensorbox/utils/stitch_wrapper.so: undefined symbol: PyUnicodeUCS2_DecodeUTF8
W tensorflow/core/framework/op_kernel.cc:975] Internal: Failed to run py callback pyfunc_1: see error log.
W tensorflow/core/framework/op_kernel.cc:975] Internal: Failed to run py callback pyfunc_1: see error log.
         [[Node: PyFunc_1 = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING], Tout=[DT_FLOAT], token="pyfunc_1", _device="/job:localhost/replica:0/task:0/cpu:0"](fifo_queue_1_DequeueMany, strided_slice_12, strided_slice_13, Variable/read, PyFunc_1/input_4)]]
W tensorflow/core/framework/op_kernel.cc:975] Internal: Failed to run py callback pyfunc_1: see error log.
         [[Node: PyFunc_1 = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING], Tout=[DT_FLOAT], token="pyfunc_1", _device="/job:localhost/replica:0/task:0/cpu:0"](fifo_queue_1_DequeueMany, strided_slice_12, strided_slice_13, Variable/read, PyFunc_1/input_4)]]
Traceback (most recent call last):
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 85, in __call__
    ret = func(*args)
  File "train.py", line 390, in log_image
    rnn_len=H['rnn_len'])[0]
  File "~/tensorbox/utils/train_utils.py", line 127, in add_rectangles
    from stitch_wrapper import stitch_rects
ImportError: ~/tensorbox/utils/stitch_wrapper.so: undefined symbol: PyUnicodeUCS2_DecodeUTF8
W tensorflow/core/framework/op_kernel.cc:975] Internal: Failed to run py callback pyfunc_0: see error log.
W tensorflow/core/kernels/queue_base.cc:294] _0_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
W tensorflow/core/kernels/queue_base.cc:294] _1_fifo_queue_1: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
  File "train.py", line 549, in <module>
    main()
  File "train.py", line 546, in main
    train(H, test_images=[])
  File "train.py", line 504, in train
    ], feed_dict=lr_feed)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Failed to run py callback pyfunc_1: see error log.
         [[Node: PyFunc_1 = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING], Tout=[DT_FLOAT], token="pyfunc_1", _device="/job:localhost/replica:0/task:0/cpu:0"](fifo_queue_1_DequeueMany, strided_slice_12, strided_slice_13, Variable/read, PyFunc_1/input_4)]]

Caused by op u'PyFunc_1', defined at:
  File "train.py", line 549, in <module>
    main()
  File "train.py", line 546, in main
    train(H, test_images=[])
  File "train.py", line 448, in train
    smooth_op, global_step, learning_rate) = build(H, q)
  File "train.py", line 402, in build
    [tf.float32])
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 192, in py_func
    input=inp, token=token, Tout=Tout, name=name)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_script_ops.py", line 40, in _py_func
    name=name)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "~/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InternalError (see above for traceback): Failed to run py callback pyfunc_1: see error log.
         [[Node: PyFunc_1 = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_STRING], Tout=[DT_FLOAT], token="pyfunc_1", _device="/job:localhost/replica:0/task:0/cpu:0"](fifo_queue_1_DequeueMany, strided_slice_12, strided_slice_13, Variable/read, PyFunc_1/input_4)]]

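I can't figure out what the problem is. I have checked that the files it accesses (e.g. the .so) are actually in the correct directory path, and I have tried some of the modifications to train.py suggested on other help pages (e.g. changing state_is_tuple to False on lines 39 and 41), but none of them seems to address the cause.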
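
One further check that may be worth running, given the undefined symbol PyUnicodeUCS2_DecodeUTF8 in the ImportError: in CPython 2 the Unicode C API names carry a UCS2 or UCS4 infix depending on whether the interpreter is a narrow or wide Unicode build, so an extension compiled against one kind of build cannot be loaded by the other. The sketch below compares the running interpreter's build with the one the missing symbol implies. The helper names unicode_build and symbol_build are hypothetical, made up for this illustration, and a mismatch is only a possible cause, not a confirmed diagnosis.

```python
import sys


def unicode_build(maxunicode=None):
    """Return 'UCS2' for a narrow Unicode build, 'UCS4' for a wide one.

    Hypothetical helper: a narrow CPython 2 build reports
    sys.maxunicode == 0xFFFF; wide builds (and Python 3.3+) report 0x10FFFF.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "UCS2" if maxunicode == 0xFFFF else "UCS4"


def symbol_build(symbol):
    """Guess the build an extension expects from a CPython 2 symbol name.

    Hypothetical helper: returns None when the name carries no UCS infix.
    """
    if "UCS2" in symbol:
        return "UCS2"
    if "UCS4" in symbol:
        return "UCS4"
    return None


# Symbol name copied from the ImportError in the traceback above.
missing = "PyUnicodeUCS2_DecodeUTF8"

print("interpreter build: %s" % unicode_build())
print("extension expects: %s" % symbol_build(missing))
```

Run this with the same python executable that launches train.py. If the two lines disagree, stitch_wrapper.so was probably compiled against a different Python than the one loading it, and rebuilding the extension with the interpreter that runs train.py may resolve the import.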

0 answers:

No answers yet.