cudnn句柄未创建,解决方法很明确,如何实现

时间:2019-03-29 07:25:33

标签: tensorflow deep-learning ros cudnn

您好,我正在使用Ubuntu 16.04,ROS动力学,张量流1.13.1。 我的目标是将一台ensenso n35相机及其rosdriver组合到为ROS创建的mask rcnn节点。我已更改了遮罩rcnn节点的原始代码,以使其采用灰度输入并将其堆叠到自身上。我实际上已经通过使用 ensenso摄像机的虚拟版本来验证此功能是否正常。SDK包含一个可对此进行设置的应用程序。它输出白色图像,但是,这对于测试功能不是问题。当我将实际摄像机连接到系统时,问题仍然存在。这给出了以下错误:

2019-03-28 13:30:43.113919: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-03-28 13:30:43.872243: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-03-28 13:30:43.874466: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
None
None
Traceback (most recent call last):
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 182, in <module>
    main()
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 179, in main
    node.run()
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 104, in run
    results = self._model.detect([np_image], verbose=0)
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/model.py", line 2340, in detect
    self.keras_model.predict([molded_images, image_metas], verbose=0)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1790, in predict
    verbose=verbose, steps=steps)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1299, in _predict_loop
    batch_outs = f(ins_batch)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2357, in __call__
    **self.session_kwargs)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1156, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_run
    run_metadata)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1354, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv1/convolution (defined at /home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:3195) ]]
     [[node ROI/strided_slice_20 (defined at /home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/utils.py:687) ]]

Caused by op u'conv1/convolution', defined at:
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 182, in <module>
    main()
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 178, in main
    node = MaskRCNNNode()
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/nodes/mask_rcnn_node", line 65, in __init__
    config=config)
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/model.py", line 1735, in __init__
    self.keras_model = self.build(mode=mode, config=config)
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/model.py", line 1791, in build
    _, C2, C3, C4, C5 = resnet_graph(input_image, "resnet101", stage5=True)
  File "/home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/model.py", line 152, in resnet_graph
    x = KL.Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=True)(x)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/layers/convolutional.py", line 164, in call
    dilation_rate=self.dilation_rate)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 3195, in conv2d
    data_format=tf_data_format)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 851, in convolution
    return op(input, filter)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 966, in __call__
    return self.conv_op(inp, filter)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 208, in __call__
    name=self.name)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1026, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/riwo-rack-pc/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node conv1/convolution (defined at /home/riwo-rack-pc/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:3195) ]]
     [[node ROI/strided_slice_20 (defined at /home/riwo-rack-pc/ROS_Mask_rcnn/src/mask_rcnn_ros/src/mask_rcnn_ros/utils.py:687) ]]

在我的一生中,我无法弄清楚这在哪里出错以及为什么。我确保虚拟摄像机输出的数据与实际的相同,但是仅在使用实际摄像机时才会发生错误。

到目前为止,我发现应该在代码中的以下位置添加以下语句,但我无法想到或找到适当的位置:

config_pb2.GPUOptions(allow_growth=True)

我们将不胜感激!另外,如果有人认为最好在其他地方问这个问题,我会把它移到那里。

1 个答案:

答案 0 :(得分:0)

我已经看到您正在使用Mask = Rcnn文档中要求的python = 2.7。

  

python_requires ='> = 3.4',

您应该考虑的其他事项。

如果您要使用gpu,请大声使用tensorflow-gpu。

  

$ pip install tensorflow-gpu