使用Nvidia GTX650 Ti(2GB)运行Tensorflow对象检测?

时间:2017-09-13 20:11:11

标签: object-detection tensorflow-gpu low-memory

有没有办法让用2GB显卡运行对象检测? 主板上有24GB DD3 Ram,我不能用GPU吗?

我确实尝试在 trainer.py 中添加 session_config.gpu_options.allow_growth = True ,但这没有帮助。 它似乎图形卡没有足够的内存。

CardInfos:

0, name: GeForce GTX 650, pci bus id: 0000:01:00.0)
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4876955943962853047
, name: "/gpu:0"
device_type: "GPU"
memory_limit: 1375862784
locality {
  bus_id: 1
}
incarnation: 4236842880144430162
physical_device_desc: "device: 0, name: GeForce GTX 650, pci bus id: 0000:01:00.0"
]

train.py输出:

Limit:                   219414528
InUse:                   192361216
MaxInUse:                192483072
NumAllocs:                    6030
MaxAllocSize:              6131712

2017-09-13 13:47:13.429510: W tensorflow/core/common_runtime/bfc_allocator.cc:277] ****************************************************************************************____________
2017-09-13 13:47:13.481829: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InternalError'>, Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.955327: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.956056: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/dee/Documents/projects/tensor/models/object_detection/trainer.py", line 297, in train
    saver=saver)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 755, in train
    sess, train_op, global_step, train_step_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 488, in train_step
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.

1 个答案:

答案 0 :(得分:0)

确实Dst tensor is not initialized消息表明您的GPU内存不足。您可以尝试将批量大小减小到最小,并降低您输入模型的图像的分辨率。还尝试使用SSD Mobilenet Model,因为它非常轻巧。

回答问题的第二部分: 我一直认为现代GPU将进入混合模式,其中驱动程序/ GPU通过PCIe总线从系统RAM开始流式传输资源,以弥补“缺失”#34; VRAM。由于系统RAM比GDDR5慢3-5倍且具有更高的延迟,因此VRAM的耗尽将导致显着的性能损失。不过我在配备6GB VRAM的GTX 1060上遇到了同样的问题,因为CUDA进程因GPU耗尽而崩溃。