Problem running tensorflow-gpu

Time: 2020-02-06 23:05:39

Tags: python tensorflow

I am using:

  • tensorflow-gpu 2.0.0
  • Windows 10
  • NVIDIA GTX 1050 GPU
  • CUDA 10.0 and the corresponding cuDNN 7.6.5

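I followed the official TensorFlow documentation for GPU support and tried to build and fit a simple CNN model (in a .py file; I also tried Jupyter, but the kernel kept dying), but I got the following output: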

2020-02-06 23:57:14.420911: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-02-06 23:57:16.081396: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-02-06 23:57:16.861094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
2020-02-06 23:57:16.861492: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-02-06 23:57:16.862290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2020-02-06 23:58:14.322053: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-02-06 23:58:14.324900: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error

Does anyone know how to get tf-gpu 2.0.0 running properly? I have also tested with 2.1.0, but the problem seems to persist.

2 answers:

Answer 0 (score: 0)

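The solution has already been posted on GitHub, but I am repeating it here for the benefit of the Stack Overflow community.

The CUDA_ERROR_UNKNOWN problem was resolved by installing tensorflow_gpu 2.1.0 with the following combination: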

  • Python version: 3.7.6
  • Compiler: MSVC 2017
  • CUDA: 10.1
  • cuDNN: 7.6.5

Please refer to the tested build configurations for Windows CPU and GPU.
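As a rough illustration of comparing an installation against the combination listed above, here is a minimal sketch in plain Python. The `WORKING_COMBO` dictionary and the `mismatches` helper are hypothetical names introduced only for this example; the values come from this thread and are not a complete compatibility table.

```python
# Combination reported as working in this answer (not an exhaustive table).
WORKING_COMBO = {
    "tensorflow-gpu": "2.1.0",
    "python": "3.7.6",
    "cuda": "10.1",
    "cudnn": "7.6.5",
}


def mismatches(installed, expected=WORKING_COMBO):
    """Return {component: (installed, expected)} for every version that differs."""
    return {
        name: (installed.get(name), want)
        for name, want in expected.items()
        if installed.get(name) != want
    }


# The setup described in the question; the Python version was not stated there,
# so it is left out and shows up as None.
asker = {"tensorflow-gpu": "2.0.0", "cuda": "10.0", "cudnn": "7.6.5"}

for name, (have, want) in mismatches(asker).items():
    print(f"{name}: installed {have}, answer used {want}")
```

An exact string comparison is deliberately strict. Other patch versions may well work, so treat a reported mismatch as a hint about where to look, not as proof of incompatibility.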

Answer 1 (score: 0)

In my case, limiting GPU memory allocation worked, as described here.

Add the following to your code:

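import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all at startup.
# This must run before any GPU has been initialized.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)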
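A slightly more defensive variant of the same idea is sketched below. It is an illustration only: the `enable_memory_growth` helper and its `config` parameter are hypothetical names, and the parameter exists only so the function can be tried without TensorFlow or a GPU. It handles machines with no GPU and catches the `RuntimeError` that `set_memory_growth` raises when the runtime has already been initialized.

```python
def enable_memory_growth(config=None):
    """Enable on-demand GPU memory allocation for every visible GPU.

    Returns the number of GPUs that were configured. `config` defaults to
    tf.config.experimental; passing a stand-in object is an assumption of
    this sketch, made so the helper can be exercised without TensorFlow.
    """
    if config is None:
        import tensorflow as tf  # imported lazily, only when actually needed
        config = tf.config.experimental

    configured = 0
    for gpu in config.list_physical_devices('GPU'):
        try:
            config.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU runtime was initialized before this call.
            print(f"Could not set memory growth for {gpu}: {err}")
    return configured
```

Call `enable_memory_growth()` once at the top of the script, before building any model or creating any tensor. A return value of 0 means either that TensorFlow saw no GPU or that the setting could not be applied to any of them.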