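Could you please help me debug this issue? I have tried building with several versions, but could not resolve it.
My configuration:
Hardware: MacBook Pro 13,3 with eGPU NVIDIA 1080
Software: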
macOS 10.13.6
NVIDIA Web Driver 387.10.10.10.40.105
CUDA Driver 396.148
CUDA Toolkit 9.1
cuDNN 7.0.5
Python 2.7
NCCL 2.1.15
Xcode 9.2
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:46:00.0
totalMemory: 8.00GiB freeMemory: 3.39GiB
tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3118 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:46:00.0, compute capability: 6.1)
E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
Segmentation fault: 11
When I try to run some programs, I end up with the following error message, along with a segfault:
layoutSubviews
In other programs, I tried reducing per_process_gpu_memory_fraction and the batch size, but they still crash after the first batch with the same error code.
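For reference, here is a rough sketch of how an upper bound for per_process_gpu_memory_fraction could be estimated from the totalMemory/freeMemory figures in the log above. It assumes the fraction is interpreted relative to total device memory, which is my understanding of TensorFlow 1.x but not something I have verified on this setup. The helper name max_memory_fraction is hypothetical and only for illustration.

```python
import re

# Line copied from the TensorFlow log in this report.
LOG_LINE = "totalMemory: 8.00GiB freeMemory: 3.39GiB"


def max_memory_fraction(log_line):
    """Return freeMemory / totalMemory parsed from a TensorFlow device log line.

    Hypothetical helper: assumes both figures are reported in GiB, as in the
    log above.
    """
    match = re.search(
        r"totalMemory:\s*([\d.]+)GiB\s+freeMemory:\s*([\d.]+)GiB", log_line
    )
    if match is None:
        raise ValueError("no totalMemory/freeMemory figures found")
    total_gib = float(match.group(1))
    free_gib = float(match.group(2))
    return free_gib / total_gib


fraction = max_memory_fraction(LOG_LINE)
print("free/total = %.3f" % fraction)

# Assumption: a value somewhat below this ratio would then be passed to
# TensorFlow 1.x, e.g. tf.GPUOptions(per_process_gpu_memory_fraction=...),
# wrapped in tf.ConfigProto(gpu_options=...) for the session.
```

With the numbers above, only about 3.39 of 8.00 GiB is free, so any fraction above that ratio would presumably ask for more memory than is available.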