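TensorFlow application freezes in a Docker container

Time: 2017-07-12 22:23:04

Tags: docker tensorflow

I have a TensorFlow application that runs fine on Ubuntu 16.04, but when I try to run it in the tensorflow/tensorflow Docker image with nvidia-docker, it gets to this point and then freezes: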

2017-07-12 22:06:10.917255: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 22:06:10.917289: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-12 22:06:11.023765: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-12 22:06:11.024133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: Quadro M4000
major: 5 minor: 2 memoryClockRate (GHz) 0.7725
pciBusID 0000:00:05.0
Total memory: 7.93GiB
Free memory: 7.87GiB
2017-07-12 22:06:11.024159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-07-12 22:06:11.024168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y
2017-07-12 22:06:11.024190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro M4000, pci bus id: 0000:00:05.0)

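Since it prints no error message, I don't know where to start. Any suggestions about what I might be missing, or further troubleshooting steps?

I have confirmed that my nvidia-docker installation works correctly.

1 answer:

Answer 0: (score: 1)

It turned out the application was running all along. It only appeared to freeze because the output of a Python application running in a Docker container tends to get stuck in a buffer and never show up in the docker logs. To fix this, I passed -u to python. I can now see my application's output in the docker logs, and everything works fine.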
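If changing the command line is inconvenient, the same effect can be had from inside the program by flushing after each message. The snippet below is only a minimal sketch of that idea, not part of the original answer: the helper name log and the example message are made up for illustration. Setting the environment variable PYTHONUNBUFFERED=1 in the container is another option that behaves like -u.

```python
import sys


def log(message, stream=None):
    """Write one line and flush it right away.

    `log` is a hypothetical helper for illustration. Flushing after every
    write keeps the text from sitting in Python's buffer when stdout is a
    pipe (as it is under Docker) rather than a terminal.
    """
    stream = sys.stdout if stream is None else stream
    stream.write(message + "\n")
    stream.flush()


if __name__ == "__main__":
    # Example message; replace with whatever the application reports.
    log("training started")
```

Explicit flushing costs a little throughput for very chatty programs, so for heavy logging the -u flag or PYTHONUNBUFFERED may be simpler to manage in one place.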