SLURM: printing output with srun

Time: 2019-05-15 12:02:26

Tags: slurm

I am running my program with srun, but it does not print its output.

me@home:~$ srun -p K80q --gres=gpu:1 -N 1 python3 main.py 
2019-05-15 19:56:43.305156: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-15 19:56:43.543516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:85:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2019-05-15 19:56:43.543567: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-05-15 19:56:43.900189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-15 19:56:43.900248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2019-05-15 19:56:43.900257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2019-05-15 19:56:43.900619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10761 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:85:00.0, compute capability: 3.7)

I only get the output above; the information I expect is not printed. How can I fix this?

By the way, if I just define a simple test script:

import tensorflow 

if __name__ == '__main__':
    for i in range(10):
        print('Hello')

it prints Hello 10 times.

Update:

After 20 minutes, it prints some of the information I expect. How can I make the output appear immediately?

1 answer:

Answer 0: (score: 1)

Try the -u option of srun:

  

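-u, --unbuffered
    By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is set as unbuffered. See setbuf(3). If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered.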
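
If changing the srun command line is not convenient, the delay can probably also be addressed from the Python side, since Python block-buffers stdout when it is connected to a pipe rather than a terminal. The sketch below is an assumption, not part of the answer above: log_progress is a hypothetical helper name, and it simply passes flush=True to print so that each line is pushed out as soon as it is written.

```python
import sys
import time


def log_progress(step, total, stream=None):
    """Write one progress line and flush it immediately.

    log_progress is a hypothetical helper used for illustration only.
    """
    # Resolve sys.stdout at call time so redirection is respected.
    stream = stream if stream is not None else sys.stdout
    # flush=True forces the line out of Python's buffer right away,
    # even when stdout is a pipe (as it is under srun without -u).
    print(f"step {step}/{total}", file=stream, flush=True)


if __name__ == "__main__":
    for i in range(1, 4):
        log_progress(i, 3)
        time.sleep(0.1)  # stand-in for real work between messages
```

Starting the interpreter as python3 -u main.py, or setting the environment variable PYTHONUNBUFFERED=1, should have a similar effect without touching the code.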