Question

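I have an SGE script that runs some Python code and is submitted to the queue with qsub. The Python script contains a few print statements that report the program's progress. When I run the Python script from the command line, the print statements go to stdout. For the SGE script, I use the -o option to redirect the output to a file. However, it seems the output only reaches the file after the Python script has finished running. This is annoying because (a) I can no longer see real-time updates from the program and (b) if my job does not terminate correctly (for example, if it gets kicked off the queue), none of the updates are printed. How can I make sure the script writes to the file every time I print something, instead of lumping it all together at the end?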

Answer 1

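I think you are running into a problem with buffered output. Python uses a library to handle its output, and that library knows it is more efficient to write in blocks when it is not talking to a tty.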

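There are a couple of ways around this. You can run python with the "-u" option (see the python man page for details), for example by using something like this as the first line of your script: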

#! /usr/bin/python -u

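But this does not work if you use the "/usr/bin/env" trick (the one you reach for when you do not know where python is installed), because on Linux, at least, everything after the interpreter path is handed to env as a single argument.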

Another way is to reopen stdout with something like this:

import sys
import os

# reopen stdout file descriptor with write mode
# and 0 as the buffer size (unbuffered)
# (Python 2 only: Python 3 rejects a buffer size of 0 for text-mode files)
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

Note that the bufsize argument of os.fdopen is set to 0 to force it to be unbuffered. You can do something similar with sys.stderr.
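On Python 3, passing a buffer size of 0 for a text-mode file raises ValueError, so the same idea needs a slightly different form there. The following is a minimal sketch, not part of the original answer, that settles for line buffering instead of fully unbuffered output. The helper names `line_buffered` and `enable_line_buffering` are made up for illustration.

```python
import sys


def line_buffered(stream):
    """Return a new text stream on the same file descriptor that flushes at every newline."""
    # buffering=1 asks for line buffering in text mode.
    # closefd=False leaves the underlying descriptor open if this wrapper is closed.
    return open(stream.fileno(), "w", buffering=1, closefd=False)


def enable_line_buffering():
    """Swap sys.stdout and sys.stderr for line-buffered wrappers (hypothetical helper)."""
    sys.stdout = line_buffered(sys.stdout)
    sys.stderr = line_buffered(sys.stderr)
```

Calling `enable_line_buffering()` once near the top of the job script should make every completed line reach the redirected file as it is printed. On Python 3.7 and later, `sys.stdout.reconfigure(line_buffering=True)` is a shorter route to the same effect.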

Answer 2

As others have mentioned, stdout is not always written immediately when it is not connected to a tty, and the reason is performance.

If there is a specific point at which you want stdout to be written, you can force it at that point with:

import sys
sys.stdout.flush()
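If every progress message should be flushed, the explicit flush can be folded into the print call itself. This is a small sketch assuming Python 3, where `print` accepts `file` and `flush` keyword arguments. The function name `report` is hypothetical.

```python
import sys


def report(message, stream=None):
    """Print one progress message and flush it immediately (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout  # looked up at call time, so a replaced sys.stdout is honored
    # flush=True empties Python's buffer for this stream right after the write
    print(message, file=stream, flush=True)
```

A loop would then call `report('finished step %d' % i)` wherever it previously used a bare print.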

Answer 3

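I just ran into a similar problem with SGE, and none of the suggested methods for "unbuffering" the file I/O seemed to work for me. I had to wait until the program finished executing to see any output.

The workaround I found is to wrap sys.stdout in a custom object that re-implements the "write" method. Instead of actually writing to stdout, this new method opens the file the output is redirected to, appends the desired data, and then closes the file. It is a bit ugly, but I found that it solves the problem, because actually opening and closing the file forces the I/O to be interactive.

Here is a minimal example: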

import os, sys, time

class RedirIOStream:
  def __init__(self, stream, REDIRPATH):
    self.stream = stream
    self.path = REDIRPATH
  def write(self, data):
    # instead of actually writing, just append to file directly!
    myfile = open( self.path, 'a' )
    myfile.write(data)
    myfile.close()
  def __getattr__(self, attr):
    return getattr(self.stream, attr)


if not sys.stdout.isatty():
  # Detect redirected stdout and std error file locations!
  #  Warning: this will only work on LINUX machines
  STDOUTPATH = os.readlink('/proc/%d/fd/1' % os.getpid())
  STDERRPATH = os.readlink('/proc/%d/fd/2' % os.getpid())
  sys.stdout=RedirIOStream(sys.stdout, STDOUTPATH)
  sys.stderr=RedirIOStream(sys.stderr, STDERRPATH)


# Simple program to print msg every 3 seconds
N_MSG = 10  # number of messages to print

def main():
  tstart = time.time()
  for x in range( N_MSG ):
    time.sleep( 3 )
    MSG = '  %d/%d after %.0f sec' % (x, N_MSG,  time.time()-tstart )
    print(MSG)

if __name__ == '__main__':
  main()

Answer 4

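It is SGE that buffers the output of your process. This happens whether it is a python process or any other.

In general, you can reduce or disable the buffering in SGE by changing its source and recompiling. But that is not a great thing to do: all the data would then be written to disk slowly, which hurts overall performance.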

Answer 5

Why not print to a file instead of stdout?

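outFileID = open('output.log', 'w')
# Python 3 print(); flush=True pushes each line out of Python's buffer right away
print('INFO: still working!', file=outFileID, flush=True)
print('WARNING: blah blah!', file=outFileID, flush=True)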

and then use

tail -f output.log

Answer 6

This works for me:

{.section collection}
   {.repeated section collections}
    // rest of JSON logic in here
   {.end}
{.end}

And the problem has to do with NFS, which does not sync data back to the master until the file is closed or fsync is called.
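One way to act on the NFS explanation is to wrap the stream so that every write is flushed and then followed by os.fsync. This is a minimal sketch under that assumption and is not necessarily the code this answer used. The class name `FsyncStream` is made up for illustration.

```python
import os


class FsyncStream:
    """Wrap a stream so each write is flushed and fsync'ed (hypothetical helper)."""

    def __init__(self, stream):
        self.stream = stream

    def write(self, data):
        written = self.stream.write(data)
        self.stream.flush()                 # empty Python's own buffer
        try:
            os.fsync(self.stream.fileno())  # ask the OS to commit the data to storage
        except (OSError, ValueError, AttributeError):
            pass                            # ttys, pipes and in-memory streams cannot be fsync'ed
        return written

    def __getattr__(self, attr):
        # Delegate everything else (flush, isatty, ...) to the wrapped stream.
        return getattr(self.stream, attr)
```

It would be installed with `sys.stdout = FsyncStream(sys.stdout)`, and likewise for sys.stderr. Calling fsync on every write is expensive, so this suits occasional progress messages rather than bulk output.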

Answer 7

I ran into the same problem today and solved it by writing to disk instead of printing:

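status_report = 'still working\n'  # example message (placeholder)
# the file is flushed and closed when the with block exits
with open('log-file.txt','w') as out:
  out.write(status_report)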
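Opening the file with 'w' truncates it, so repeating the snippet above keeps only the latest report. If a running history is wanted, a variant that appends might look like the sketch below. The helper name `append_status` and the timestamp format are assumptions added for illustration.

```python
import time


def append_status(path, message):
    """Append one timestamped line to path (hypothetical helper)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    # Mode 'a' keeps earlier lines; leaving the with block flushes and closes the file.
    with open(path, "a") as out:
        out.write("%s %s\n" % (stamp, message))
```

Each call reopens and closes the file, so every report is handed to the operating system immediately, at the cost of one open/close per message.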

SGE script: print to file during execution (not just at the end)?