Question

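I have a Python script that uses subprocess.Popen to call another Python script. I know the called code always returns 10, which means it failed.

My problem is that the caller reads 10 only about 75% of the time. The other 25% of the time it reads 0 and mistakes the called program's failure code for success. Same command, same environment, and the failures appear to be random.

Environment: Python 2.7.10, Linux Red Hat 6.4. The code presented here is a (very) simplified version, but I can still reproduce the problem with it.

Here is the called script, constant_return.py: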

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-

"""
Simplified called code
"""
import sys

if __name__ == "__main__":
    sys.exit(10)

Here is the caller code:

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-

"""
Simplified version of the calling code
"""

try:
    import sys
    import subprocess
    import threading

except Exception, eImp:
    print "Error while loading Python library : %s" % eImp
    sys.exit(100)


class BizarreProcessing(object):
    """
    Simplified caller class
    """

    def __init__(self):
        """
        Classic initialization
        """
        object.__init__(self)


    def logPipe(self, isStdOut_, process_):
        """
        Simplified log handler
        """
        try:
            if isStdOut_:
                output = process_.stdout
                logfile = open("./log_out.txt", "wb")
            else:
                output = process_.stderr
                logfile = open("./log_err.txt", "wb")

            #Read pipe content as long as the process is running
            while (process_.poll() == None):
                text = output.readline()
                if (text != '' and text.strip() != ''):
                    logfile.write(text)

            #When the process is finished, there might still be lines remaining in the pipe
            output.readlines()
            for oneline in output.readlines():
                if (oneline != None and oneline.strip() != ''):
                    logfile.write(text)
        finally:
            logfile.close()


    def startProcessing(self):
        """
        Launch process
        """

        # Simplified command line definition
        command = "/absolute/path/to/file/constant_return.py"

        # Execute command in a new process
        process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

        #Launch a thread to gather called programm stdout and stderr
        #This to avoid a deadlock with pipe filled and such
        stdoutTread = threading.Thread(target=self.logPipe, args=(True, process))
        stdoutTread.start()
        stderrThread = threading.Thread(target=self.logPipe, args=(False, process))
        stderrThread.start()

        #Wait for the end of the process and get process result
        stdoutTread.join()
        stderrThread.join()
        result = process.wait()

        print("returned code: " + str(result))

        #Send it back to the caller
        return (result)


#
# Main
#
if __name__ == "__main__":

    # Execute caller code
    processingInstance = BizarreProcessing()
    aResult = processingInstance.startProcessing()

    #Return the code
    sys.exit(aResult)

Here is what I type in bash to run the caller script:

for res in {1..100}
do
    /path/to/caller/script.py
    echo $? >> /tmp/returncodelist.txt
done

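It seems to be connected somehow to the way I read the called program's output: when I create the subprocess with process = subprocess.Popen(command, shell=True, stdout=sys.stdout, stderr=sys.stderr) and remove all the Thread stuff, it reads the correct return code (but it no longer logs the way I want...).

Any idea what I am doing wrong?

Thanks a lot for your help.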

Answer 1

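logPipe is also checking whether the process is alive to decide whether there is more data to read. This is not correct: you should check whether the pipe has reached EOF, either by looking for a zero-length read or by using output.readlines(). The I/O pipes may outlive the process.

This simplifies logPipe significantly. Change logPipe as follows: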

  def logPipe(self, isStdOut_, process_):
      """
      Simplified log handler
      """
      try:
          if isStdOut_:
              output = process_.stdout
              logfile = open("./log_out.txt", "wb")
          else:
              output = process_.stderr
              logfile = open("./log_err.txt", "wb")

          #Read pipe content until EOF, regardless of whether the process is still running
          with output:
              for text in output:
                  if text.strip(): # ... checks if it's not an empty string
                      logfile.write(text)

      finally:
          logfile.close()

Second, don't join your logging threads until after process.wait(), for the same reason: the I/O pipes may outlive the process.
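Putting both points together, the caller could look roughly like the sketch below. It is an illustration under assumptions rather than the asker's actual code: drain_pipe, run_and_log and CHILD_COMMAND are hypothetical names, and CHILD_COMMAND is a stand-in for constant_return.py that prints one line on each stream and exits with 10. The reader threads stop only on a zero-length read (EOF), and the main thread is the only one that waits on the process.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for constant_return.py: writes to both streams, exits with 10.
CHILD_COMMAND = [
    sys.executable,
    "-c",
    "import sys; print('hello'); sys.stderr.write('oops\\n'); sys.exit(10)",
]


def drain_pipe(pipe, log_path):
    """Copy every non-blank line from pipe to log_path until EOF."""
    with pipe, open(log_path, "wb") as logfile:
        # readline() returns an empty bytes object only at EOF, so this loop
        # never needs to ask whether the process is still alive.
        for line in iter(pipe.readline, b""):
            if line.strip():
                logfile.write(line)


def run_and_log(command, out_path, err_path):
    """Run command, log stdout and stderr to files, return the exit code."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    readers = [
        threading.Thread(target=drain_pipe, args=(process.stdout, out_path)),
        threading.Thread(target=drain_pipe, args=(process.stderr, err_path)),
    ]
    for reader in readers:
        reader.start()

    # Only this thread ever calls wait() or poll() on the process.
    result = process.wait()

    # Join afterwards: the pipes may still hold data after the child exits.
    for reader in readers:
        reader.join()
    return result
```

Calling run_and_log(CHILD_COMMAND, "./log_out.txt", "./log_err.txt") should then return the child's exit code while the two log files receive its output.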

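What I think happens under the covers is that a SIGPIPE is being emitted and mishandled somewhere, perhaps being misconstrued as the process termination condition. This would be because the pipe is being closed on one end or the other without being flushed. SIGPIPE can sometimes be a nuisance in larger applications; it may be that the Python library swallows it or does something naive with it.

Edit: as @Blackjack points out, Python ignores SIGPIPE automatically, so that rules out SIGPIPE malfeasance. A second theory, though: the documentation for Popen.poll() states:

Check if child process has terminated. Set and return returncode attribute.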

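If you strace this (e.g., strace -f -o strace.log ./caller.py), it appears to be done via wait4(WNOHANG). You have two threads waiting with WNOHANG and one waiting normally, but only one of those calls will return the process exit code correctly. If there is no locking in the implementation of Popen.poll(), then there is quite likely a race to assign process.returncode, or a potential failure to assign it correctly. Limiting your Popen.wait()/poll() calls to a single thread should be a good way to avoid this. See man waitpid.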
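The underlying constraint can be seen without Popen.poll() at all. The following is a minimal sketch, assuming a POSIX system; the child command is a hypothetical stand-in for constant_return.py. It calls os.waitpid() twice on the same child: the first call collects the exit status, and the second has nothing left to collect, which is the situation a losing thread would be in under the race described above.

```python
import errno
import os
import subprocess
import sys

# Hypothetical stand-in for constant_return.py: a child that exits with 10.
child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(10)"])

# First waiter: blocks until the child exits and collects its status.
_, status = os.waitpid(child.pid, 0)
first_code = os.WEXITSTATUS(status)

# Second waiter: the exit status has already been collected, so the
# kernel reports that there is no such child to wait for (ECHILD).
try:
    os.waitpid(child.pid, 0)
    second_error = None
except OSError as exc:
    second_error = exc.errno

# Record the code on the Popen object so it does not try to reap the
# child again when it is garbage collected.
child.returncode = first_code

print("first waiter saw exit code: %d" % first_code)
print("second waiter got ECHILD: %s" % (second_error == errno.ECHILD))
```

An exit status can be handed out only once, so whichever caller loses the race has to guess; that is consistent with a 0 showing up some of the time.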
Edit: as an aside, if you can hold all your stdout/stderr data in memory, Popen.communicate() is far easier to use and does not require logPipe or the background threads at all.

https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
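As a rough illustration of that last suggestion, here is a minimal sketch using communicate(). The names run_with_communicate and CHILD_COMMAND are hypothetical, and the child is again a stand-in for constant_return.py that prints one line and exits with 10. It assumes the complete output fits comfortably in memory.

```python
import subprocess
import sys

# Hypothetical stand-in for constant_return.py: prints a line, exits with 10.
CHILD_COMMAND = [
    sys.executable,
    "-c",
    "import sys; print('some output'); sys.exit(10)",
]


def run_with_communicate(command):
    """Run command and return (exit code, stdout bytes, stderr bytes)."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    # communicate() reads both pipes to EOF and then waits for the child,
    # so no reader threads and no extra wait()/poll() calls are needed.
    out, err = process.communicate()
    return process.returncode, out, err


code, out, err = run_with_communicate(CHILD_COMMAND)
print("returned code: %d" % code)
```

The captured out and err can then be written to log_out.txt and log_err.txt in one go, after the process has finished.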

Python subprocess: read returncode is sometimes different from returned code
