Question

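I have been using subprocess.check_output() for some time to capture output from subprocesses, but I run into performance problems under certain circumstances. I am running this on a RHEL6 machine.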

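The calling Python environment is Linux-compiled and 64-bit. The subprocess I am executing is a shell script that eventually fires off a Windows python.exe process via Wine (why this foolishness is required is another story). As input to the shell script, I pipe in a small bit of Python code that gets passed to python.exe.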

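While the system is under moderate/heavy load (40% to 70% CPU utilization), I have noticed that using subprocess.check_output(cmd, shell=True) can result in a significant delay (up to about 45 seconds) between the subprocess finishing execution and the check_output call returning. Looking at the output of ps -efH during this time shows the called subprocess as sh <defunct>, until it finally returns with a normal zero exit status.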

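Conversely, using subprocess.call(cmd, shell=True) to run the same command under the same moderate/heavy load causes the subprocess to return immediately with no delay, with all output printed to STDOUT/STDERR (rather than returned from the function call).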

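Why is there such a significant delay only when check_output() redirects the STDOUT/STDERR output into its return value, and not when call() simply prints it back to the parent's STDOUT/STDERR?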

Answer 1

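Reading the docs, both subprocess.call and subprocess.check_output are use cases of subprocess.Popen. One minor difference is that check_output raises a Python error (CalledProcessError) if the subprocess returns a non-zero exit status. The greater difference is highlighted in this passage about check_output (my emphasis):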

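The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.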

So how is stdout "used internally"? Let us compare call and check_output:

call

def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()

check_output

def check_output(*popenargs, **kwargs):
    if 'stdout' in kwargs:
        raise ValueError('stdout argument not allowed, it will be overridden.')
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
    output, unused_err = process.communicate()
    retcode = process.poll()
    if retcode:
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        raise CalledProcessError(retcode, cmd, output=output)
    return output
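
The practical difference between the two listings can be seen in a small sketch. It assumes Python 3 and uses sys.executable as a stand-in child command; the names CHILD, run_both, stdout_is_rejected and failing_status are made up for illustration and are not part of the subprocess module.

```python
import subprocess
import sys

# Hypothetical child command: the current interpreter printing one line.
CHILD = [sys.executable, "-c", "print('hello')"]


def run_both(cmd):
    """Run cmd once with call() and once with check_output()."""
    # call() only waits for the child and returns its exit status. Without a
    # stdout argument the child would write straight to the parent's stdout;
    # DEVNULL is used here only to keep the demo quiet.
    status = subprocess.call(cmd, stdout=subprocess.DEVNULL)
    # check_output() creates a pipe, reads it to EOF, and returns the bytes.
    output = subprocess.check_output(cmd)
    return status, output


def stdout_is_rejected(cmd):
    """Return True if check_output() refuses an explicit stdout argument."""
    try:
        subprocess.check_output(cmd, stdout=subprocess.PIPE)
    except ValueError:
        return True
    return False


def failing_status():
    """Return the exit status reported when the child exits with code 3."""
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    try:
        subprocess.check_output(failing)
    except subprocess.CalledProcessError as exc:
        return exc.returncode
    return 0
```

Here call() hands back only an integer status, while check_output() hands back the captured bytes and turns a non-zero status into an exception.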

communicate

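Now we have to look at Popen.communicate as well. Doing so, we notice that for a single pipe, communicate does several things that simply take more time than returning Popen().wait(), as call does.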

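For one thing, communicate processes stdout=PIPE whether or not you set shell=True. Clearly, call does not. It just lets your shell spout whatever it produces, which makes it a security risk, as the Python documentation describes.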

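Secondly, in the case of check_output(cmd, shell=True) (just one pipe), whatever your subprocess sends to stdout is processed by a thread in the _communicate method. And Popen must join the thread (wait on it) before additionally waiting on the subprocess itself to terminate!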

Plus, more trivially, it processes stdout as a list, which must then be joined into a string.
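
The ordering described in the points above (drain the pipe, join the pieces, only then wait on the child) can be modelled with a simplified sketch. This is an illustration written for this answer, not the library's implementation; the real internals of communicate differ by platform and Python version. The helper name capture_stdout is hypothetical, and Python 3 is assumed.

```python
import subprocess
import sys


def capture_stdout(cmd, chunk_size=4096):
    """Collect a child's stdout in chunks, then wait for the child."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    chunks = []
    while True:
        # read() blocks until data arrives or every writer closes the pipe.
        chunk = proc.stdout.read(chunk_size)
        if not chunk:  # empty bytes means EOF
            break
        chunks.append(chunk)
    proc.stdout.close()
    # Only after the pipe is drained is the child reaped.
    returncode = proc.wait()
    # The list of chunks is joined into a single bytes object at the end.
    return b"".join(chunks), returncode


if __name__ == "__main__":
    out, code = capture_stdout([sys.executable, "-c", "print('hi')"])
    print(out, code)
```

The point of the sketch is the sequence: the wait on the child cannot begin until the read loop has seen end-of-file.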

In short, even with minimal arguments, check_output spends a lot more time in the Python process than call does.
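
One way to check this on a given machine is to time both calls. The following is a rough sketch, assuming Python 3; time_runs and CHILD are hypothetical names, and the call() run sends the child's output to DEVNULL to keep the terminal quiet, which is not identical to the setup in the question. The numbers depend heavily on system load, so none are quoted here.

```python
import subprocess
import sys
import time

# Hypothetical child command that writes about 1 KB to stdout.
CHILD = [sys.executable, "-c", "print('x' * 1000)"]


def time_runs(func, cmd, repeats=5, **kwargs):
    """Return the average wall-clock seconds that func(cmd, **kwargs) takes."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(cmd, **kwargs)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    call_avg = time_runs(subprocess.call, CHILD, stdout=subprocess.DEVNULL)
    capture_avg = time_runs(subprocess.check_output, CHILD)
    print("call: %.4f s, check_output: %.4f s" % (call_avg, capture_avg))
```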

Answer 2

Let's look at the code. .check_output has the following wait:

    def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
            _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
        """Check if child process has terminated.  Returns returncode
        attribute.

        This method is called by __del__, so it cannot reference anything
        outside of the local scope (nor can any methods it calls).

        """
        if self.returncode is None:
            try:
                pid, sts = _waitpid(self.pid, _WNOHANG)
                if pid == self.pid:
                    self._handle_exitstatus(sts)
            except _os_error as e:
                if _deadstate is not None:
                    self.returncode = _deadstate
                if e.errno == _ECHILD:
                    # This happens if SIGCLD is set to be ignored or
                    # waiting for child processes has otherwise been
                    # disabled for our process.  This child is dead, we
                    # can't get the status.
                    # http://bugs.python.org/issue15756
                    self.returncode = 0
        return self.returncode

.call waits using the following code:

    def wait(self):
        """Wait for child process to terminate.  Returns returncode
        attribute."""
        while self.returncode is None:
            try:
                pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
            except OSError as e:
                if e.errno != errno.ECHILD:
                    raise
                # This happens if SIGCLD is set to be ignored or waiting
                # for child processes has otherwise been disabled for our
                # process.  This child is dead, we can't get the status.
                pid = self.pid
                sts = 0
            # Check the pid and loop as waitpid has been known to return
            # 0 even without WNOHANG in odd situations.  issue14396.
            if pid == self.pid:
                self._handle_exitstatus(sts)
        return self.returncode

Notice the bug related to _internal_poll. It can be viewed at http://bugs.python.org/issue15756. It is pretty much exactly the issue you are running into.
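
The two waiting styles in the listings (a non-blocking check versus a blocking wait) are exposed publicly as Popen.poll() and Popen.wait(). A minimal sketch of how they behave, assuming Python 3; SLEEPER and poll_then_wait are hypothetical names:

```python
import subprocess
import sys

# Hypothetical child that stays alive for half a second.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(0.5)"]


def poll_then_wait(cmd):
    """Return (status from an early poll, status from wait, status from a late poll)."""
    proc = subprocess.Popen(cmd)
    # poll() does not block: it returns None while the child is still running.
    early = proc.poll()
    # wait() blocks until the child has terminated and been reaped.
    final = proc.wait()
    # After that, poll() just returns the stored return code.
    late = proc.poll()
    return early, final, late
```

A child that has exited but has not yet been collected by one of these calls is what ps shows as <defunct>.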

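Edit: The other potential issue between .call and .check_output is that .check_output actually cares about stdin and stdout and will try to perform IO against both pipes. If you are running into a process that gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you are experiencing.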

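In most cases zombie states get cleaned up pretty quickly. They will not, however, if they are interrupted while in a system call (such as read or write). Of course the read/write system call should itself be interrupted as soon as the IO can no longer be performed, but it is possible that you are hitting some sort of race condition where things are getting killed in a bad order.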
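
One concrete way a read on the pipe can outlast the child is when another process still holds the write end open. The sketch below reproduces that on a POSIX system with sh and sleep available (an assumption); whether the same thing happens in the Wine setup from the question is also only an assumption. The names CMD, timed and compare_background_writer are hypothetical, and Python 3 is assumed.

```python
import subprocess
import time

# The shell starts a background job that inherits stdout, prints one line,
# and exits right away. The background sleep keeps stdout open for 1 second.
CMD = "sleep 1 & echo started"


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_background_writer(cmd=CMD):
    """Time call() and check_output() on a command that leaves a writer behind."""
    # call(): no pipe is involved, so it returns as soon as the shell exits.
    _, call_secs = timed(subprocess.call, cmd, shell=True,
                         stdout=subprocess.DEVNULL)
    # check_output(): reads the pipe until EOF, which only arrives once the
    # background sleep (which inherited the write end) has exited too.
    output, capture_secs = timed(subprocess.check_output, cmd, shell=True)
    return call_secs, capture_secs, output


if __name__ == "__main__":
    call_secs, capture_secs, output = compare_background_writer()
    print("call: %.2f s, check_output: %.2f s" % (call_secs, capture_secs))
```

While check_output is still blocked on the read, the shell has already exited but has not been waited on, so it would show up as a defunct process during that window.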

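The only way I can think of to determine which of these is the cause is for you to either add debugging code to the subprocess file or invoke the Python debugger and initiate a backtrace when you run into the condition you are experiencing.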
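
As one possible way to get such a backtrace without attaching a debugger by hand, the sketch below arms a watchdog around the call. It assumes Python 3, where the faulthandler module is in the standard library (the original question concerns an older Python, so this may not apply directly). The wrapper name check_output_with_watchdog and its parameters are hypothetical.

```python
import faulthandler
import subprocess
import sys


def check_output_with_watchdog(cmd, trace_file, hang_after=30.0, **kwargs):
    """Run check_output(); if it takes longer than hang_after seconds,
    dump the tracebacks of all threads to trace_file.

    trace_file must be an open file object with a real file descriptor
    and must stay open until this function returns.
    """
    # Arm the watchdog: it fires once after hang_after seconds.
    faulthandler.dump_traceback_later(hang_after, file=trace_file)
    try:
        return subprocess.check_output(cmd, **kwargs)
    finally:
        # Disarm it whether the call returned or raised.
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    out = check_output_with_watchdog(
        [sys.executable, "-c", "print('ok')"], sys.stderr, hang_after=10.0
    )
    print(out)
```

If the call hangs, the dump shows which line inside subprocess the parent is blocked on, which helps distinguish a blocked read from a blocked wait.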

Performance of subprocess.check_output vs subprocess.call
