subprocess.Popen closes stdout/stderr file descriptors used in another thread when Popen fails

Date: 2013-08-26 08:22:59

Tags: python multithreading python-2.7

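When we upgraded from Python 2.7.3 to Python 2.7.5, the automated tests of an internal library that makes heavy use of subprocess.Popen() started failing. The library is used in a threaded environment. After debugging the issue, I was able to write a short Python script that demonstrates the error seen in the failing tests.

This is the script (called "threadedsubprocess.py"):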

import time
import threading
import subprocess

def subprocesscall():
    p = subprocess.Popen(
        ['ls', '-l'],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        )
    time.sleep(2) # simulate the Popen call takes some time to complete.
    out, err = p.communicate()
    print 'succeeding command in thread:', threading.current_thread().ident

def failingsubprocesscall():
    try:
        p = subprocess.Popen(
            ['thiscommandsurelydoesnotexist'],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            )
    except Exception as e:
        print 'failing command:', e, 'in thread:', threading.current_thread().ident

print 'main thread is:', threading.current_thread().ident

subprocesscall_thread = threading.Thread(target=subprocesscall)
subprocesscall_thread.start()
failingsubprocesscall()
subprocesscall_thread.join()

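Note: when run with Python 2.7.3, this script does not exit with an IOError. When run with Python 2.7.5 (at least on the same Ubuntu 12.04 64-bit VM), it fails at least 50% of the time.

The error raised by Python 2.7.5 is: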

/opt/python/2.7.5/bin/python ./threadedsubprocess.py 
main thread is: 139899583563520
failing command: [Errno 2] No such file or directory 139899583563520
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/python/2.7.5/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/opt/python/2.7.5/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "./threadedsubprocess.py", line 13, in subprocesscall
    out, err = p.communicate()
  File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 806, in communicate
    return self._communicate(input)
  File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 1379, in _communicate
    self.stdin.close()
IOError: [Errno 9] Bad file descriptor

close failed in file object destructor:
IOError: [Errno 9] Bad file descriptor

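Comparing the subprocess module in Python 2.7.3 with the one in Python 2.7.5, I see that Popen()'s __init__() now explicitly closes the stdin, stdout and stderr file descriptors if executing the command fails for some reason. This appears to be an intentional fix applied in Python 2.7.4 to prevent leaking file descriptors (http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS#l629).

The difference between Python 2.7.3 and Python 2.7.5 that seems relevant to this issue is in Popen's __init__():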

@@ -671,12 +702,33 @@
          c2pread, c2pwrite,
          errread, errwrite) = self._get_handles(stdin, stdout, stderr)

-        self._execute_child(args, executable, preexec_fn, close_fds,
-                            cwd, env, universal_newlines,
-                            startupinfo, creationflags, shell,
-                            p2cread, p2cwrite,
-                            c2pread, c2pwrite,
-                            errread, errwrite)
+        try:
+            self._execute_child(args, executable, preexec_fn, close_fds,
+                                cwd, env, universal_newlines,
+                                startupinfo, creationflags, shell,
+                                p2cread, p2cwrite,
+                                c2pread, c2pwrite,
+                                errread, errwrite)
+        except Exception:
+            # Preserve original exception in case os.close raises.
+            exc_type, exc_value, exc_trace = sys.exc_info()
+
+            to_close = []
+            # Only close the pipes we created.
+            if stdin == PIPE:
+                to_close.extend((p2cread, p2cwrite))
+            if stdout == PIPE:
+                to_close.extend((c2pread, c2pwrite))
+            if stderr == PIPE:
+                to_close.extend((errread, errwrite))
+
+            for fd in to_close:
+                try:
+                    os.close(fd)
+                except EnvironmentError:
+                    pass
+
+            raise exc_type, exc_value, exc_trace
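
A minimal sketch of why a second close of the same descriptor numbers can hurt another thread. It assumes (this is not confirmed by the diff alone) that some of these descriptors have already been closed once before the except block runs. The function name stale_close_demo and the "thread A" / "thread B" labels are illustrative only; the two roles are played sequentially in a single thread so that the outcome does not depend on timing. It is written for Python 3 on a POSIX system.

```python
import errno
import os


def stale_close_demo():
    """Return (reused, broken) for a simulated double close of pipe fds."""
    # "Thread A" creates a pipe and closes both ends once (normal cleanup).
    old_r, old_w = os.pipe()
    os.close(old_r)
    os.close(old_w)

    # "Thread B" now creates its own pipe. POSIX hands out the lowest free
    # descriptor numbers, so it usually receives the same two integers.
    new_r, new_w = os.pipe()
    reused = set((new_r, new_w)) == set((old_r, old_w))

    # "Thread A" closes its stale numbers a second time, ignoring errors,
    # much like the loop over to_close in the except block.
    for fd in (old_r, old_w):
        try:
            os.close(fd)
        except OSError:
            pass

    # If the numbers were reused, thread B's pipe has just been closed.
    try:
        os.write(new_w, b"x")
        broken = False
    except OSError as exc:
        broken = exc.errno == errno.EBADF

    # Clean up whatever is still open.
    for fd in (new_r, new_w):
        try:
            os.close(fd)
        except OSError:
            pass
    return reused, broken
```

When the descriptor numbers are reused, the later write fails with errno 9 (Bad file descriptor), the same error that appears in the traceback above.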

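I think I have three questions:

1) Is it really OK to use subprocess.Popen with PIPE for stdin, stdout and stderr in a threaded environment?

2) How do I prevent the file descriptors for stdin, stdout and stderr from being closed when Popen() fails in one of the threads?

3) Am I doing something wrong here?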

2 Answers:

Answer 0 (score: 7)

I would answer your questions as follows:

  1. Yes.
  2. You should not have to.
  3. No.

    The error also occurs in Python 2.7.4.

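    I think this is a bug in the library code. If you add a lock to your program and make sure that the two calls to subprocess.Popen are executed atomically, the error does not occur.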

    @@ -1,32 +1,40 @@
     import time
     import threading
     import subprocess
    
    +lock = threading.Lock()
    +
     def subprocesscall():
    +    lock.acquire()
         p = subprocess.Popen(
             ['ls', '-l'],
             stdin=subprocess.PIPE,
             stdout=subprocess.PIPE,
             stderr=subprocess.PIPE,
             )
    +    lock.release()
         time.sleep(2) # simulate the Popen call takes some time to complete.
         out, err = p.communicate()
         print 'succeeding command in thread:', threading.current_thread().ident
    
     def failingsubprocesscall():
         try:
    +        lock.acquire()
             p = subprocess.Popen(
                 ['thiscommandsurelydoesnotexist'],
                 stdin=subprocess.PIPE,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 )
         except Exception as e:
             print 'failing command:', e, 'in thread:', threading.current_thread().ident
    +    finally:
    +        lock.release()
    +
    
     print 'main thread is:', threading.current_thread().ident
    
     subprocesscall_thread = threading.Thread(target=subprocesscall)
     subprocesscall_thread.start()
     failingsubprocesscall()
     subprocesscall_thread.join()
    

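    This suggests that the problem is most likely caused by a data race in the implementation of Popen. I'll hazard a guess: the bug may be in the implementation of pipe_cloexec, which is called by _get_handles and which (in 2.7.4) is: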

    def pipe_cloexec(self):
        """Create a pipe with FDs set CLOEXEC."""
        # Pipes' FDs are set CLOEXEC by default because we don't want them
        # to be inherited by other subprocesses: the CLOEXEC flag is removed
        # from the child's FDs by _dup2(), between fork() and exec().
        # This is not atomic: we would need the pipe2() syscall for that.
        r, w = os.pipe()
        self._set_cloexec_flag(r)
        self._set_cloexec_flag(w)
        return r, w
    

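    And the comment explicitly warns that it is not atomic... That certainly allows a data race, but without experimenting I don't know whether it is what causes the problem.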
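
To make the non-atomic window concrete, here is a standalone sketch of the same two-step pattern using the standard fcntl module. The function names set_cloexec, has_cloexec and pipe_cloexec are illustrative stand-ins, not the actual subprocess internals, and the sketch only shows where the window lies; it does not establish that this window is the cause of the error. POSIX only, written for Python 3.

```python
import fcntl
import os


def set_cloexec(fd, enabled=True):
    """Set or clear the close-on-exec flag on fd (POSIX only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def has_cloexec(fd):
    """Return True if fd is currently marked close-on-exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def pipe_cloexec():
    """Create a pipe, then mark both ends close-on-exec in separate steps."""
    r, w = os.pipe()
    # Window: on Python 2.7 the descriptors are inheritable at this point,
    # so a fork()/exec() in another thread could pass them to its child.
    set_cloexec(r)
    set_cloexec(w)
    return r, w
```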

Answer 1 (score: 0)

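Another solution, for cases where you are not the one handling the opened files (for example, when building an API).

I found a workaround for this problem by making a windll API call that marks all already-opened file descriptors as "non-inheritable". It is a bit of a hack; the Q&A can be found here: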

Howto: workaround of close_fds=True and redirect stdout/stderr on windows

It works around the Python 2.7 bug.

Another solution is to use Python 3.4+ :) where this has been fixed.
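
The answer does not say which change fixed it. One relevant difference, offered here as background rather than as the confirmed fix, is that since Python 3.4 (PEP 446) newly created file descriptors are non-inheritable by default, so no separate step is needed to mark a pipe as such. The helper name pipe_inheritance is illustrative only.

```python
import os


def pipe_inheritance():
    """Return the inheritable flags of a freshly created pipe (3.4+)."""
    r, w = os.pipe()
    try:
        return os.get_inheritable(r), os.get_inheritable(w)
    finally:
        os.close(r)
        os.close(w)
```

On Python 3.4 or later, both flags come back as False.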