Question

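I have a sample file with 100 lines that I am reading through a cat subprocess. However, the output that ends up in the queue is always truncated. I suspect this is because cat buffers its output once it detects that it is writing to a pipe.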

p = subprocess.Popen("cat file.txt",
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdin=subprocess.PIPE,
                     shell=True,
                     bufsize=0)

I use separate threads to read cat's stdout and stderr pipes:

def StdOutThread():
  while not p.stdout.closed and running:
    line = ""
    while not line or line[-1] != "\n":
      r = p.stdout.read(1)
      if not r:
        break
      line += r
      pending_line["line"] = line

    if line and line[-1] == "\n":
      line = line[:-1]
    if line:
      queue.put(("out", line))

These threads are started and dump whatever they read into a queue. The main thread reads from this queue for as long as cat is alive.

with CancelFunction(p.kill):
    try:
      stdout_thread = threading.Thread(target=StdOutThread)
      stdout_thread.start()
      while p.poll() is None:
        ReadFromQueue()
      while not queue.empty():
        ReadFromQueue()  
    finally:
      running = False
      stdout_thread.join()

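I have considered using pexpect to get around this, but I also want to tell stdout and stderr apart, which does not seem to be possible with pexpect. Any help is much appreciated.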

Answer 1

I am certain that your main thread leaves the try block before all of cat's output has been read and put into the queue.

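Note that cat can exit even if you have not yet read all of its output. Consider this sequence of events:

cat writes its last line
cat exits
the main thread detects that cat has exited (via p.poll()) before the reader thread gets a chance to read the last bit of cat's output
the main thread then leaves the try block and sets running to false
the reader thread exits because running is false, before the last of the output has been read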
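
To make the first steps of that sequence concrete, here is a minimal Python 3 sketch (not the asker's code) showing that a child process can already have exited while its output is still sitting unread in the pipe. The function name output_after_exit is made up for illustration, and a small Python child stands in for "cat file.txt" so the example is self-contained.

```python
import subprocess
import sys


def output_after_exit():
    """Let the child exit first, then read what it left behind in the pipe."""
    # Stand-in for "cat file.txt": prints 100 short lines and exits.
    child = subprocess.Popen(
        [sys.executable, "-c", "for i in range(100): print(i)"],
        stdout=subprocess.PIPE,
    )
    # Waiting before reading is only safe here because the output is tiny
    # and fits in the OS pipe buffer; with large output this would deadlock.
    child.wait()
    exited = child.poll() is not None  # the child is gone at this point...
    data = child.stdout.read()         # ...yet all of its output is still readable
    child.stdout.close()
    return exited, data.splitlines()


if __name__ == "__main__":
    exited, lines = output_after_exit()
    print("child exited:", exited, "- lines still in the pipe:", len(lines))
```

So p.poll() reporting an exit says nothing about whether the pipe has been drained; only reaching EOF on the pipe does.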

Below is a simpler approach that uses a sentinel value in the queue to tell the main thread that a reader thread has exited.

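Once cat exits, each reader thread eventually reaches EOF on the pipe it is monitoring. When that happens, the thread puts None into the queue to tell the main thread that it is done. Once both reader threads have finished, the main thread can safely stop monitoring the queue and join the threads.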

import threading
import subprocess
import Queue

def pipe_thread(queue, name, handle):
  print "in handlehandle"
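  # Iterating over the pipe stops at EOF, which is only reached after cat
  # has exited and everything it wrote has been read.
  for line in handle:
    if line.endswith("\n"):
      line = line[:-1]
    queue.put((name, line))
  # Sentinel: tells the main thread that this reader is finished.
  queue.put(None)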

def main():
    p = subprocess.Popen("cat file.txt",
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         stdin=subprocess.PIPE,
                         shell=True,
                         bufsize=0)

    queue = Queue.Queue()

    t1 = threading.Thread(target = pipe_thread,
                             args = [queue, "stdout", p.stdout])
    t2 = threading.Thread(target = pipe_thread,
                             args = [queue, "stderr", p.stderr])

    t1.start()
    t2.start()

    alive = 2
    count = 0
    while alive > 0:
      item = queue.get()
      if item is None:
        alive = alive - 1
      else:
        (which, line) = item
        count += 1
        print count, "got from", which, ":", line
    print "joining..."
    t1.join()
    t2.join()

main()
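
The code above is Python 2 (print statements, the Queue module). For readers on Python 3.7 or later, the same sentinel pattern might look roughly like the sketch below. The names drain and run_and_collect are hypothetical, the command is passed as an argument list instead of a shell string, and text=True is my assumption so that the pipes yield str lines rather than bytes.

```python
import queue
import subprocess
import sys
import threading


def drain(out_queue, name, handle):
    """Forward every line from one pipe to the queue, then post a sentinel."""
    # The loop ends at EOF, i.e. after the child has closed its end of the
    # pipe and all buffered output has been consumed.
    for line in handle:
        out_queue.put((name, line.rstrip("\n")))
    handle.close()
    out_queue.put(None)  # sentinel: this reader is finished


def run_and_collect(argv):
    """Run argv and return (returncode, [(stream_name, line), ...])."""
    child = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out_queue = queue.Queue()
    readers = [
        threading.Thread(target=drain, args=(out_queue, name, handle))
        for name, handle in (("stdout", child.stdout), ("stderr", child.stderr))
    ]
    for reader in readers:
        reader.start()

    items = []
    alive = len(readers)
    while alive > 0:
        item = out_queue.get()  # blocks until a line or a sentinel arrives
        if item is None:
            alive -= 1
        else:
            items.append(item)

    for reader in readers:
        reader.join()
    child.wait()  # both pipes hit EOF, so the child has finished or is about to
    return child.returncode, items


if __name__ == "__main__":
    # Assumes file.txt exists; a small Python child stands in for cat.
    code = "import sys; sys.stdout.write(open('file.txt').read())"
    returncode, items = run_and_collect([sys.executable, "-c", code])
    for count, (which, line) in enumerate(items, 1):
        print(count, "got from", which, ":", line)
```

As in the original, the main loop never consults p.poll(); it stops only after it has seen one sentinel per reader thread, so no output can be left behind.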

Does buffering of the process output cause the truncation?