Question

In a shell script, we have the following command:

/script1.pl < input_file | /script2.pl > output_file

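I want to reproduce the above pipeline in Python using the subprocess module. input_file is a large file, and I can't read the whole file at once. So I want to pass each line, say input_string, into the pipeline and get back a string variable output_string, until the whole file has been streamed through.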

Here is a first attempt:

process = subprocess.Popen(["/script1.pl | /script2.pl"], stdin = subprocess.PIPE, stdout = subprocess.PIPE, shell = True)
process.stdin.write(input_string)
output_string = process.communicate()[0]

However, using process.communicate()[0] closes the stream. I want to keep the stream open for future input. I tried process.stdout.readline() instead, but the program hangs.

Answer 1

To emulate the shell command /script1.pl < input_file | /script2.pl > output_file using the subprocess module in Python:

#!/usr/bin/env python
from subprocess import check_call

with open('input_file', 'rb') as input_file:
    with open('output_file', 'wb') as output_file:
        check_call("/script1.pl | /script2.pl", shell=True,
                   stdin=input_file, stdout=output_file)

You could write it without shell=True (though I don't see a reason to here), based on the "17.1.4.2. Replacing shell pipeline" example from the docs:

#!/usr/bin/env python
from subprocess import Popen, PIPE

with open('input_file', 'rb') as input_file:
    script1 = Popen("/script1.pl", stdin=input_file, stdout=PIPE)
with open("output_file", "wb") as output_file:
    script2 = Popen("/script2.pl", stdin=script1.stdout, stdout=output_file)
script1.stdout.close() # allow script1 to receive SIGPIPE if script2 exits
script2.wait()
script1.wait()

You could also use the plumbum module to get shell-like syntax in Python:

#!/usr/bin/env python
from plumbum import local

script1, script2 = local["/script1.pl"], local["/script2.pl"]
(script1 < "input_file" | script2 > "output_file")()

See also: How do I use subprocess.Popen to connect multiple processes by pipes?

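If you want to read/write line by line, then the answer depends on the concrete scripts that you want to run. In general, it is easy to deadlock while sending input and receiving output if you are not careful, e.g. because of buffering issues.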
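
To illustrate the buffering point: a line-by-line exchange only works when the child flushes its output after every line. Below is a minimal Python 3 sketch under that assumption. The Perl scripts from the question are not shown, so a hypothetical stand-in child is used (a small Python program that upper-cases each line and flushes); the names CHILD and exchange_lines are made up for this example.

```python
import subprocess
import sys

# Hypothetical stand-in child: upper-cases each input line and flushes
# immediately. The real /script1.pl and /script2.pl may buffer differently.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

def exchange_lines(lines):
    """Send lines one at a time and read one reply per line."""
    p = subprocess.Popen([sys.executable, "-c", CHILD],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         universal_newlines=True, bufsize=1)
    replies = []
    try:
        for line in lines:
            p.stdin.write(line + "\n")
            p.stdin.flush()  # push the line to the child right away
            replies.append(p.stdout.readline().rstrip("\n"))
    finally:
        p.stdin.close()      # signal EOF so the child can exit
        p.stdout.close()
        p.wait()
    return replies

if __name__ == "__main__":
    print(exchange_lines(["first line", "second line"]))
```

If the child block-buffers its stdout when it is connected to a pipe (common for programs that use C stdio), the readline() call here would wait indefinitely, which matches the hang described in the question.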

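If the input doesn't depend on the output in your case, then a reliable cross-platform approach is to use a separate thread for each stream: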

#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe):
    try:
        for i in xrange(1000000000): # generate large input
            print >>pipe, i
    finally:
        pipe.close()

p = Popen("/script1.pl | /script2.pl", shell=True, stdin=PIPE, stdout=PIPE,
          bufsize=1)
Thread(target=pump_input, args=[p.stdin]).start()
try: # read output line by line as soon as the child flushes its stdout buffer
    for line in iter(p.stdout.readline, b''):
        print line.strip()[::-1] # print reversed lines
finally:
    p.stdout.close()
    p.wait()
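
The code above uses Python 2 syntax (xrange, print statements). A rough Python 3 equivalent might look like the sketch below. It is not the original answer's code: a hypothetical stand-in child that copies stdin to stdout replaces "/script1.pl | /script2.pl", the pipes are opened in text mode, and the names CHILD and reversed_lines are made up for illustration.

```python
import subprocess
import sys
from threading import Thread

# Hypothetical stand-in for "/script1.pl | /script2.pl": copies stdin to stdout.
CHILD = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)\n"

def pump_input(pipe, n):
    """Write n numbered lines to the child's stdin, then close it."""
    try:
        for i in range(n):
            pipe.write("%d\n" % i)
    finally:
        pipe.close()  # EOF tells the child that no more input is coming

def reversed_lines(n):
    """Feed n lines from a writer thread and collect each output line reversed."""
    p = subprocess.Popen([sys.executable, "-c", CHILD],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         universal_newlines=True)
    writer = Thread(target=pump_input, args=(p.stdin, n))
    writer.start()
    out = []
    try:
        # The main thread reads while the writer thread writes, so neither
        # side can fill a pipe buffer and block the other.
        for line in p.stdout:
            out.append(line.strip()[::-1])
    finally:
        p.stdout.close()
        writer.join()
        p.wait()
    return out

if __name__ == "__main__":
    print(reversed_lines(12))
```

Writing from one thread and reading from another avoids the deadlock regardless of how much the child buffers, at the cost of output arriving only when the child decides to flush.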
