Question

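I am running several commands on a Linux machine with Python 2.6, and each of them may take a while.

So I use the subprocess.Popen class and the process.communicate() method to run several groups of commands in parallel and to capture each command's output as soon as it finishes.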

import shlex
import subprocess
from threading import Lock, Thread

def run_commands(commands, print_lock):
    # this part runs in parallel.
    outputs = []
    for command in commands:
        proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
        output, unused_err = proc.communicate()  # buffers the output
        retcode = proc.poll()                    # ensures subprocess termination
        outputs.append(output)
    with print_lock: # print them at once (synchronized)
        for output in outputs:
            for line in output.splitlines():
                print(line)

Elsewhere, it is called like this:

processes = []
print_lock = Lock()
for ...:
    commands = ...  # a group of commands is generated, which takes some time.
    processes.append(Thread(target=run_commands, args=(commands, print_lock)))
    processes[-1].start()
for p in processes: p.join()
print('done.')

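The expected result is that the output of each command group is displayed in one piece, while the groups themselves execute in parallel.

But starting with the second output group (which thread ends up second varies, of course, because scheduling is nondeterministic), the output is printed without line breaks, each line is padded with as many spaces as there were characters on the previous line, and input echo is turned off. The terminal state is "garbled" or "crashed". (If I issue the reset shell command, it returns to normal.)

At first I looked for the cause in the handling of '\r', but that was not it. As you can see in my code, I handle it properly with splitlines(), and I confirmed this by applying repr() to the output.

I think the cause is the simultaneous use of pipes for stdout/stderr in Popen and communicate(). I tried the check_output shortcut method in Python 2.7, but without success. Of course, the problem does not occur if I serialize all command execution and printing.

Is there a better way to use Popen and communicate() in parallel?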

Answer 1

The final result was inspired by J.F. Sebastian's comment.

http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py

This appears to be a Python bug.
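The linked script is not reproduced here, so the following is not necessarily the fix it uses. One mitigation worth trying for symptoms like these, on the assumption that a child program changes terminal settings through the stdin it inherits from the parent, is to give every child the null device as stdin. The helper name run_detached_from_tty and the stand-in child command are hypothetical.

```python
import os
import subprocess
import sys

def run_detached_from_tty(argv):
    """Run argv with stdin tied to the null device; return (returncode, output)."""
    devnull = open(os.devnull, "rb")
    try:
        # Assumption: the child misbehaves only because it can reach the
        # terminal through inherited stdin, so it gets os.devnull instead.
        proc = subprocess.Popen(argv, stdin=devnull,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = proc.communicate()  # returns after the child has exited
    finally:
        devnull.close()
    return proc.returncode, output

# Stand-in child: reads all of stdin and prints its repr, which is empty
# because the null device yields end-of-file immediately.
child = [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]
retcode, output = run_detached_from_tty(child)
```

If the terminal still ends up in a bad state with stdin redirected, the cause lies elsewhere and this sketch does not apply.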

Answer 2

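I'm not sure what run_commands actually needs to do, but it appears to just poll the subprocess, ignore the return code, and carry on with the loop. When you reach the part that prints the output, how do you know the subprocesses have finished?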
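For reference, communicate() reads the pipes until end-of-file and then waits for the child to exit, so once it returns the process has finished and its returncode attribute is set. A minimal sketch of checking that, where the helper name run_and_check and the stand-in child command are hypothetical:

```python
import subprocess
import sys

def run_and_check(argv):
    """Run argv and return (returncode, combined stdout/stderr bytes)."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()  # blocks until EOF and process exit
    # returncode is already populated here; no extra poll() is required.
    return proc.returncode, output

# Stand-in child: prints one line, then exits with status 3.
child = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]
retcode, output = run_and_check(child)
```

Returning the exit status alongside the output lets the caller decide whether a failed command's output should be printed or flagged.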

Answer 3

In your sample code, I noticed that you used:

for line in output.splitlines():

partly to deal with the '\r' problem; using

for line in output.splitlines(True):

might help.
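To make the difference concrete, here is a small sketch with a made-up sample string: splitlines() strips the line terminators, while splitlines(True) keeps them attached to each line. Whether keeping them changes the terminal behaviour described in the question is not established here.

```python
# Made-up sample mixing the three common line endings.
raw = "first\r\nsecond\rthird\n"

without_ends = raw.splitlines()    # terminators removed
with_ends = raw.splitlines(True)   # terminators kept on each line

# repr() makes the otherwise invisible '\r' characters visible.
for line in with_ends:
    print(repr(line))
```

Note that lines which keep their endings should be written with sys.stdout.write(line) rather than print(line), since print appends a newline of its own.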

Better multithreaded use of Python subprocess.Popen & communicate()?
