Question

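I just want to understand what happens "in the background" in terms of memory usage when handling the result of subprocess.Popen() and reading it line by line. Here is a simple example.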

Given the following script, test.py, which prints "Hello", then waits 10 seconds and prints "World":

import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")

Then the following script, test_sub.py, calls test.py as a subprocess, redirects its stdout to a pipe, and reads it line by line:

import subprocess, time, os, sys

cmd = ["python3","test.py"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT, universal_newlines = True)

for line in iter(p.stdout.readline, ''):
   print("---" + line.rstrip())

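My question is this: when I run test_sub.py, after the subprocess call it prints "Hello", then waits 10 seconds until "World" arrives and prints that. What happens to "Hello" during those 10 seconds of waiting? Is it kept in memory until test_sub.py finishes, or is it discarded in the first iteration?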

This may not matter much for this example, but it does when dealing with very large files.

Answer 1

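What happens to "Hello" during those 10 seconds of waiting?

"Hello" (in the parent) is available via the name line until .readline() returns for the second time, i.e., "Hello" lives at least until the output of print("World") has been read in the parent.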

If you mean what happens in the child process, then after sys.stdout.flush() there is no reason for the "Hello" object to stay alive, but it may; see, for example, Does Python intern strings?
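
As a rough illustration of why the literal may outlive the print() call in the child, here is a minimal sketch assuming CPython. It compiles the same statement that test.py contains and checks that the literal is stored among the code object's constants; it also shows that strings built at run time become one shared object once passed through sys.intern(). The variable names are only for illustration.

```python
import sys

# Compile the same statement that test.py contains.
code = compile('print("Hello")', "test.py", "exec")

# The literal is stored on the code object, so it stays referenced
# for as long as that code object is alive, not just during print().
literal_in_consts = "Hello" in code.co_consts

# Strings built at run time are separate objects unless explicitly interned.
a = "".join(["Hel", "lo"])
b = "".join(["Hel", "lo"])
same_after_intern = sys.intern(a) is sys.intern(b)

print(literal_in_consts, same_after_intern)
```

Either mechanism can keep a reference to "Hello" around after it has been written to stdout, so the object is not necessarily freed right after the flush.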

Is it kept in memory until test_sub.py finishes, or is it discarded in the first iteration?

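After .readline() returns for the second time, line refers to "World". What happens to "Hello" after that depends on garbage collection in the specific Python implementation, i.e., even though line is now "World", the object "Hello" may continue to live for some time. See Releasing memory in Python.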
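
To observe the rebinding on CPython specifically, here is a small sketch using weak references. A plain str cannot be weakly referenced, so it uses a hypothetical Line wrapper class as a stand-in for the lines read from the pipe. On CPython, reference counting frees the first object as soon as line is rebound; other implementations (PyPy, for example) may keep it around longer, which is exactly the implementation dependence described above.

```python
import weakref

class Line:
    """Hypothetical stand-in for a line read from a pipe."""
    def __init__(self, text):
        self.text = text

def lines():
    yield Line("Hello")
    yield Line("World")

refs = []       # weak references to every object the loop has seen
alive_log = []  # per iteration: are the earlier objects still alive?

for line in lines():
    # At this point `line` has already been rebound to the new object.
    alive_log.append([r() is not None for r in refs])
    refs.append(weakref.ref(line))

print(alive_log)
```

On CPython the second entry of alive_log reports the first object as already gone, while the last object stays alive after the loop because line still refers to it.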

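You can set the PYTHONDUMPREFS=1 environment variable and run your code with a debug build of Python to see which objects are still alive when the python process exits. For example, consider this code: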

#!/usr/bin/env python3
import threading
import time
import sys

def strings():
    yield "hello"
    time.sleep(.5)
    yield "world"
    time.sleep(.5)

def print_line():
    while True:
        time.sleep(.1)
        print('+++', line, file=sys.stderr)

threading.Thread(target=print_line, daemon=True).start()
for line in strings():
    print('---', line)
time.sleep(1)

It shows that line is not rebound until the second yield. The output of PYTHONDUMPREFS=1 ./python . |& grep "'hello'" shows that 'hello' is still alive when python exits.
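
For the large-output case raised in the question, a debug build is not strictly needed to get a rough picture. The sketch below uses the standard tracemalloc module to measure the parent's peak Python allocations while it reads a child's output line by line and keeps only counters. The child command and its sizes (20,000 lines of 100 characters) are made up for illustration, and the exact peak will vary by platform and Python version.

```python
import subprocess
import sys
import tracemalloc

# Hypothetical child: writes 20,000 lines of 100 characters each (about 2 MB).
child_code = "for i in range(20000): print('x' * 100)"

p = subprocess.Popen([sys.executable, "-c", child_code],
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     universal_newlines=True)

tracemalloc.start()
line_count = 0
total_chars = 0
for line in iter(p.stdout.readline, ''):
    # Only counters are kept; each line object is dropped at the next rebinding.
    line_count += 1
    total_chars += len(line)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

p.stdout.close()
p.wait()

print("lines:", line_count, "chars:", total_chars, "peak bytes:", peak)
```

If nothing else holds on to the lines, the peak should stay far below the total amount of output, since only the current line and the I/O buffers are alive at any moment. Appending every line to a list instead would make the peak grow with the output size.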

Memory usage when reading lines from a piped subprocess stdout in Python
