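I have a Python script that uses apply_async to spawn multiple child processes, which communicate with the main process through a pipe. The main process receives input from the child processes and does some processing on it. The main process also runs inside a Kafka consumer loop, so it does not exit after processing. The problem is this: memory consumption naturally increases while the child processes are running, but once they have finished, the main process's memory has grown by 600 MB. After the main process handles each Kafka message, memory keeps growing with every processing cycle. Is there a way to build a similar flow without using apply_async, and how can I find out what memory is being retained in the main process?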
from multiprocessing import Pool, Pipe, Lock
import multiprocessing

# Module-level lock: inherited by the pool workers when they are forked.
lock = Lock()


def processSomethingInMainProcess(p_output, n_tasks):
    """Consume intermediate results until every child task has sent "end"."""
    finished = 0
    while finished < n_tasks:
        result = p_output.recv()
        if isinstance(result, str):
            # "end" marks the completion of one child task.
            finished += 1
        else:
            # Process something computationally expensive.
            print(result)


def childProcessSpawn(p_input, numbers):
    """Child process for computing something and sending the intermediate
    results to the main process.

    Arguments:
        p_input {Connection} -- write end of the pipe
        numbers {int} -- first value of the range to send
    """
    print("Child process called", numbers)
    for i in range(numbers, 100):
        with lock:
            p_input.send(i)
    with lock:
        p_input.send("end")


def mainProcess():
    p_output, p_input = Pipe()
    # Keep at least one worker even on a single-core machine.
    pool = Pool(max(1, multiprocessing.cpu_count() - 1))
    test = [1, 2, 3, 4, 5]
    for i in test:
        pool.apply_async(childProcessSpawn, args=(p_input, i))
    processSomethingInMainProcess(p_output, len(test))
    p_input.close()
    p_output.close()
    pool.close()
    pool.join()


# Works in Kafka Consumer loop.
if __name__ == "__main__":
    mainProcess()
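
For comparison, here is a minimal sketch of what a similar flow might look like without apply_async, with a rough check of what the main process retains across one cycle. It swaps the Pool and Pipe for plain multiprocessing.Process workers and a multiprocessing.Queue, and uses the standard-library tracemalloc module to diff two snapshots. The names child_worker, run_cycle, top_growth and SENTINEL are hypothetical and not part of the script above. Note that tracemalloc only tracks allocations made by Python in the process where it is started, so it will not account for everything the OS reports as resident memory.

```python
import multiprocessing as mp
import tracemalloc

SENTINEL = None  # hypothetical end-of-stream marker, one per worker


def child_worker(queue, start, stop=100):
    """Send intermediate results to the parent, then a single sentinel."""
    for i in range(start, stop):
        queue.put(i)
    queue.put(SENTINEL)


def run_cycle(starts):
    """Run one processing cycle with fresh workers that are joined at the end."""
    queue = mp.Queue()
    workers = [mp.Process(target=child_worker, args=(queue, s)) for s in starts]
    for w in workers:
        w.start()

    results = []
    finished = 0
    # Drain the queue until every worker has reported completion.
    while finished < len(workers):
        item = queue.get()
        if item is SENTINEL:
            finished += 1
        else:
            results.append(item)  # stand-in for the expensive processing

    # Join only after the queue is drained, then release the queue.
    for w in workers:
        w.join()
    queue.close()
    queue.join_thread()
    return results


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed most between snapshots."""
    return after.compare_to(before, "lineno")[:limit]


if __name__ == "__main__":
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    results = run_cycle([1, 2, 3, 4, 5])
    after = tracemalloc.take_snapshot()
    print("items received:", len(results))
    for stat in top_growth(before, after):
        print(stat)
```

In a Kafka consumer loop, run_cycle would be called once per message, and taking a snapshot every few messages should show whether Python-level allocations keep growing and on which lines. If the snapshots stay flat while the process size still grows, the retained memory is probably outside what tracemalloc can see.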