Below is a very simplified example of the code I'm working with:
    from concurrent.futures import ProcessPoolExecutor
    import pandas


    def i_use_lots_of_memory():
        # Runs in a worker process: load the large file, then drop the reference.
        print('doing something that uses a lot of memory')
        data = pandas.read_csv('large_txt_file.txt')
        del data
        # do other things here as soon as I've solved mem usage issues
        print('ha ha I used up a ton of memory.')


    def simplest_callback_ever(future):
        # Runs in the parent process once the future has finished.
        _ = future.result()
        print('callback was run')


    class ManagesFileReading(object):
        def __init__(self):
            self.pool = ProcessPoolExecutor(max_workers=24)

        def add_job(self, callback=None):
            future = self.pool.submit(i_use_lots_of_memory)
            if callback:
                future.add_done_callback(callback)


    if __name__ == "__main__":
        mfr = ManagesFileReading()
        mfr.add_job(simplest_callback_ever)
In this example, I open an 800 MB text file, which takes up about 2 GB of memory. The output is:
    doing something that uses a lot of memory
    ha ha I used up a ton of memory.
    callback was run
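So the task completes, but the problem is that the memory is never released. Even after the future has finished, the memory is not freed. The only way I have found to release it is to shut down the process pool by calling self.pool.shutdown().

Unless I am misunderstanding how ProcessPoolExecutor works, the callback having run means the task is finished, right? So why isn't the future deleted and the memory released? Any ideas?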
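
For reference, here is a minimal sketch of the shutdown workaround mentioned above, written so it runs without my large file. The names `allocate_and_release` and `run_job_then_shutdown` are hypothetical stand-ins, not part of my real code, and the list allocation only imitates the `pandas.read_csv` call. It assumes that exiting the `with` block, which calls `shutdown(wait=True)` on the pool, is what ends the worker processes and lets the operating system take their memory back.

```python
from concurrent.futures import ProcessPoolExecutor


def allocate_and_release(n_items):
    # Hypothetical stand-in for i_use_lots_of_memory: build a large
    # object in the worker process, then drop the reference to it.
    data = list(range(n_items))
    size = len(data)
    del data
    return size


def run_job_then_shutdown(n_items=1000000):
    # Leaving the with-block calls pool.shutdown(wait=True), which waits
    # for the job and then ends the worker process, so whatever memory
    # that worker was still holding goes back to the operating system.
    with ProcessPoolExecutor(max_workers=1) as pool:
        future = pool.submit(allocate_and_release, n_items)
        result = future.result()
    return result


if __name__ == "__main__":
    print(run_job_then_shutdown())
```

This only shows the workaround I already have; it does not explain why a long-lived pool keeps holding the memory after the future is done, which is what I am asking about.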