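I have a function that loads 8 GB of data and then returns 6 MB of data (a list). Afterwards I launch multiprocessing-based functions, and the whole 8 GB gets passed to every process, even though I only need the 6 MB that the function returned. As a result, memory usage is 8 GB * process_count. How can I get rid of the 8 GB of data I no longer need?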
Pseudocode:
def fun():
    # Uses another function to load a big JSON file and get the keys from it,
    # something like this:
    return [k for k, v in <external function returning a big dictionary created from the big JSON file>]

def main():
    list_ll = copy.deepcopy(fun())  # done to remove the reference to fun
    gc.collect()
    # Launch the multiprocessing functions, each of which takes a part of the list.

All 8 GB of data is being copied, so memory usage is too high.
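
One workaround that might be worth trying is to run the loading step in a throwaway process, so the big dictionary only ever exists there and goes away when that process exits. The sketch below is untested against the real data and rests on several assumptions: `load_big_dict`, `split_into_parts`, and `work` are hypothetical placeholders (the first stands in for the external JSON-loading function, here returning a tiny dictionary), and `fun` is simplified to return only the keys.

```python
import multiprocessing as mp


def load_big_dict():
    # Hypothetical stand-in for the external function that builds the
    # large dictionary from the big JSON file.
    return {f"key{i}": i for i in range(1000)}


def fun():
    # Intended to run inside a short-lived process; only the keys are
    # pickled and sent back to the parent.
    return list(load_big_dict())


def split_into_parts(items, n):
    # Split items into n contiguous parts of nearly equal size.
    size, extra = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


def work(part):
    # Placeholder for the real per-part processing.
    return len(part)


def main(process_count=4):
    # Load in a single-use worker; the dictionary lives only in that
    # process, which is terminated when the with-block exits.
    with mp.Pool(1) as loader:
        list_ll = loader.apply(fun)
    # The parent never held the dictionary, so the workers created here
    # have nothing large to inherit from it.
    with mp.Pool(process_count) as pool:
        return pool.map(work, split_into_parts(list_ll, process_count))


if __name__ == "__main__":
    print(main())
```

With this layout the `copy.deepcopy` and `gc.collect()` calls should no longer be needed, since the list arriving in the parent is already an independent copy. Whether it actually brings memory down to the expected level would still have to be measured on the real 8 GB input.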