I load a large DataFrame in Python 3 and take a small subset of it.
I then want Python to remove the original large DataFrame object from memory entirely: its name, its reference, and its values. But that doesn't happen, because the process's memory usage does not go down. Why?
This is a real problem for me, and I don't know how to free the memory. Here is the output of the script below:
before 130564096
loading files
just before taking subset 3827941376
7258946128
56
after 3803156480
Here is the code:
import gc
import os
import pickle
from os import path

import pandas as pd
import psutil
from pympler.asizeof import asizeof

process = psutil.Process(os.getpid())
basepath = os.getcwd()
print("before", process.memory_info().rss)

def load_files(file_name, file_ext):
    filename = "%s.%s" % (file_name, file_ext)
    filepath = path.abspath(path.join(basepath, "..", "..", "data_input", filename))
    with open(filepath, 'rb') as pickle_load:
        df = pickle.load(pickle_load)
    print("just before taking subset", process.memory_info().rss)
    print(asizeof(df))
    df2 = df[:100].copy(deep=True)
    # my attempts to release the original DataFrame
    del df
    gc.collect()
    df = pd.DataFrame()
    df = ''
    gc.collect()
    print(asizeof(df))
    print("after", process.memory_info().rss)
    exit()  # stop here while debugging the memory usage
    return df2
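For reference, here is a minimal self-contained version of the problem that anyone can run. It is a sketch under the assumption that a synthetic DataFrame of random floats is a reasonable stand-in for my pickled file (the size and the file path are invented for illustration):

```python
# Minimal reproduction without the pickle file: a synthetic ~80 MB
# DataFrame stands in for the data loaded from disk.
import gc
import os

import numpy as np
import pandas as pd
import psutil

process = psutil.Process(os.getpid())
print("before:", process.memory_info().rss)

# Stand-in for pickle.load(): 1,000,000 rows x 10 float64 columns.
df = pd.DataFrame(np.random.rand(1_000_000, 10))
print("after load:", process.memory_info().rss)

# Keep only a small deep copy, then try to release the original.
df2 = df[:100].copy(deep=True)
del df
gc.collect()
print("after del + gc.collect():", process.memory_info().rss)
```

On my machine the last RSS figure stays far above the "before" figure even though only `df2` (100 rows) remains reachable.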