I am using Python 2.7.14. The behavior reproduces on both OSX and Linux.
I have a Python class:
import cPickle

class TestClass:
    def __init__(self, path_to_data=None):
        self.loaded_data = None
        if path_to_data:
            self.load(path_to_data)

    def load(self, path_to_data):
        # Drop any previously loaded data before reading the new file.
        self.loaded_data = None
        # Pickle files should be opened in binary mode.
        with open(path_to_data, 'rb') as f:
            self.loaded_data = cPickle.load(f)
You can create a decent-sized pickled dictionary:
>>> import cPickle
>>> d = {x:x+1 for x in range(1000000)}
>>> with open('testdict.pkl', 'wb') as f:
...     cPickle.dump(d, f)
...
And reproduce the problem like this:
>>> from test_py import TestClass
>>> import psutil
>>> import os
>>> process = psutil.Process(os.getpid())
>>> process.memory_info()
pmem(rss=8085504L, vms=4405288960L, pfaults=2154, pageins=0)
>>>
>>> t = TestClass('testdict.pkl')
>>> process.memory_info()
pmem(rss=155897856L, vms=4552028160L, pfaults=38241, pageins=0)
>>>
>>> t = TestClass('testdict.pkl')
>>> process.memory_info()
pmem(rss=255520768L, vms=4651646976L, pfaults=62563, pageins=0)
>>>
>>> del t
>>> process.memory_info()
pmem(rss=255520768L, vms=4651646976L, pfaults=62563, pageins=0)
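Why isn't the memory being garbage collected? Something else doesn't quite add up either: sys.getsizeof(t.loaded_data) returns only 50331928, but the difference in rss between the two loads (255520768 - 155897856 = 99622912) is larger than that. Is this a bug, or a feature I don't understand, and how can I avoid it?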
Thanks!
Edit
For those pointing out that cPickle may have a memory leak, here is a variant:
from marisa_trie import Trie

class TestClass:
    def __init__(self, path_to_data=None):
        self.loaded_data = None
        if path_to_data:
            self.load(path_to_data)

    def load(self, path_to_data):
        self.loaded_data = None
        self.loaded_data = Trie().load(path_to_data)
Running the script
from test_py import TestClass
import psutil
import os
import gc
process = psutil.Process(os.getpid())
print 'empty process:', process.memory_info()
t = TestClass('testtrie.trie')
print 'first load:', process.memory_info()
t = TestClass('testtrie.trie')
print 'second load:', process.memory_info()
gc.collect()
print 'after gc.collect:', process.memory_info()
prints
empty process: pmem(rss=8052736L, vms=4405383168L, pfaults=2158, pageins=134)
first load: pmem(rss=9801728L, vms=4406640640L, pfaults=2585, pageins=158)
second load: pmem(rss=11382784L, vms=4407898112L, pfaults=2971, pageins=158)
after gc.collect: pmem(rss=11382784L, vms=4407898112L, pfaults=2971, pageins=158)
(Here, testtrie.trie was built as follows:
from marisa_trie import Trie
Trie(unicode(x) for x in range(1000000)).save('testtrie.trie')
)
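A flat rss after del or gc.collect() does not by itself show that the objects are still alive, since freed memory is not necessarily handed back to the operating system. One way to probe this further is to repeat the load several times and see whether rss keeps climbing or levels off. This is a rough sketch only: rss_after_repeated_loads and its parameters are hypothetical names, and read_rss is assumed to be any zero-argument callable that returns the current RSS.

```python
import gc


def rss_after_repeated_loads(load, read_rss, repeats=5):
    """Call load() `repeats` times and record read_rss() after each round.

    Each loaded object is dropped and gc.collect() is run before the
    reading is taken, so growth across rounds is not due to live results.
    Hypothetical helper for illustration.
    """
    readings = []
    for _ in range(repeats):
        obj = load()
        del obj          # drop the only reference to the loaded object
        gc.collect()     # also clear any reference cycles
        readings.append(read_rss())
    return readings
```

With the script above, this could be called as rss_after_repeated_loads(lambda: TestClass('testtrie.trie'), lambda: process.memory_info().rss). Readings that keep rising by a similar amount each round would be more suggestive of a leak than readings that plateau after the first load or two.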