I have a small convenience class that I use a lot in my code. It looks like this:
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self
The nice thing about it is that you can access attributes using either dictionary key syntax or the usual object style:
myStructure = Structure(name="My Structure")
print myStructure["name"]
print myStructure.name
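Today I noticed that my application's memory consumption was increasing slightly in a situation where I expected it to decrease. It looks to me like instances created from the Structure class are not being collected. Here is a small snippet that illustrates this: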
import gc
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self
structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])
It produces the following output:
Structure name: __16
Structure name: __16
Structures count: 4096
As you can see, the Structure instance count is still 4096.
I then commented out the line that creates the convenient self-reference:
import gc
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        # self.__dict__ = self
structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
# print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])
Now that the circular reference is gone, the output makes sense:
Structure name: __16
Structures count: 0
I pushed the test a bit further and used Meliae to analyze the memory consumption:
import gc
import pprint
from meliae import scanner
from meliae import loader
class Structure(dict):
    def __init__(self, **kwargs):
        dict.__init__(self, **kwargs)
        self.__dict__ = self
structures = [Structure(name="__{0}".format(str(value))) for value in range(4096)]
print "Structure name: ", structures[16].name
print "Structure name: ", structures[16]["name"]
del structures
gc.collect()
print "Structures count: ", len([obj for obj in gc.get_objects() if type(obj) is Structure])
scanner.dump_all_objects("Test_001.json")
om = loader.load("Test_001.json")
summary = om.summarize()
print summary
structures = om.get_all("Structure")
if structures:
    pprint.pprint(structures[0].c)
This produces the following output:
Structure name: __16
Structure name: __16
Structures count: 4096
loading... line 5001, 5002 objs, 0.6 / 1.8 MiB read in 0.2s
loading... line 10002, 10003 objs, 1.1 / 1.8 MiB read in 0.3s
loading... line 15003, 15004 objs, 1.7 / 1.8 MiB read in 0.5s
loaded line 16405, 16406 objs, 1.8 / 1.8 MiB read in 0.5s
checked 1 / 16406 collapsed 0
checked 16405 / 16406 collapsed 157
compute parents 0 / 16249
compute parents 16248 / 16249
set parents 16248 / 16249
collapsed in 0.2s
Total 16249 objects, 58 types, Total size = 3.2MiB (3306183 bytes)
Index Count % Size % Cum Max Kind
0 4096 25 1212416 36 36 296 Structure
1 390 2 536976 16 52 49432 dict
2 5135 31 417550 12 65 12479 str
3 82 0 290976 8 74 12624 module
4 235 1 212440 6 80 904 type
5 947 5 121216 3 84 128 code
6 1008 6 120960 3 88 120 function
7 1048 6 83840 2 90 80 wrapper_descriptor
8 654 4 47088 1 92 72 builtin_function_or_method
9 562 3 40464 1 93 72 method_descriptor
10 517 3 37008 1 94 216 tuple
11 139 0 35832 1 95 2280 set
12 351 2 30888 0 96 88 weakref
13 186 1 23200 0 97 1664 list
14 63 0 21672 0 97 344 WeakSet
15 21 0 18984 0 98 904 ABCMeta
16 197 1 14184 0 98 72 member_descriptor
17 188 1 13536 0 99 72 getset_descriptor
18 284 1 6816 0 99 24 int
19 14 0 5296 0 99 2280 frozenset
[Structure(4312707312 296B 2refs 2par),
type(4298634592 904B 4refs 100par 'Structure')]
Memory usage is 3.2MiB. Removing the self-reference line results in the following output:
Structure name: __16
Structures count: 0
loading... line 5001, 5002 objs, 0.6 / 1.4 MiB read in 0.1s
loading... line 10002, 10003 objs, 1.1 / 1.4 MiB read in 0.3s
loaded line 12308, 12309 objs, 1.4 / 1.4 MiB read in 0.4s
checked 12 / 12309 collapsed 0
checked 12308 / 12309 collapsed 157
compute parents 0 / 12152
compute parents 12151 / 12152
set parents 12151 / 12152
collapsed in 0.1s
Total 12152 objects, 57 types, Total size = 2.0MiB (2093714 bytes)
Index Count % Size % Cum Max Kind
0 390 3 536976 25 25 49432 dict
1 5134 42 417497 19 45 12479 str
2 82 0 290976 13 59 12624 module
3 235 1 212440 10 69 904 type
4 947 7 121216 5 75 128 code
5 1008 8 120960 5 81 120 function
6 1048 8 83840 4 85 80 wrapper_descriptor
7 654 5 47088 2 87 72 builtin_function_or_method
8 562 4 40464 1 89 72 method_descriptor
9 517 4 37008 1 91 216 tuple
10 139 1 35832 1 92 2280 set
11 351 2 30888 1 94 88 weakref
12 186 1 23200 1 95 1664 list
13 63 0 21672 1 96 344 WeakSet
14 21 0 18984 0 97 904 ABCMeta
15 197 1 14184 0 98 72 member_descriptor
16 188 1 13536 0 98 72 getset_descriptor
17 284 2 6816 0 99 24 int
18 14 0 5296 0 99 2280 frozenset
19 22 0 2288 0 99 104 classobj
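This confirms that the Structure instances have been destroyed and that memory usage has dropped to 2.0MiB.
Any idea how I can make sure this class gets properly garbage collected? By the way, all of this was run on Python 2.7.2 (Darwin).
Cheers,
Thomas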
Answer 0 (score: 3)
You can implement your Structure class more directly using __getattr__ and __setattr__, so that attribute access goes to the underlying dictionary.
class Structure(dict):
    def __getattr__(self, k):
        return self[k]
    def __setattr__(self, k, v):
        self[k] = v
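A slightly fuller sketch of the same idea follows. It is not part of the class above: the AttributeError translation, the __delattr__ method, and the freed_without_gc helper are additions made here for illustration. Translating KeyError into AttributeError is assumed to be desirable so that getattr(obj, name, default) keeps working. The weakref check assumes CPython, where an object with no cycles is released by reference counting alone.

```python
import weakref


class Structure(dict):
    """dict whose keys can also be read, written and deleted as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() still win over a key that happens to be named "keys".
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name)


def freed_without_gc():
    """Hypothetical helper: True if an instance dies as soon as it is unreferenced."""
    instance = Structure(name="temporary")
    ref = weakref.ref(instance)
    del instance  # no self-reference, so the reference count drops to 0 here
    return ref() is None


my_structure = Structure(name="My Structure")
my_structure.size = 3  # stored as my_structure["size"]
```

Because the instance never refers to itself, nothing is left for the cycle collector to find, and freed_without_gc() should report True without any call to gc.collect().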
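Cycles are garbage collected in Python, but only periodically (unlike ordinary reference-counted objects, which are collected as soon as their reference count drops to 0).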
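The difference can be observed with a weak reference. The snippet below is a minimal sketch that assumes CPython; the Node class and the is_alive helper are made up for the demonstration. Automatic collection is switched off temporarily so that the timing does not depend on when the collector happens to run.

```python
import gc
import weakref


class Node(object):
    """Plain object used only to hold references (hypothetical)."""


def is_alive(ref):
    # A weak reference returns None once its target has been destroyed.
    return ref() is not None


gc.disable()  # keep the cycle collector from running on its own
try:
    # No cycle: the object is destroyed when its last reference goes away.
    plain = Node()
    plain_ref = weakref.ref(plain)
    del plain
    plain_freed_immediately = not is_alive(plain_ref)

    # Self-reference: the reference count never reaches 0 by itself.
    cyclic = Node()
    cyclic.me = cyclic
    cyclic_ref = weakref.ref(cyclic)
    del cyclic
    cyclic_alive_before_collect = is_alive(cyclic_ref)

    gc.collect()  # the cycle collector finds and frees the unreachable cycle
    cyclic_alive_after_collect = is_alive(cyclic_ref)
finally:
    gc.enable()
```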
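Avoiding the cycle (by using __getattr__ and __setattr__ for the Structure class) means you will get better gc behavior. You may also want to look at collections.namedtuple as an alternative: it does not do exactly what your implementation does, but it may suit your purpose.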
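For comparison, here is a small sketch of what the namedtuple route might look like. The type name StructureTuple and the size field are invented for the example. Note the trade-offs: fields are fixed when the type is declared, instances are immutable, and lookup is by attribute or position rather than by string key.

```python
from collections import namedtuple

# Hypothetical record type with a fixed set of fields.
StructureTuple = namedtuple("StructureTuple", ["name", "size"])

item = StructureTuple(name="My Structure", size=3)

by_attribute = item.name   # object-style access
by_position = item[0]      # tuple-style access, not item["name"]
as_mapping = item._asdict()            # mapping view of the fields
renamed = item._replace(name="Other")  # returns a new tuple, item is unchanged

try:
    item.name = "changed"  # namedtuple instances are immutable
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```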