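I have a long-running background process that parses a CSV file of several hundred thousand rows. I've noticed that the process leaks memory, which occasionally causes the task to hit its soft memory limit and be terminated. I've narrowed the offending code down to the following block: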
class BaseModel(db.Model):
    _keyNamespace = 'MyApp.Models'

    @classmethod
    def get_by_item_id(cls, id):
        key = "%s_%d" % (cls._keyNamespace, id)
        item = CacheStrategy.get(key)
        if not item:
            query = cls.gql("WHERE Id = :1", id)
            item = query.get()
            del query
        return item
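I've pared this down to the bare bones, but it still causes Query objects to be retained in memory. A sample GC reference dump is included at the end of this post; it shows the Query and Query_Filter counts growing by 200 after each 200-order batch step. If I remove the query call, the growth goes away, of course.
My question is: why is this Query reference leaking, and how do I get it to respect the del and release the query reference?
I've also tried this as an instance method (no difference). The reference count trace follows: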
INFO 2011-10-17 16:29:39,158 orderparser.py:151] Putting a 200 unit batch of orders, 0.335000 seconds from start
DEBUG 2011-10-17 16:29:40,315 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356306 Property
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356305 PropertyValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 217 Query
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 209 Query_Filter
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:40,336 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:40,644 orderparser.py:151] Putting a 200 unit batch of orders, 1.821000 seconds from start
DEBUG 2011-10-17 16:29:41,930 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356506 Property
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356505 PropertyValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 417 Query
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 409 Query_Filter
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:41,953 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:42,276 orderparser.py:151] Putting a 200 unit batch of orders, 3.450000 seconds from start
DEBUG 2011-10-17 16:29:43,565 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356706 Property
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356705 PropertyValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 617 Query
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 609 Query_Filter
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:43,588 memleaker.py:22] 18 CompositeIndex
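For context, per-type growth like the Query and Query_Filter lines in this trace can be measured with Python's standard gc module. The following is a minimal sketch, not the memleaker.py that produced the output above. FakeQuery and the leaked list are hypothetical stand-ins for the datastore Query class and for whatever is holding on to it, and the sketch assumes Python 2.7 or later for collections.Counter.

```python
import gc
from collections import Counter


def count_instances():
    """Return a Counter mapping type name -> number of live gc-tracked objects."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after, top=5):
    """Return (type name, increase) pairs for the types that grew the most."""
    diff = after - before  # Counter subtraction keeps only positive differences
    return diff.most_common(top)


class FakeQuery(object):
    """Hypothetical stand-in for a datastore Query object."""

    def __init__(self):
        self.filters = []


leaked = []  # simulates something that keeps a reference to every query

before = count_instances()
for _ in range(200):
    leaked.append(FakeQuery())
after = count_instances()

# One "batch" of 200 retained queries shows up as a growth of 200 for FakeQuery.
report = growth(before, after, top=None)
```

Comparing snapshots taken before and after a batch, rather than reading absolute counts, makes it easier to see which types grow with each batch and which are simply large but stable.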
Answer 0 (score: 1)
I can't reproduce this, either with the code you quoted or with the simple snippet below (on shell.appspot.com or in a fresh app):
from google.appengine.ext import db
import logging
import sys
import types

def get_refcounts():
    d = {}
    # collect all classes
    for m in sys.modules.values():
        for sym in dir(m):
            o = getattr(m, sym)
            if type(o) is types.ClassType:
                d[o] = sys.getrefcount(o)
    # sort by refcount
    pairs = map(lambda x: (x[1], x[0]), d.items())
    pairs.sort()
    pairs.reverse()
    return pairs

def print_top(num=15):
    print 'Top Mem Leaks'
    for n, c in get_refcounts()[:num]:
        print '%10d %s' % (n, c.__name__)

class TestModel(db.Model):
    id = db.IntegerProperty()

print_top()
q = TestModel.gql("WHERE id = :1", 1)
item = q.get()
del q
print_top()
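It looks like something in your environment may be holding references to the queries that have been executed. Are you using appstats or any other development or debugging tools? Can you create a minimal reproduction case that exhibits the behavior you're observing?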
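One way to check whether something else is holding a reference is to ask the garbage collector which objects refer to a surviving instance. This is a minimal sketch using only the standard gc module. FakeQuery and FakeRecorder are hypothetical stand-ins (for a datastore Query and for a debugging hook that records queries), not real App Engine classes, and nothing here is meant to describe how appstats actually works.

```python
import gc


class FakeQuery(object):
    """Hypothetical stand-in for a datastore Query object."""


class FakeRecorder(object):
    """Hypothetical debugging hook that remembers every query it sees."""

    def __init__(self):
        self.seen = []

    def record(self, query):
        self.seen.append(query)


def describe_referrers(obj):
    """Return the sorted type names of the objects that refer to obj."""
    return sorted(type(r).__name__ for r in gc.get_referrers(obj))


recorder = FakeRecorder()
q = FakeQuery()
recorder.record(q)
del q  # removes only the local name; the recorder still holds the object

# The instance survives the del because recorder.seen still refers to it.
survivors = [o for o in gc.get_objects() if type(o) is FakeQuery]
holders = describe_referrers(survivors[0])
held_by_recorder = any(r is recorder.seen
                       for r in gc.get_referrers(survivors[0]))
```

Note that del only unbinds a name. The object is freed once no references remain, so a list, cache, or hook elsewhere that still points at the query keeps it alive no matter how many local names are deleted.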