Memory leak in GQL query objects

Asked: 2011-10-17 16:39:42

Tags: python google-app-engine memory-leaks gql

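I have a long-running background process that parses a few hundred thousand lines of CSV. I've noticed that the process leaks memory, which occasionally causes the task to hit its soft memory limit and be terminated. I've narrowed the problem down to the following block of code: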

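from google.appengine.ext import db

class BaseModel(db.Model):
    _keyNamespace = 'MyApp.Models'

    @classmethod
    def get_by_item_id(cls, id):
        # CacheStrategy is my own cache wrapper (definition not shown here).
        key = "%s_%d" % (cls._keyNamespace, id)
        item = CacheStrategy.get(key)
        if not item:
            # Cache miss: fall back to a datastore query.
            query = cls.gql("WHERE Id = :1", id)
            item = query.get()
            del query  # drop the local reference to the query object

        return item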

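I have cut this down to the bare bones, but it still causes Query objects to stay in memory. A sample GC reference dump is included at the end of this post; it shows the Query and Query_Filter counts increasing by 200 after each batch step of 200 orders. If I remove the query call, the growth of course disappears.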

My question is: why is this query reference leaking, and how can I get it to honor the del and release the query reference?

I have also tried this as an instance method (no difference). The reference count trace follows:

INFO     2011-10-17 16:29:39,158 orderparser.py:151] Putting a 200 unit batch of orders, 0.335000 seconds from start
DEBUG    2011-10-17 16:29:40,315 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]     356306 Property
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]     356305 PropertyValue
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:40,334 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        217 Query
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]        209 Query_Filter
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:40,335 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:40,336 memleaker.py:22]         18 CompositeIndex
INFO     2011-10-17 16:29:40,644 orderparser.py:151] Putting a 200 unit batch of orders, 1.821000 seconds from start
DEBUG    2011-10-17 16:29:41,930 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]     356506 Property
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]     356505 PropertyValue
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:41,948 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        417 Query
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        409 Query_Filter
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:41,951 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:41,953 memleaker.py:22]         18 CompositeIndex
INFO     2011-10-17 16:29:42,276 orderparser.py:151] Putting a 200 unit batch of orders, 3.450000 seconds from start
DEBUG    2011-10-17 16:29:43,565 memleaker.py:20] Top Mem Leaks
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]     356706 Property
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]     356705 PropertyValue
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      74410 Path
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      74408 Path_Element
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      45127 PropertyValue_ReferenceValue
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      45127 PropertyValue_ReferenceValuePathElement
DEBUG    2011-10-17 16:29:43,585 memleaker.py:22]      43822 Reference
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]      30595 EntityProto
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        617 Query
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        609 Query_Filter
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]        320 ProtocolMessage
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         55 NOT_PROVIDED
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         34 Index_Property
DEBUG    2011-10-17 16:29:43,586 memleaker.py:22]         28 ExtendableProtocolMessage
DEBUG    2011-10-17 16:29:43,588 memleaker.py:22]         18 CompositeIndex
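
For readers who want to produce a similar per-batch tally, here is a minimal sketch. It is not the memleaker.py used above, which is not shown; the names `snapshot` and `growth` are made up for illustration. It is written for current CPython rather than the Python 2 App Engine runtime, and it counts live, gc-tracked instances per type name instead of class refcounts.

```python
import gc
from collections import Counter


def snapshot():
    """Count gc-tracked objects by type name at this moment."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, top=15):
    """Return the type names whose counts grew between two snapshots.

    Counter subtraction keeps only positive differences, so types whose
    counts stayed flat or shrank are dropped.
    """
    return (after - before).most_common(top)
```

Taking a snapshot before and after each batch and logging `growth(before, after)` would highlight only the types that keep accumulating, which is easier to read than a full dump.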

1 answer:

Answer 0 (score: 1)

I can't reproduce this with the code you quoted and the simple snippet below (either on shell.appspot.com or in a fresh app):

from google.appengine.ext import db
import logging
import sys
import types

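# Python 2 only: relies on types.ClassType (old-style classes) and the
# print statement.
def get_refcounts():
    d = {}
    # Collect the refcount of every old-style class reachable from a loaded
    # module. Each live instance holds a reference to its class, so a
    # class's refcount grows with the number of its instances.
    for m in sys.modules.values():
        for sym in dir(m):
            o = getattr(m, sym)
            if type(o) is types.ClassType:
                d[o] = sys.getrefcount(o)
    # Sort by refcount, highest first.
    pairs = map(lambda x: (x[1], x[0]), d.items())
    pairs.sort()
    pairs.reverse()
    return pairs

def print_top(num=15):
    print 'Top Mem Leaks'
    for n, c in get_refcounts()[:num]:
        print '%10d %s' % (n, c.__name__)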

class TestModel(db.Model):
  id = db.IntegerProperty()


print_top()

q = TestModel.gql("WHERE id = :1", 1)
item = q.get()
del q

print_top()

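It seems likely that something in your environment is holding references to executed queries. Are you using appstats or any other development or debugging tools? Can you create a minimal reproduction case that demonstrates the behavior you're seeing?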