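Does handling POST requests with Twisted + Cyclone + PyPy cause a memory leak?

Date: 2014-01-11 15:04:23

Tags: memory-leaks twisted pypy cyclone

After a lot of investigation, I found that a memory leak appears after serving a few hundred thousand HTTP POST requests. The strange part is that the leak only occurs when running on PyPy.

Here is a sample server: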

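from twisted.internet import reactor

# Set to True to serve the same handler with Tornado instead of Cyclone.
do_tornado = False
port = 8888

if do_tornado:
    # Import Tornado only when it is used, so the Cyclone run does not need it.
    import tornado.ioloop
    from tornado.web import RequestHandler, Application
else:
    from cyclone.web import RequestHandler, Application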

class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")

    def post(self):
        self.write("Hello, world")

if __name__ == "__main__":
    routes = [(r"/", MainHandler)]
    application = Application(routes)

    print(port)
    if do_tornado:
        application.listen(port)
        tornado.ioloop.IOLoop.instance().start()
    else:
        reactor.listenTCP(port, application)
        reactor.run()

Here is the test client I use to generate the requests:

from twisted.internet import reactor, defer
from twisted.internet.task import LoopingCall

from twisted.web.client import Agent, HTTPConnectionPool
from twisted.web.iweb import IBodyProducer

from zope.interface import implements

pool = HTTPConnectionPool(reactor, persistent=True)
pool.retryAutomatically = False
pool.maxPersistentPerHost = 10
agent = Agent(reactor, pool=pool)

bid_url = 'http://localhost:8888'

class StringProducer(object):
    implements(IBodyProducer)

    def __init__(self, body):
        self.body = body
        self.length = len(body)

    def startProducing(self, consumer):
        consumer.write(self.body)
        return defer.succeed(None)

    def pauseProducing(self):
        pass

    def stopProducing(self):
        pass


def callback(a):
    pass

def error_callback(error):
    pass

def loop():
    d = agent.request('POST', bid_url, None, StringProducer("Hello, world"))
    #d = agent.request('GET', bid_url)
    d.addCallback(callback).addErrback(error_callback)


def main():
    exchange = LoopingCall(loop)
    exchange.start(0.02)

    #log.startLogging(sys.stdout)
    reactor.run()

main()

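Note that this code does not leak on CPython, and it does not leak with Tornado on PyPy either! It leaks only with Twisted (Cyclone) on PyPy, and only for POST requests.

To see the leak, you have to send a few hundred thousand requests.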
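
One way to watch the leak while the client runs is to sample the server's resident memory from outside the process. The helper below is only a sketch and not part of the original test setup: the name read_rss_kb is hypothetical, and it assumes Linux, where /proc/<pid>/status exposes a VmRSS field.

```python
import os


def read_rss_kb(pid):
    """Return the resident set size of `pid` in kB, or None if unavailable.

    Reads the VmRSS field from /proc/<pid>/status, which exists only on Linux.
    """
    try:
        with open("/proc/%d/status" % pid) as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except (IOError, OSError):
        pass
    return None


if __name__ == "__main__":
    # Reports on the current process; pass the server's PID to watch the server.
    print(read_rss_kb(os.getpid()))
```

Calling it every few seconds with the server's PID should show a steadily rising value if the leak is present, and a roughly flat one otherwise.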

Also note that when PYPY_GC_MAX is set, the process eventually crashes.

What is going on?

1 answer:

Answer 0: (score: 1)

It turns out that the leak is caused by BytesIO (from the io module).

Here is how to reproduce the leak on PyPy:

from io import BytesIO
while True: a = BytesIO()
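
The loop above never terminates, so a bounded variant can be easier to compare between interpreters. The following is a sketch under assumptions rather than the original reproduction: the names peak_rss and measure_growth are made up for illustration, and the resource module it relies on is only available on Unix-like systems.

```python
import resource
from io import BytesIO


def peak_rss():
    """Return the peak resident set size of this process.

    Units depend on the platform (kilobytes on Linux, bytes on macOS).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(iterations=1000000):
    """Create and drop `iterations` BytesIO objects; return peak RSS growth."""
    before = peak_rss()
    for _ in range(iterations):
        BytesIO()  # created and immediately discarded
    return peak_rss() - before


if __name__ == "__main__":
    print("peak RSS growth: %d" % measure_growth())
```

Running the same script under CPython and under an affected PyPy build should make the difference visible: the growth stays small when the objects are reclaimed and keeps increasing with the iteration count when they are not.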

Here is the fix: https://bitbucket.org/pypy/pypy/commits/40fa4f3a0740e3aac77862fe8a853259c07cb00b