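After a lot of investigation, I found that a memory leak appears after serving several hundred thousand HTTP POST requests. Strangely, the leak only occurs when running under PyPy.
Here is the sample server code: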
from twisted.internet import reactor
import tornado.ioloop

# Set to True to serve with Tornado instead of cyclone (Twisted).
do_tornado = False
port = 8888

if do_tornado:
    from tornado.web import RequestHandler, Application
else:
    from cyclone.web import RequestHandler, Application


class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")

    def post(self):
        self.write("Hello, world")


if __name__ == "__main__":
    routes = [(r"/", MainHandler)]
    application = Application(routes)
    print port
    if do_tornado:
        application.listen(port)
        tornado.ioloop.IOLoop.instance().start()
    else:
        reactor.listenTCP(port, application)
        reactor.run()
Here is the test code I use to generate the requests:
from twisted.internet import reactor, defer
from twisted.internet.task import LoopingCall
from twisted.web.client import Agent, HTTPConnectionPool
from twisted.web.iweb import IBodyProducer
from zope.interface import implements

# Reuse up to 10 persistent connections to the server under test.
pool = HTTPConnectionPool(reactor, persistent=True)
pool.retryAutomatically = False
pool.maxPersistentPerHost = 10
agent = Agent(reactor, pool=pool)
bid_url = 'http://localhost:8888'


class StringProducer(object):
    """Body producer that writes a fixed string as the request body."""
    implements(IBodyProducer)

    def __init__(self, body):
        self.body = body
        self.length = len(body)

    def startProducing(self, consumer):
        consumer.write(self.body)
        return defer.succeed(None)

    def pauseProducing(self):
        pass

    def stopProducing(self):
        pass


def callback(a):
    pass


def error_callback(error):
    pass


def loop():
    # Send one POST request; swap in the GET line to compare behaviour.
    d = agent.request('POST', bid_url, None, StringProducer("Hello, world"))
    #d = agent.request('GET', bid_url)
    d.addCallback(callback).addErrback(error_callback)


def main():
    # Fire a request every 0.02 seconds.
    exchange = LoopingCall(loop)
    exchange.start(0.02)
    #log.startLogging(sys.stdout)
    reactor.run()


main()
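Note that this code does not leak under CPython, and it does not leak with Tornado on PyPy either! It leaks only when Twisted is used with PyPy, and only with POST requests.
To see the leak, you have to send several hundred thousand requests.
Note that when PYPY_GC_MAX is set, the process eventually crashes.
What is going on?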
Answer 0 (score: 1)
It turns out that the cause of the leak is the BytesIO class in the io module.
Here is how to reproduce the leak on PyPy:
from io import BytesIO
while True: a = BytesIO()
Here is the fix: https://bitbucket.org/pypy/pypy/commits/40fa4f3a0740e3aac77862fe8a853259c07cb00b
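To check whether a given interpreter build still shows the growth, the endless loop in the reproduction can be turned into a bounded, measurable one. The following is only a minimal sketch: the helper names peak_rss and sample_bytesio_growth and the round sizes are made up for illustration, and it relies on the Unix-only resource module from the standard library. On an interpreter without the leak the reported peak should presumably level off after the first round or two, while on an affected PyPy build it would be expected to keep climbing.

```python
import gc
import resource
from io import BytesIO


def peak_rss():
    """Return the peak resident set size of this process so far.

    The unit is platform dependent (kilobytes on Linux, bytes on macOS).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_bytesio_growth(rounds=5, per_round=100000):
    """Create and drop BytesIO objects, recording peak RSS after each round.

    Both parameters are arbitrary illustrative values, not tuned thresholds.
    """
    samples = []
    for _ in range(rounds):
        for _ in range(per_round):
            BytesIO()  # created and dropped immediately, as in the reproduction
        gc.collect()  # give the collector a chance to reclaim the objects
        samples.append(peak_rss())
    return samples


if __name__ == "__main__":
    for i, rss in enumerate(sample_bytesio_growth()):
        print("round %d: peak RSS = %d" % (i, rss))
```

Because ru_maxrss is a high-water mark, the samples can never decrease; what matters is whether they stop increasing once the process has warmed up.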