Question

如何扩展我的Twisted服务器以处理数以万计的并发SSL套接字连接？

前几百个客户端连接速度相对较快，但随着计数接近3000，它开始爬行，每秒约2个连接。

我使用以下循环进行负载测试：

clients =  []

for i in xrange(connections):
    print i
    clients.append(
        ssl.wrap_socket(
            socket.socket(socket.AF_INET, socket.SOCK_STREAM),
            ca_certs="server.crt",
            cert_reqs=ssl.CERT_REQUIRED
        )
    )

    clients[i].connect(('localhost', 9999))

CPROFILE：

         296644049 function calls (296407530 primitive calls) in 3070.656 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001 3070.656 3070.656 server.py:7(<module>)
        1    0.000    0.000 3070.408 3070.408 server.py:148(main)
        1    0.000    0.000 3070.406 3070.406 server.py:106(run)
        1    0.000    0.000 3070.405 3070.405 base.py:1190(run)
        1    0.047    0.047 3070.404 3070.404 base.py:1195(mainLoop)
    34383    0.090    0.000 3070.263    0.089 epollreactor.py:367(doPoll)
    38696    0.064    0.000 3066.883    0.079 log.py:75(callWithLogger)
    38696    0.077    0.000 3066.797    0.079 log.py:70(callWithContext)
    38696    0.035    0.000 3066.598    0.079 context.py:117(callWithContext)
    38696    0.056    0.000 3066.556    0.079 context.py:61(callWithContext)
    38695    0.093    0.000 3066.486    0.079 posixbase.py:572(_doReadOrWrite)
     8599 1249.585    0.145 3019.333    0.351 protocol.py:114(getClientsDict)
 37582010 1681.445    0.000 1681.445    0.000 {method 'items' of 'dict' objects}
    21496    0.114    0.000 1535.798    0.071 tls.py:346(_flushReceiveBIO)
    21496    0.026    0.000 1535.793    0.071 tcp.py:199(doRead)
    21496    0.017    0.000 1535.718    0.071 tcp.py:218(_dataReceived)
    17197    0.033    0.000 1535.701    0.089 tls.py:400(dataReceived)
     8597    0.009    0.000 1531.480    0.178 policies.py:119(dataReceived)
     8597    0.078    0.000 1531.471    0.178 protocol.py:65(dataReceived)
     4300    0.029    0.000 1525.117    0.355 posixbase.py:242(_disconnectSelectable)
     4300    0.030    0.000 1524.922    0.355 tcp.py:283(connectionLost)
     4300    0.024    0.000 1524.659    0.355 tls.py:463(connectionLost)
     4300    0.010    0.000 1524.492    0.355 policies.py:123(connectionLost)
     4300    0.119    0.000 1524.471    0.355 protocol.py:50(connectionLost)
     4299    0.027    0.000 1523.698    0.354 tcp.py:270(readConnectionLost)
     4299    0.135    0.000 1520.228    0.354 protocol.py:88(handleInitialState)
 74840519   31.487    0.000   44.916    0.000 __init__.py:348(__getattr__)

反应堆运行代码：

def run(self):
    contextFactory = ssl.DefaultOpenSSLContextFactory(self._key, self._cert)
    reactor.listenSSL(self._port, BrakersFactory(), contextFactory)
    reactor.run()

Answer 1

鉴于问题中缺少代码，我将一些人放在一起，看看我是否体验过你所说的效果。从这个实验中，我要说的第一件事是检查并查看脚本运行时机器上的内存利用情况。

我开了一个标准的谷歌云计算系统（1个vCPU，3.8GB ram）（debian backports wheezy，apt-get update; apt-get install python-twisted）并运行以下（糟糕的黑客）代码：

（注意：要运行此操作，我需要为客户端和服务器shell执行ulimit -n 4096，否则我将开始“太多打开文件”I.E. Socket accept - "Too many open files"）

serv.py

#!/usr/bin/python

from twisted.internet import ssl, reactor
from twisted.internet.protocol import ServerFactory, Protocol

class Echo(Protocol):
    def connectionMade(self):
        self.factory.clients.append(self)
        print "Currently %d open connections.\n" % len(self.factory.clients)

    def connectionLost(self, reason):
        self.factory.clients.remove(self)
        print "Lost connection"

    def dataReceived(self, data):
        """As soon as any data is received, write it back."""
        self.transport.write(data)

class MyServerFactory(ServerFactory):
    protocol = Echo

    def __init__(self):
        self.clients = []



if __name__ == '__main__':
    factory = MyServerFactory()
    reactor.listenSSL(8000, factory,
                      ssl.DefaultOpenSSLContextFactory(
            'keys/server.key', 'keys/server.crt'))
    reactor.run()

cli.py

#!/usr/bin/python

from twisted.internet import ssl, reactor
from twisted.internet.protocol import ClientFactory, Protocol

class EchoClient(Protocol):
    def connectionMade(self):
        print "hello, world"
        # The following delay is there because as soon as the write
        # happens the server will close the connection
        reactor.callLater(60, self.transport.write, "hello, world!")

    def dataReceived(self, data):
        print "Server said:", data
        self.transport.loseConnection()

class EchoClientFactory(ClientFactory):
    protocol = EchoClient

    def __init__(self):
        self.stopping = False

    def clientConnectionFailed(self, connector, reason):
        print "Connection failed - reason ",  reason
        if not self.stopping:
              self.stopping = True
              reactor.callLater(10,reactor.stop)

    def clientConnectionLost(self, connector, reason):
        print "Connection lost - goodbye!"
        if not self.stopping:
              self.stopping = True
              reactor.callLater(10,reactor.stop)

if __name__ == '__main__':
    connections = 4000
    factory = EchoClientFactory()
    for i in xrange(connections):
          # the following could certainly be done more elegantly, but I believe
          # its a legit use, and given the list in finite, shouldn't be too
          # resource intensive of a use... ?
          reactor.callLater(i/float(400), reactor.connectSSL,'xx.xx.xx.xx', 8000, factory, ssl.ClientContextFactory())
    reactor.run()

一旦运行，并且越过了2544个连接，我的机器严重堵塞，因此很难从中收集数据，但鉴于新的ssh'es以'/ bin / bash返回：无法分配内存'，以及当我确实得到我的serv.py有2克res，而客户有1.4克时，我认为可以说我吹了ram。

鉴于上面的代码只是一个快速的黑客，我可能有突出的错误导致内存问题 - 虽然我认为我会提供这个想法，因为导致你的机器交换肯定是导致你的应用程序爬行的好方法。（也许你和我有同样的错误）

（顺便说一句，聪明的扭曲的人在那里，我欢迎评论我做错了那就是燃烧了这么多公羊）

Answer 2

我设法确定协议减速的原因。

从上面的cProfile可以看出，大部分的tottime花费在getClientDict（）方法中：

         296644049 function calls (296407530 primitive calls) in 3070.656 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     8599 1249.585    0.145 3019.333    0.351 protocol.py:114(getClientsDict)
 37582010 1681.445    0.000 1681.445    0.000 {method 'items' of 'dict' objects}

以下代码导致此问题：

def getClientsDict(self):
    rc = {1: {}, 2: {}}

    for r in self.factory._clients[1]:
        rc[1] = dict(rc[1].items() +
                                  {r.getDict[1]['id']:
                                       r.getDict[1][
                                           'address']}.items())
    for m in self.factory._clients[2]:
        rc[2] = dict(rc[2].items() +
                                 {m.getDict[2]['id']:
                                      m.getDict[2][
                                          'address']}.items())
    return rc

Twisted SSL套接字连接速度减慢

2 个答案: