carbon-relay eating CPU - EAGAIN (Resource temporarily unavailable)?

Time: 2013-11-26 02:44:36

Tags: python networking ubuntu-12.04 twisted graphite

  • Python 2.7.3
  • [twisted,version 13.1.0]
  • Xen domU

atop shows that carbon-relay is eating 80-90% of user CPU (USRCPU). From strace:

accept(7, {sa_family=AF_INET, sin_port=htons(60649), sin_addr=inet_addr("192.237.222.81")}, [16]) = 257
accept(7, {sa_family=AF_INET, sin_port=htons(51564), sin_addr=inet_addr("166.78.1.48")}, [16]) = 257
accept(7, 0x7ffff4679550, [16])         = -1 EAGAIN (Resource temporarily unavailable)
accept(7, {sa_family=AF_INET, sin_port=htons(33654), sin_addr=inet_addr("198.61.194.248")}, [16]) = 257
accept(7, {sa_family=AF_INET, sin_port=htons(50037), sin_addr=inet_addr("166.78.181.204")}, [16]) = 257
accept(7, 0x7ffff4679550, [16])         = -1 EAGAIN (Resource temporarily unavailable)

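The strange thing: even after restarting the service, every time I run strace it seems to be stuck on fd 7. Does that mean this fd is not being cleaned up properly?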
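For reference, in `accept(7, ...)` the first argument is the listening socket's descriptor, and the return value (257 above) is the descriptor of the newly accepted connection. The sketch below is only an illustration of how a non-blocking listening socket reports `EAGAIN` once its queue of pending connections is empty; it is not taken from carbon or Twisted. The helper name `drain_accept`, the loopback address, and the limit of 64 accepts per pass are assumptions made for the example.

```python
import errno
import select
import socket


def drain_accept(listener, max_accepts=64):
    """Accept pending connections until the kernel reports none are left.

    Returns (accepted_sockets, saw_eagain). max_accepts is a hypothetical
    cap chosen for this demo.
    """
    accepted = []
    for _ in range(max_accepts):
        try:
            conn, _addr = listener.accept()
        except socket.error as exc:
            # On a non-blocking socket an empty queue is reported as
            # EAGAIN/EWOULDBLOCK rather than by blocking the caller.
            if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                return accepted, True
            raise
        accepted.append(conn)
    return accepted, False


# Listening socket on an ephemeral loopback port (assumption for the demo).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(16)
listener.setblocking(False)

# Two short-lived clients, loosely mirroring the accept/accept/EAGAIN pattern.
clients = [socket.create_connection(listener.getsockname()) for _ in range(2)]

# Wait until the listener is readable, then drain it in one pass.
select.select([listener], [], [], 1.0)
accepted, saw_eagain = drain_accept(listener)
print("accepted=%d saw_eagain=%s listener_fd=%d"
      % (len(accepted), saw_eagain, listener.fileno()))

for sock in accepted + clients + [listener]:
    sock.close()
```

In this sketch the listener's descriptor is the same on every `accept` call, and each drain pass ends with one failed call that returns `EAGAIN`. Whether that pattern accounts for the CPU usage described here is not something the sketch can establish.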

I increased the open-files limit:

/proc/2891/limits

Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             15834                15834                processes 
Max open files            16384                16384                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       15834                15834                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        

After that, CPU usage dropped to ~50%.
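To confirm that a new limit actually applies to the running process, the `/proc/<pid>/limits` table can be parsed programmatically. The following is a minimal sketch; the function name `parse_limits` is hypothetical, and the sample text is a subset of the table above.

```python
import re


def parse_limits(text):
    """Parse /proc/<pid>/limits text into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines():
        # Columns are separated by runs of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) < 3 or parts[0] == "Limit":
            continue  # skip blank lines and the header row
        units = parts[3] if len(parts) > 3 else ""
        limits[parts[0]] = (parts[1], parts[2], units)
    return limits


SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units     
Max stack size            8388608              unlimited            bytes     
Max open files            16384                16384                files     
Max nice priority         0                    0                    
"""

limits = parse_limits(SAMPLE)
print(limits["Max open files"])
```

On a Linux host the same function can be fed the contents of `/proc/2891/limits` (or `/proc/self/limits`) read with `open(...).read()`.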

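My problem looks similar to this thread, but since we only have a few sockets in the TIME_WAIT state, I don't think enabling tw_recycle would help. As for tcp_syncookies, I don't see any related messages in syslog.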
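One way to put a number on "a few sockets in TIME_WAIT" is to tally the `st` column of `/proc/net/tcp`, where the hex code `06` stands for TIME_WAIT. The sketch below assumes that file's usual whitespace-separated layout; the function name `count_tcp_states` is hypothetical and the sample rows are invented for illustration, not captured from the host in question.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Data rows start with a slot number such as "0:"; the header does not.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        counts[TCP_STATES.get(fields[3].upper(), fields[3])] += 1
    return counts


# Invented sample rows (addresses, ports and inodes are made up).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:07D4 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:07D4 0100007F:C001 01 00000000:00000000 00:00000000 00000000     0        0 1002
   2: 0100007F:07D4 0100007F:C002 06 00000000:00000000 03:00000E10 00000000     0        0 0
   3: 0100007F:07D4 0100007F:C003 06 00000000:00000000 03:00000E10 00000000     0        0 0
"""

counts = count_tcp_states(SAMPLE)
print(dict(counts))
```

On a live Linux system the input would be `open("/proc/net/tcp").read()`.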

This is what I get when I try to start carbon-relay in debug mode:

26/11/2013 02:22:14 :: [listener] MetricPickleReceiver connection with 50.56.249.127:48772 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
26/11/2013 02:22:14 :: [listener] MetricPickleReceiver connection with 198.101.241.101:50672 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.
26/11/2013 02:22:14 :: [listener] MetricPickleReceiver connection with 166.78.2.167:43346 lost: Connection to the other side was lost in a non-clean fashion: Connection lost.

The message comes from this class in twisted:

class ConnectionLost(ConnectionClosed):
    """Connection to the other side was lost in a non-clean fashion"""

    def __str__(self):
        s = self.__doc__.strip().splitlines()[0]
        if self.args:
            s = '%s: %s' % (s, ' '.join(self.args))
        s = '%s.' % s
        return s
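To show how this `__str__` produces the text seen in the log lines, here is a small self-contained reproduction. It does not import Twisted: `ConnectionClosed` is a simplified stand-in base class, and passing `"Connection lost"` as the argument is an assumption chosen to match the log output, not a statement about where Twisted sets it.

```python
class ConnectionClosed(Exception):
    """Stand-in for the real base class (simplified assumption for this demo)."""


class ConnectionLost(ConnectionClosed):
    """Connection to the other side was lost in a non-clean fashion"""

    def __str__(self):
        # First line of the docstring becomes the fixed prefix of the message.
        s = self.__doc__.strip().splitlines()[0]
        if self.args:
            # Any constructor arguments are appended after a colon.
            s = '%s: %s' % (s, ' '.join(self.args))
        s = '%s.' % s
        return s


with_reason = str(ConnectionLost("Connection lost"))
without_reason = str(ConnectionLost())
print(with_reason)
print(without_reason)
```

The first string matches the part of each log line after `lost: `, so the doubled wording ("lost in a non-clean fashion: Connection lost.") is the docstring plus the exception argument rather than two separate errors.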

I also tried debugging with gdb, but pystack returned nothing.

0 answers:

No answers