Gunicorn worker periodically crashes: 'socket is not registered'

Time: 2014-10-29 15:27:25

Tags: python gunicorn

At irregular intervals (once every few hours), a gunicorn worker fails with the following error:

[2014-10-29 10:21:54 +0000] [4902] [INFO] Booting worker with pid: 4902
[2014-10-29 13:15:24 +0000] [4902] [ERROR] Exception in worker process:
Traceback (most recent call last):
  File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 507, in spawn_worker
    worker.init_process()
  File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 109, in init_process
    super(ThreadWorker, self).init_process()
  File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 120, in init_process
    self.run()
  File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 177, in run
    self.murder_keepalived()
  File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 149, in murder_keepalived
    self.poller.unregister(conn.sock)
  File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 408, in unregister
    key = super(EpollSelector, self).unregister(fileobj)
  File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 243, in unregister
    raise KeyError("{0!r} is not registered".format(fileobj))
KeyError: '<socket._socketobject object at 0x7f823f454d70> is not registered'
...
...
[2014-10-29 13:15:24 +0000] [4902] [INFO] Worker exiting (pid: 4902)
[2014-10-29 13:15:24 +0000] [5809] [INFO] Booting worker with pid: 5809
 ...

Configuration:

bind = '0.0.0.0:80'
workers = 1
threads = 4
debug = True
reload = True
daemon = True

I am using:

Python 2.7.6
gunicorn==19.1.1
trollius==1.0.2
futures==2.2.0

Any ideas what might be causing this and how to fix it?

Thanks!

2 answers:

Answer 0 (score: 0)

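I ran into a similar issue: I was getting timeout errors from gunicorn workers. I am using sync workers with the default timeout and keepalive settings. In my use case the HTTP requests take a long time to complete, so the sync workers were timing out. My HTTP client is curl, which sends HTTP/1.1 requests. I increased the timeout to a crazy high value of 3600, i.e. 1 hour, and that worked. But in the server error log I then saw the same error as yours.

Here is my hypothesis for this error. Since all HTTP/1.1 requests are persistent by default, the server tries to reuse the connection by putting it back in the queue, but not beyond the keepalive timeout. When the keepalive timeout expires, the server unregisters the socket so that it cannot be reused, and closes it. My timeout value was very high while keepalive was still at its default of 5 seconds, so the server tried to unregister an already unregistered socket multiple times, hence the error.

So I increased the keepalive value to 3600 as well. So far it is working.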

# http://gunicorn-docs.readthedocs.org/en/latest/settings.html
timeout = 3600 # one hour timeout for long running jobs
keepalive = 3600
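
The KeyError in the question's traceback is what a selector raises when asked to unregister a socket it no longer tracks. Here is a minimal sketch that reproduces this, assuming Python 3's standard `selectors` module rather than the trollius backport used in the question. `safe_unregister` is a hypothetical helper for illustration, not part of gunicorn.

```python
import selectors
import socket


def safe_unregister(selector, sock):
    """Unregister sock; return False instead of raising if it is not registered.

    Hypothetical helper for illustration, not gunicorn's own code.
    """
    try:
        selector.unregister(sock)
    except KeyError:
        return False
    return True


selector = selectors.DefaultSelector()
left, right = socket.socketpair()

# Unregistering the same socket twice: the second call raises KeyError.
selector.register(left, selectors.EVENT_READ)
selector.unregister(left)
try:
    selector.unregister(left)
    second_raised = False
except KeyError:
    second_raised = True

# The guarded variant reports the situation instead of raising.
selector.register(right, selectors.EVENT_READ)
first_result = safe_unregister(selector, right)
second_result = safe_unregister(selector, right)

left.close()
right.close()
selector.close()
```

An unguarded second unregister is the kind of call that would take a worker down in the way the log shows. Catching the KeyError only hides the symptom, so it does not replace finding out why the socket was unregistered twice.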

Answer 1 (score: 0)

I reported this gunicorn bug about a year ago, and the fix should be in gunicorn 19.6.0 and later: https://github.com/benoitc/gunicorn/issues/1258
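
To check whether an installed version is at or past that release, a small comparison like the following may help. It is a sketch that assumes plain dotted numeric version strings such as the `19.1.1` from the question. `parse_version` and `has_fix` are hypothetical names, and pre-release suffixes are not handled.

```python
# Release that, per the answer above, should contain the fix.
FIXED_IN = (19, 6, 0)


def parse_version(text):
    """Turn a dotted numeric string such as '19.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def has_fix(version_text):
    """Return True if version_text is at or above FIXED_IN."""
    return parse_version(version_text) >= FIXED_IN


question_version_has_fix = has_fix("19.1.1")  # version listed in the question
```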