Make Supervisor stop Celery workers correctly

Time: 2015-08-04 03:58:03

Tags: python django-celery supervisord celery-task django-supervisor

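I have run into a lot of strange behaviour with Celery. For example, I update tasks.py and run supervisorctl reload (restart), but the tasks that run are the wrong ones, and some tasks seem to disappear. Today I found the cause: supervisorctl stop all does not stop all the Celery workers. Only kill -9 `pgrep python` gets rid of them.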

The situation:

    root@ubuntu12:/data/www/article_fetcher# supervisorctl
    celery_beat                      RUNNING    pid 29597, uptime 0:52:18
    celery_worker1                   RUNNING    pid 29556, uptime 0:52:20
    celery_worker2                   RUNNING    pid 29570, uptime 0:52:19
    celery_worker3                   RUNNING    pid 29557, uptime 0:52:20
    celery_worker4                   RUNNING    pid 29586, uptime 0:52:18
    uwsgi                            RUNNING    pid 29604, uptime 0:52:18
    supervisor> stop all
    celery_beat: stopped
    celery_worker2: stopped
    celery_worker4: stopped
    celery_worker3: stopped
    uwsgi: stopped
    celery_worker1: stopped
    supervisor> status
    celery_beat                      STOPPED    Aug 04 11:05 AM
    celery_worker1                   STOPPED    Aug 04 11:05 AM
    celery_worker2                   STOPPED    Aug 04 11:05 AM
    celery_worker3                   STOPPED    Aug 04 11:05 AM
    celery_worker4                   STOPPED    Aug 04 11:05 AM
    uwsgi                            STOPPED    Aug 04 11:05 AM

The processes still running after the stop:

    root@ubuntu12:~# ps -aux|grep 'python'
    Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
    root      8683  0.0  0.1  61420 11768 ?        Ss   Aug03   0:27 /usr/bin/python /usr/bin/supervisord
    root     29310  0.1  0.1  57120 11344 pts/2    S+   11:05   0:00 /usr/bin/python /usr/bin/supervisorctl
    nobody   29556  2.2  0.5 132484 45988 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
    nobody   29557  2.2  0.5 132480 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29570  2.4  0.5 132740 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
    nobody   29571 26.9  1.4 217688 115804 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29572 33.7  0.7 158396 59808 ?        R    11:06   0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29573 29.6  1.4 215176 115928 ?       R    11:06   0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
    nobody   29574 27.2  1.4 218244 118180 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    ......
    ......
    ......
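
Killing every PID that pgrep python returns would also hit supervisord and supervisorctl, which the listing above shows running under /usr/bin/python. A narrower filter could look like the sketch below. It assumes `ps aux`-style columns as in the listing; the names `celery_worker_pids` and `marker` are made up for illustration and are not part of Celery or Supervisor.

```python
def celery_worker_pids(ps_output, marker="celery worker"):
    """Return the PIDs of lines in `ps aux` output whose command contains marker.

    Assumes the usual 11 columns: USER PID %CPU %MEM VSZ RSS TTY STAT START
    TIME COMMAND. Lines that do not fit this shape (header, warnings) are skipped.
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        if marker in fields[10]:
            pids.append(int(fields[1]))
    return pids
```

To try it against a live system, the output of `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout` could be passed in; the returned PIDs are then the only ones worth signalling.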

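I found this question: "Stopping Supervisor doesn't stop Celery workers", but it asks about a different problem, and its accepted answer, supervisorctl stop all, does not actually work for me. So I decided to find the right way.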

1 answer:

Answer 0 (score: 2)

I looked through the supervisor docs and found this:

  

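killasgroup

If true, when resorting to send SIGKILL to the program to terminate it, send it to its whole process group instead, taking care of its children as well. Useful, e.g., with Python programs using multiprocessing.

Default: false

Required: No.

Introduced: 3.0a11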

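That made me think: each worker creates 4 child processes (one per CPU core), and together they form a process group, which is why supervisorctl stop all does not work. So I added killasgroup to supervisord.conf: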

    [program:celery_worker1]
    ; Set full path to celery program if using virtualenv

    directory=/data/www/article_fetcher

    command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
    user=nobody
    numprocs=1
    stdout_logfile=/data/www/article_fetcher/logs/celery.log
    stderr_logfile=/data/www/article_fetcher/logs/celery.log
    autostart=true
    autorestart=true
    startsecs=5
    killasgroup=true

    .....
    .....
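
Since the option has to be repeated in every worker section, a small check can catch a section that was missed. This is only a sketch: it reads the file with Python's `configparser` rather than Supervisor's own parser, the function name and the `name_prefix` parameter are hypothetical, and the list of truthy strings is my assumption about what Supervisor accepts.

```python
import configparser

# Assumed set of values that count as "true"; not taken from the answer above.
TRUTHY = ("true", "yes", "on", "1")


def programs_missing_killasgroup(conf_text, name_prefix="celery_"):
    """Return names of [program:<prefix>...] sections without killasgroup=true."""
    parser = configparser.ConfigParser(
        interpolation=None,             # leave %(...)s expansions untouched
        inline_comment_prefixes=(";",),  # allow "value ; comment"
        strict=False,
    )
    parser.read_string(conf_text)
    missing = []
    for section in parser.sections():
        if not section.startswith("program:" + name_prefix):
            continue
        value = parser.get(section, "killasgroup", fallback="false")
        if value.strip().lower() not in TRUTHY:
            missing.append(section.split(":", 1)[1])
    return missing
```

Calling it with the text of supervisord.conf returns an empty list when every matching section sets the option.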

After that, supervisorctl stop all really did stop the Celery workers. Very nice!
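
The difference between signalling one PID and signalling its whole process group can be reproduced without Celery or Supervisor. The sketch below is an illustration under assumptions, not what supervisord does internally: `WORKER_SRC` is a made-up stand-in for a worker that forks one child, and a pipe is used to detect whether any descendant is still alive. It needs a POSIX system.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for a worker: forks one child, then both sleep.
WORKER_SRC = """
import os, sys, time
fd = int(sys.argv[1])
if os.fork() == 0:
    time.sleep(60)      # plays the role of a pool child
    os._exit(0)
os.write(fd, b"r")      # tell the launcher that the fork has happened
time.sleep(60)          # plays the role of the main worker process
"""


def start_worker():
    """Start the stand-in worker as leader of a new process group.

    Returns (popen, read_fd). read_fd reaches EOF only after every process
    holding the pipe's write end (worker and its child) has exited.
    """
    read_fd, write_fd = os.pipe()
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER_SRC, str(write_fd)],
        pass_fds=(write_fd,),
        start_new_session=True,  # new session, so the worker leads its own group
    )
    os.close(write_fd)           # keep only the copies held by the children
    os.read(read_fd, 1)          # wait for the "fork done" byte
    return proc, read_fd


def all_gone(read_fd, timeout=2.0):
    """Return True if the worker and all of its descendants have exited."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    return bool(ready) and os.read(read_fd, 1) == b""


def compare_kill_strategies():
    """Return (orphan_left_after_single_kill, all_gone_after_group_kill)."""
    # Case 1: SIGKILL only the main PID; the forked child survives.
    proc, fd = start_worker()
    os.kill(proc.pid, signal.SIGKILL)
    proc.wait()
    orphan_left = not all_gone(fd, timeout=1.0)
    os.killpg(proc.pid, signal.SIGKILL)  # clean up the orphan via its group id
    os.close(fd)

    # Case 2: SIGKILL the whole process group at once.
    proc, fd = start_worker()
    os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
    proc.wait()
    group_gone = all_gone(fd)
    os.close(fd)
    return orphan_left, group_gone
```

Calling `compare_kill_strategies()` returns a pair of booleans: whether a child outlived the single-PID kill, and whether the group kill left nothing behind. The supervisor docs also describe a related stopasgroup option that applies to the stop signal rather than to SIGKILL; the answer above did not try it.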