I am trying to deploy a Django application with uWSGI and supervisor on a machine running Debian 8.1.
When I restart it with sudo systemctl restart supervisor, the restart fails about half of the time.
$ root@host:/# systemctl start supervisor
Job for supervisor.service failed. See 'systemctl status supervisor.service' and 'journalctl -xn' for details.
$ root@host:/# systemctl status supervisor.service
● supervisor.service - LSB: Start/stop supervisor
Loaded: loaded (/etc/init.d/supervisor)
Active: failed (Result: exit-code) since Wed 2015-09-23 11:12:01 UTC; 16s ago
Process: 21505 ExecStop=/etc/init.d/supervisor stop (code=exited, status=0/SUCCESS)
Process: 21511 ExecStart=/etc/init.d/supervisor start (code=exited, status=1/FAILURE)
Sep 23 11:12:01 host supervisor[21511]: Starting supervisor:
Sep 23 11:12:01 host systemd[1]: supervisor.service: control process exited, code=exited status=1
Sep 23 11:12:01 host systemd[1]: Failed to start LSB: Start/stop supervisor.
Sep 23 11:12:01 host systemd[1]: Unit supervisor.service entered failed state.
But there is nothing in the supervisor or uwsgi logs. Supervisor 3.0 is running uwsgi with this configuration:
[program:uwsgi]
stopsignal=QUIT
command = uwsgi --ini uwsgi.ini
directory = /dir/
environment=ENVIRONMENT=STAGING
logfile-maxbytes = 300MB
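stopsignal = QUIT was added because uWSGI ignores the default signal (SIGTERM) on stop and is then killed brutally with SIGKILL, which leaves orphaned workers behind.
Is there a way to investigate what is going on?
Edit
I tried what mnencia suggested: /etc/init.d/supervisor stop && while /etc/init.d/supervisor status ; do sleep 1; done && /etc/init.d/supervisor start
But it still fails half of the time.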
root@host:~# /etc/init.d/supervisor stop && while /etc/init.d/supervisor status ; do sleep 1; done && /etc/init.d/supervisor start
[ ok ] Stopping supervisor (via systemctl): supervisor.service.
● supervisor.service - LSB: Start/stop supervisor
Loaded: loaded (/etc/init.d/supervisor)
Active: inactive (dead) since Tue 2015-11-24 13:04:32 UTC; 89ms ago
Process: 23490 ExecStop=/etc/init.d/supervisor stop (code=exited, status=0/SUCCESS)
Process: 23349 ExecStart=/etc/init.d/supervisor start (code=exited, status=0/SUCCESS)
Nov 24 13:04:30 xxx supervisor[23349]: Starting supervisor: supervisord.
Nov 24 13:04:30 xxx systemd[1]: Started LSB: Start/stop supervisor.
Nov 24 13:04:32 xxx systemd[1]: Stopping LSB: Start/stop supervisor...
Nov 24 13:04:32 xxx supervisor[23490]: Stopping supervisor: supervisord.
Nov 24 13:04:32 xxx systemd[1]: Stopped LSB: Start/stop supervisor.
[....] Starting supervisor (via systemctl): supervisor.serviceJob for supervisor.service failed. See 'systemctl status supervisor.service' and 'journalctl -xn' for details.
failed!
root@host:~# /etc/init.d/supervisor stop && while /etc/init.d/supervisor status ; do sleep 1; done && /etc/init.d/supervisor start
[ ok ] Stopping supervisor (via systemctl): supervisor.service.
● supervisor.service - LSB: Start/stop supervisor
Loaded: loaded (/etc/init.d/supervisor)
Active: failed (Result: exit-code) since Tue 2015-11-24 13:04:32 UTC; 1s ago
Process: 23490 ExecStop=/etc/init.d/supervisor stop (code=exited, status=0/SUCCESS)
Process: 23526 ExecStart=/etc/init.d/supervisor start (code=exited, status=1/FAILURE)
Nov 24 13:04:32 xxx systemd[1]: supervisor.service: control process exited, code=exited status=1
Nov 24 13:04:32 xxx systemd[1]: Failed to start LSB: Start/stop supervisor.
Nov 24 13:04:32 xxx systemd[1]: Unit supervisor.service entered failed state.
Nov 24 13:04:32 xxx supervisor[23526]: Starting supervisor:
Nov 24 13:04:33 xxx systemd[1]: Stopped LSB: Start/stop supervisor.
[ ok ] Starting supervisor (via systemctl): supervisor.service.
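Answer 0 (score: 19)
This is not necessarily a bug in supervisor. From the systemctl status output I can see that supervisor is started through the sysv-init compatibility layer, so the failure may be in the /etc/init.d/supervisor script. That would explain why there are no errors in the supervisor logs.
To debug the init script, the easiest way is to add set -x as the first non-comment instruction in that file and look at the trace of the script execution in the journalctl output.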
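The effect of set -x can be tried out without touching the real init script. The following is a minimal sketch, assuming a throwaway stand-in script written to a temporary file; the script contents and the variable names (demo, trace, status) are hypothetical and are not taken from /etc/init.d/supervisor.

```shell
#!/bin/sh
# Hypothetical stand-in for an init script, written to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
set -x                 # first non-comment instruction: trace every command
NAME=supervisord
echo "Starting $NAME"
false                  # stand-in for a start command that fails
exit 1
EOF

# Run it the way systemd would ("start"); keep only stderr, where the trace goes.
status=0
trace=$(sh "$demo" start 2>&1 >/dev/null) || status=$?
printf '%s\n' "$trace"
echo "exit status: $status"
rm -f "$demo"
```

Each traced command appears prefixed with "+", so the last lines before the exit show which command failed. On the real system the trace should end up in the journal, so running journalctl -u supervisor.service after a failed start is one way to read it.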
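Edit
I have reproduced and debugged it on a test system running Debian Sid.
The problem is that the stop target of the supervisor init script does not check whether the daemon has really terminated; it only sends the signal if the process exists. If the daemon process takes a while to shut down, the subsequent start action fails because the dying daemon process is still running.
I have opened a bug on the Debian Bug Tracker: http://bugs.debian.org/805920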
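To illustrate the race described above, here is a minimal sketch of a stop step that waits for the process to disappear instead of only sending the signal. It is not the Debian init script: the wait_for_exit helper and the slow-stopping stand-in daemon are hypothetical, and the daemon is a plain shell loop rather than supervisord.

```shell
#!/bin/sh
# Hypothetical helper: poll until PID $1 is gone, giving up after $2 tries.
wait_for_exit() {
    pid=$1
    tries=$2
    while kill -0 "$pid" 2>/dev/null; do
        [ "$tries" -le 0 ] && return 1
        tries=$((tries - 1))
        sleep 1
    done
    return 0
}

# Stand-in daemon that needs about two seconds to shut down after SIGTERM.
sh -c 'trap "sleep 2; exit 0" TERM; while :; do sleep 1; done' &
daemon_pid=$!
sleep 1

# What the buggy stop target does: send the signal and return immediately.
kill -TERM "$daemon_pid"
if kill -0 "$daemon_pid" 2>/dev/null; then
    alive_after_signal=yes   # a start issued now would collide with the old process
else
    alive_after_signal=no
fi

# What a safer stop would do: wait until the process has really exited.
if wait_for_exit "$daemon_pid" 10; then stopped=yes; else stopped=no; fi
echo "alive right after signal: $alive_after_signal, stopped after waiting: $stopped"
```

The window between sending the signal and the process actually exiting is where a following start fails.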
Workaround:
You can work around the problem with:
/etc/init.d/supervisor force-stop && \
/etc/init.d/supervisor stop && \
/etc/init.d/supervisor start
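force-stop ensures that supervisord has been terminated (outside of systemd).
stop makes sure systemd knows it has been terminated.
start starts it again.
The stop after the force-stop is required; otherwise systemd ignores any subsequent start request. stop and start can be combined using restart, but I have used both here to show how it works.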
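Answer 1 (score: 0)
I ran into this problem on Ubuntu 14.04. I tried the latest init.d script from Debian and @mnencia's solution, but neither worked for me. The force-stop solution did not kill the program processes; they simply kept running after supervisor was killed.
My solution was to patch the start and restart parts of the supervisord init.d script. I did not want to guess a good DODTIME; I wanted it to start as soon as the old supervisor main process had terminated, so I added retry logic. Note that it is a bit verbose; if you do not like that behavior you can remove the echo calls, and you can change the maximum number of retries (set to 20 here).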
start)
    echo -n "Starting $DESC: "
    # Retry up to 20 times while the old process is still shutting down.
    i=1
    until [ $i -ge 21 ]; do
        start-stop-daemon --start --quiet --pidfile $PIDFILE --startas $DAEMON -- $DAEMON_OPTS && break
        echo -n -e "\nAlready running, old process still finishing? retrying ($i/20)..."
        let "i += 1"
        sleep 1
    done
    sleep 1
    if running ; then
        echo "$NAME."
    else
        echo " ERROR."
    fi
    ;;
restart)
    echo -n "Restarting $DESC: "
    start-stop-daemon --stop --quiet --oknodo --pidfile $PIDFILE
    # Same retry loop as in start: wait for the old process to go away.
    i=1
    until [ $i -ge 21 ]; do
        start-stop-daemon --start --quiet --pidfile $PIDFILE --startas $DAEMON -- $DAEMON_OPTS && break
        echo -n -e "\nAlready running, old process still finishing? retrying ($i/20)..."
        let "i += 1"
        sleep 1
    done
    echo "$NAME."
    ;;
I also changed the hashbang (first line) so that bash is used instead of sh, because I wanted to use let:
#! /bin/bash
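For readers who would rather keep /bin/sh, the same retry idea can be written with POSIX arithmetic instead of let. This is only a sketch of the loop, not a drop-in patch: try_start is a hypothetical stand-in for the start-stop-daemon --start call and is rigged to fail twice before succeeding, and max, attempts and started are illustrative names.

```shell
#!/bin/sh
# Hypothetical stand-in for the start-stop-daemon --start call:
# fails twice (old process "still finishing"), then succeeds.
attempts=0
try_start() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

max=20
i=1
started=no
until [ "$i" -gt "$max" ]; do
    if try_start; then
        started=yes
        break
    fi
    printf 'Already running, old process still finishing? retrying (%s/%s)...\n' "$i" "$max"
    i=$((i + 1))      # POSIX arithmetic, so no bash-only let is needed
    sleep 1
done
echo "started=$started after $attempts attempts"
```

Using printf instead of echo -n -e also avoids relying on echo options that differ between shells.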