Monit does not clear the pid file and restart the process when the process becomes a zombie

Time: 2016-09-13 17:28:45

Tags: docker monit zombie-process

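I am running monit inside a docker container, where it monitors a bunch of processes such as vault, nginx, mongodb, and so on. For each process I have created a wrapper script with start and stop functions, and I provide these scripts to monit: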

#!/bin/sh
# vault service script

VAULT_DIR="/tmp/vault"
VAULT_USER="myuser"
USER=$(whoami)
# Quote the variable so the test does not break on an empty value
if [ "$USER" != "root" ]
then
     echo "Only root can run vault-server service"
     exit 1
fi


usage() {
     echo "Usage: `basename $0`: <start|stop|status|restart>"
     exit 1 
}

start() {
     status
     if [ $PID -gt 0 ]
     then
        echo "vault server daemon was already started. PID: $PID"
        return $PID
     fi
     echo "Starting vault server daemon..."
     rm -f /var/run/vault.pid
     VAULT_OPTIONS=""
     VAULT_OPTIONS="-dev"
     su $VAULT_USER -c "/usr/bin/nohup vault server $VAULT_OPTIONS 1>/var/log/vault/vault.log 2>/var/log/vault/vault.err &"
     status
     if [ $PID -gt 0 ]
     then
        echo $PID >> /var/run/vault.pid
     fi
     sleep 5
     su $VAULT_USER /opt/vault/setup-vault.sh
}

stop() {
     status
     if [ $PID -eq 0 ]
     then
        echo "vault server daemon is already not running"
        return 0
     fi
     echo "Stopping vault server daemon..."
     rm -f /var/run/vault.pid
     kill $PID
}

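status() {
     # Sets the global PID to the vault server's process id, or 0 if it is not found.
     # Note: awk '{print $1}' yields the PID only if ps prints the PID in the first
     # column (BusyBox-style output); with procps "ps -ef" the PID is column 2.
     PID=`ps -ef | grep "vault server" | grep -v grep | grep -v "\[" | awk '{print $1}'`
     if [ "x$PID" = "x" ]
     then
        PID=0
     fi

     # if PID is greater than 0 then vault server is running, else it is not
     # (a return status only holds 0-255, so callers should read $PID, not $?)
     return $PID
}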

if [ "x$1" = "xstart" ]
then
  start
  exit 0
fi

if [ "x$1" = "xstop" ]
then
  stop
  exit 0
fi

if [ "x$1" = "xrestart" ]
then
  stop
  start
  exit 0
fi

if [ "x$1" = "xstatus" ]
then
   status
   if [ $PID -gt 0 ]
   then
      echo "vault server daemon is running with PID: $PID"
   else
      echo "vault server daemon is NOT running"
   fi
   exit $PID
fi

usage
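
One detail in the script above is worth illustrating: shell return and exit statuses only carry values from 0 to 255, so a PID cannot be passed back reliably through "return $PID" or "exit $PID". The following is a minimal sketch of a status check that reads the pid file instead and reports 0 for running and 1 for not running. It is not the asker's method; read_pid and is_running are hypothetical helper names, and the PIDFILE default simply mirrors the path used in the script.

```shell
#!/bin/sh
# Sketch only: pid-file-based status check with small, conventional exit codes.
# PIDFILE defaults to the path from the question; override it as needed.
PIDFILE="${PIDFILE:-/var/run/vault.pid}"

read_pid() {
    # Print the PID stored in the pid file; fail if it is missing or not numeric.
    [ -r "$PIDFILE" ] || return 1
    _pid=$(head -n 1 "$PIDFILE")
    case "$_pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$_pid"
}

is_running() {
    # kill -0 delivers no signal; it only checks that the PID exists
    # (assumes the caller may signal the process, e.g. runs as root).
    # A zombie still owns its PID, so it passes this check too.
    _pid=$(read_pid) || return 1
    kill -0 "$_pid" 2>/dev/null
}

status() {
    if is_running; then
        echo "vault server daemon is running with PID: $(read_pid)"
        return 0
    fi
    echo "vault server daemon is NOT running"
    return 1
}
```

With this shape, the "status" branch of the wrapper could end with "status; exit $?" rather than exiting with the PID itself.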

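For some reason, when a process crashes and becomes a zombie, monit does not clear the pid file and restart the process. Also, to make sure my status function does not pick up zombie processes, I added a grep -v "\[" clause to my ps -ef statement. Is there anything else I need to do, or has anyone run into this problem before?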
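
As an alternative to filtering ps output by pattern, a zombie can be recognized directly from its process state. The sketch below reads the "State:" line of /proc/<pid>/status on Linux, where a zombie is reported as Z. The names proc_state, is_zombie, and the PROC_ROOT override are assumptions introduced for illustration; they are not part of the original script or of monit.

```shell
#!/bin/sh
# Sketch only: detect a zombie by its state in procfs (Linux-specific).
# PROC_ROOT is a hypothetical override, defaulting to /proc, that makes
# the functions easy to exercise against a fake directory tree.

proc_state() {
    # Print the one-letter process state (R, S, D, Z, ...) for PID $1.
    # Fails without output if the process does not exist.
    _status_file="${PROC_ROOT:-/proc}/$1/status"
    [ -r "$_status_file" ] || return 1
    awk '/^State:/ { print $2; exit }' "$_status_file"
}

is_zombie() {
    # Succeeds only when the process exists and its state is Z (zombie).
    [ "$(proc_state "$1")" = "Z" ]
}
```

A status function could then treat a PID as "not running" when is_zombie "$PID" succeeds, instead of relying on the bracketed name that ps prints for defunct processes.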

1 Answer:

Answer 0 (score: 1)

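If your application is generating zombies, add tini to your stack. Your entrypoint/cmd becomes tini calling your existing entrypoint, and tini will handle reaping the zombies.

This is the result of the zombie processes not being passed through the namespaced container jail to the host's init process. So you need a pid 1 inside the namespace that can reap zombies.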
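
The tini arrangement described in the answer can be sketched as a small entrypoint helper. This is a sketch under stated assumptions: that tini is installed at /sbin/tini (the path may differ in your image) and that monit is the existing entrypoint. TINI_BIN and run_under_tini are hypothetical names; the "--" separator is how tini is told where the child command begins.

```shell
#!/bin/sh
# Sketch only: hand control to tini so that it runs as PID 1 and reaps zombies.
# TINI_BIN is a hypothetical override; /sbin/tini is an assumed install path.

run_under_tini() {
    # exec replaces this shell, so tini keeps the PID of the entrypoint
    # (PID 1 when this script is the container's entrypoint).
    # Everything after "--" is the command tini should start and supervise.
    exec "${TINI_BIN:-/sbin/tini}" -- "$@"
}
```

The last line of an entrypoint script would then be something like "run_under_tini monit -I", where -I keeps monit in the foreground. Recent Docker versions also provide "docker run --init", which places a tini-based init at PID 1 without changes to the image.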