Bash script to monitor a process and send mail if it fails

Time: 2014-06-18 15:41:10

Tags: bash unix solaris solaris-10

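I realize that I can't reliably depend on ps | grep or its variants to tell me accurately which PIDs are up. However, I need a temporary workaround until this issue is resolved in the next release.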

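I have a process called Foo that is the parent, and TEST1 and TEST2 are its child processes. If TEST1 and/or TEST2 die, Foo keeps running and does not respawn the TEST1 and/or TEST2 it needs to operate correctly. I know this because the procedure for restarting TEST1 and/or TEST2 requires restarting Foo first.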

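So I want to monitor the child processes and, if one of them fails, send an email, restart the service, and send another email once it is back up. I plan to run the script from cron every 5 minutes.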

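The check works on its own, and so does sendmail. The problem appears when I combine them in an if/else statement: when TEST1 or TEST2 dies, the script still logs that it is running. Can someone help me with this?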

#!/bin/bash
#Check if process is running
VAL1=`/usr/ucb/ps aux | grep "[P]ROCESS TEST1" >/dev/null`
VAL2=`/usr/ucb/ps aux | grep "[P]ROCESS TEST2" >/dev/null`
if $VAL1 && $VAL2; then
echo "$(date) - $VAL1 & $VAL2 is Running" >> /var/tmp/Log.txt;
else
SUBJ="Process has stopped"
FROM="Server"
TO="someone@acme.com"
(
cat << !
To : ${TO}
From : ${FROM}
Subject : ${SUBJ}
!
cat << !
The $VAL1 and $VAL2 went down at $(date) please login to the server to restart
!
) | sendmail -v ${TO}
elseif
/usr/sbin/svcadm disable Foo;
wait 10;
/usr/sbin/svcadm enable Foo; 
fi

2 Answers:

Answer 0 (score: 2)

So, one point about your test: you are sending the output to /dev/null, which means VAL1 and VAL2 will always be empty.
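To make that point concrete, here is a minimal sketch. It assumes a printf line standing in for the real /usr/ucb/ps output so it can run anywhere; the variable names captured, empty, branch, and status are illustrative only. It shows that the captured value is empty even when grep matches, that an empty expansion used as a command counts as success (which would explain why the original script always logs "is Running"), and that testing the pipeline's exit status directly avoids both problems.

```shell
#!/usr/bin/env bash
# Stand-in for one line of "ps aux" output (an assumption, not real ps data).
fake_ps() { printf 'root 123 0.0 0.1 PROCESS TEST1\n'; }

# Command substitution captures stdout only. With stdout redirected to
# /dev/null there is nothing to capture, so the variable is empty even
# though grep found a match.
captured=$(fake_ps | grep "[P]ROCESS TEST1" >/dev/null)
echo "captured='${captured}'"

# An empty expansion used as a command runs nothing and returns status 0,
# so a test like "if $VAL1 && $VAL2" always takes the "running" branch.
empty=""
branch="else"
if $empty && $empty; then
    branch="then"
fi
echo "branch=${branch}"

# Testing the pipeline itself uses grep's exit status, which does reflect
# whether the process line was found.
if fake_ps | grep "[P]ROCESS TEST1" >/dev/null; then
    status="running"
else
    status="stopped"
fi
echo "status=${status}"
```

In the real script, fake_ps would be replaced by the /usr/ucb/ps aux call from the question.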

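Second, you don't need elif. You have two basic conditions: either everything is running or it isn't. If anything is not running, send an email. You could add some extra tests to determine whether it is PROCESS TEST1 or PROCESS TEST2 that is down, but that isn't strictly necessary.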

Here is how I would write the script to do the same thing.

#!/usr/bin/env bash

#Check if process is running
PID1=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST1" | awk '{print $2}')
PID2=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST2" | awk '{print $2}')

err=0

if [ "x$PID1" == "x" ]; then
        # PROCESS TEST1 died
        err=$(( err + 1 ))
else
        echo "$(date) - PROCESS TEST1 is Running" >> /var/tmp/Log.txt;
fi

if [ "x$PID2" == "x" ]; then
        # PROCESS TEST2 died
        err=$(( err + 2 ))
else
        echo "$(date) - PROCESS TEST2 is Running" >> /var/tmp/Log.txt;
fi

if (( $err > 0 )); then
        # identify which PROCESS TEST had the problem.
        if (( err == 1 )); then
                condition="PROCESS TEST1 is down"
        elif (( $err == 2 )); then
                condition="PROCESS TEST2 is down"
        else
                condition="PROCESS TEST1 and PROCESS TEST2 are down"
        fi

        # let's send an email to get eyes on the issue, but we will restart the process after
        # we send the email.
        SUBJ="Process Error Detected"
        FROM="Server"
        TO="someone@acme.com"
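        # The heredoc body and its terminator are left unindented because
        # <<- strips leading tabs only, not spaces.
        (
cat <<EOT
To: ${TO}
From: ${FROM}
Subject: ${SUBJ}

$condition at $(date) please login to the server to check that the processes were restarted successfully.

EOT
        ) | sendmail -v "${TO}"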

        # we reached an error condition, and we sent mail
        # now let's restart the svc.
        /usr/sbin/svcadm restart Foo
fi

Answer 1 (score: 0)

elseif? Did you mean elif?

Have you considered using a function, putting the sendmail part in a function that is called from the if statement?
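As a rough sketch of that suggestion: the function names build_alert and send_alert below are hypothetical, not from the question, and the split into two functions is just one possible design. build_alert only formats the message, so it can be previewed without sending anything; send_alert pipes it to sendmail with the same -v flag the question uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the mail headers, a blank line, then the body.
# Arguments: recipient, subject, body text.
build_alert() {
    local to="$1" subject="$2" body="$3"
    printf 'To: %s\n' "$to"
    printf 'From: %s\n' "Server"
    printf 'Subject: %s\n' "$subject"
    printf '\n%s\n' "$body"
}

# Hypothetical helper: format the message and hand it to sendmail.
# Defined here but not called, so this sketch never sends real mail.
send_alert() {
    build_alert "$@" | sendmail -v "$1"
}

# Preview the message that would be sent.
preview=$(build_alert "someone@acme.com" "Process Error Detected" \
    "PROCESS TEST1 is down at $(date)")
printf '%s\n' "$preview"
```

Inside the monitoring script, the if statement would then call something like send_alert "$TO" "$SUBJ" "$condition at $(date)" in the failure branch, which keeps the branching logic short and easy to read.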