Docker container health check: stopping an unhealthy container

Time: 2020-08-20 13:59:42

Tags: docker docker-container health-check

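I have a Docker container that runs a health check every minute. I read that appending "|| kill 1" to the health check in the Dockerfile stops the container once the health check fails, but it does not seem to work for me, and I cannot find a working example.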

Does anyone know how to stop a container once it has been marked unhealthy? This is what I currently have in my Dockerfile:

HEALTHCHECK --start-period=30s --timeout=5s --interval=1m --retries=2 CMD bash /expressvpn/healthcheck.sh || kill 1

Edit 1
Dockerfile

FROM debian:buster-slim

ENV CODE="code"
ENV SERVER="smart"

ARG VERSION="expressvpn_2.6.0.32-1_armhf.deb"

COPY files/ /expressvpn/

RUN apt-get update && apt-get install -y --no-install-recommends \
    expect curl ca-certificates iproute2 wget jq \
    && wget -q https://download.expressvpn.xyz/clients/linux/${VERSION} -O /expressvpn/${VERSION} \
    && dpkg -i /expressvpn/${VERSION} \
    && rm -rf /expressvpn/*.deb \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get purge --autoremove -y wget \
    && rm -rf /var/log/*.log

HEALTHCHECK --start-period=30s --timeout=5s --interval=1m --retries=2 CMD bash /expressvpn/healthcheck.sh || exit 1

ENTRYPOINT ["/bin/bash", "/expressvpn/start.sh"]

healthcheck.sh

if [[ ! -z $DDNS ]];
then
    checkIP=$(getent hosts $DDNS | awk '{ print $1 }')
else
    checkIP=$IP
fi

if [[ ! -z $checkIP ]];
then
    ipinfo=$(curl -s -H "Authorization: Bearer $BEARER" 'ipinfo.io' | jq -r '.')
    currentIP=$(jq -r '.ip' <<< "$ipinfo")
    hostname=$(jq -r '.hostname' <<< "$ipinfo")
    if [[ $checkIP = $currentIP ]];
    then
        if [[ ! -z $HEALTHCHECK ]];
        then
            curl https://hc-ping.com/$HEALTHCHECK/fail
            expressvpn disconnect
            expressvpn connect $SERVER
            exit 1
        else
            expressvpn disconnect
            expressvpn connect $SERVER
            exit 1
        fi
    else
        if [[ ! -z $HOSTNAME_PART && ! -z $hostname && $hostname != *"$HOSTNAME_PART"* ]];
        then
            #THIS IS WHERE THE CONTAINER SHOULD STOP <------------
            kill 1
        fi

        if [[ ! -z $HEALTHCHECK ]];
        then
            curl https://hc-ping.com/$HEALTHCHECK
            exit 0
        else
            exit 0
        fi
    fi
else
    exit 0
fi

start.sh

#!/usr/bin/bash
cp /etc/resolv.conf /etc/resolv.conf.bak
umount /etc/resolv.conf
cp /etc/resolv.conf.bak /etc/resolv.conf
rm /etc/resolv.conf.bak
service expressvpn restart
expect /expressvpn/activate.sh
expressvpn connect $SERVER

touch /var/log/temp.log
tail -f /var/log/temp.log

exec "$@"

2 Answers:

Answer 0 (score: 3)

Try changing kill 1 to exit 1:

HEALTHCHECK --start-period=30s --timeout=5s --interval=1m --retries=2 \
CMD bash /expressvpn/healthcheck.sh || exit 1

Reference from docker docs
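
The "|| exit 1" suffix matters because the Docker documentation defines only exit status 0 (healthy) and 1 (unhealthy) for a health check command, with 2 reserved. Below is a minimal sketch of how the suffix collapses any non-zero status to 1. The function check_with_status is a hypothetical stand-in for the real health check script, not something from the question.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a health check that exits with the given status.
check_with_status() {
    return "$1"
}

# Run the check the way the HEALTHCHECK line does: any failure becomes exit 1.
# The subshell keeps `exit` from terminating the calling shell.
normalized_status() {
    local rc=0
    ( check_with_status "$1" || exit 1 ) || rc=$?
    echo "$rc"
}

normalized_status 0   # prints 0
normalized_status 7   # prints 1
```

Note that this only changes how Docker labels the container; it does not stop anything by itself.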

Edit 1:

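After some testing: plain Docker only marks a container as unhealthy and does not stop it on its own, so if you want to kill a container in the unhealthy state you have to do it either inside the health check script /expressvpn/healthcheck.sh or from a script on the host.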

In the following example the container status is "healthy":

HEALTHCHECK --start-period=30s --timeout=5s --interval=10s --retries=2 CMD bash -c 'echo "0" || kill 1' || exit 1

In the following example the container stops: the command ech does not exist, so kill 1 is executed and the container is killed:

HEALTHCHECK --start-period=30s --timeout=5s --interval=10s --retries=2 CMD bash -c 'ech "0" || kill 1' || exit 1
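
The difference between the two lines comes down to shell short-circuiting: the right side of || runs only when the left side fails, and a command that cannot be found fails with a non-zero status. Here is a small sketch of that behaviour, runnable outside Docker. The function fake_kill is a hypothetical placeholder that prints a message instead of signalling PID 1.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder for `kill 1`; it only reports that it was reached.
fake_kill() {
    echo "kill 1 would run"
}

# Left side succeeds, so the right side is skipped.
first=$(bash -c 'echo "0"' >/dev/null 2>&1 || fake_kill)

# `ech` is not a command, so the left side fails and the right side runs.
second=$(bash -c 'ech "0"' >/dev/null 2>&1 || fake_kill)

echo "first:  '${first}'"
echo "second: '${second}'"
```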

Edit 2:

After some digging, I now understand something I had seen in some Dockerfiles:

RUN apt update -y && apt install tini -y

ENTRYPOINT ["tini", "--"]
CMD ["./echo.sh"]

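As far as I understand, in Docker PID 1 (the entrypoint process) is not killed by SIGTERM unless it installs a handler for that signal, so a small init utility such as tini is used to work around this (I am still not sure exactly what it does; saving that for next time).
In any case, after adding tini, the container is killed by kill 1.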
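
A possible alternative to tini, assuming the entrypoint is a bash script such as start.sh, is to let the script install its own SIGTERM handler, since a signal sent to PID 1 from inside the container is delivered only if PID 1 has registered a handler for it. The sketch below is not from the original answer. The function run_foreground and the sleep placeholder are hypothetical, and the demonstration signals an ordinary background process rather than a real PID 1.

```shell
#!/usr/bin/env bash
# Hypothetical entrypoint body: trap SIGTERM and wait on a long-running child.
run_foreground() {
    trap 'echo "SIGTERM received, shutting down"; kill "$child" 2>/dev/null; exit 0' TERM
    sleep 60 &          # placeholder for the real long-running service
    child=$!
    wait "$child"       # `wait` is interruptible, so the trap fires promptly
}

# Demonstration outside a container: run it in the background and signal it.
demo_log=$(mktemp)
run_foreground > "$demo_log" 2>&1 &
demo_pid=$!
sleep 1                 # give the background shell time to install the trap
kill -TERM "$demo_pid"
demo_status=0
wait "$demo_pid" || demo_status=$?
demo_output=$(cat "$demo_log")
rm -f "$demo_log"
echo "exit status: $demo_status, output: $demo_output"
```

The service runs in the background with wait in front of it because bash defers trap handlers until the current foreground command finishes; a foreground tail -f would delay the handler indefinitely.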

Thanks for the question.

Answer 1 (score: 1)

Please check the output of your health check. You need to make sure that it really fails two times in a row.

docker inspect  --format "{{json .State.Health }}" <container name> | jq
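
To act on that status from the host, which Answer 0 mentions as an option, a rough sketch of a helper script could look like the following. It assumes the docker CLI is available on the host and uses the health=unhealthy filter of docker ps together with the .State.Health.Status field of docker inspect. The function names should_stop and stop_unhealthy are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether a container should be stopped,
# given the health status string reported by Docker.
should_stop() {
    [ "$1" = "unhealthy" ]
}

# Stop every running container that Docker currently reports as unhealthy.
# Defined only; nothing is stopped until the function is called.
stop_unhealthy() {
    local id status
    for id in $(docker ps --quiet --filter health=unhealthy); do
        status=$(docker inspect --format '{{.State.Health.Status}}' "$id")
        if should_stop "$status"; then
            echo "stopping $id"
            docker stop "$id"
        fi
    done
}
```

Calling stop_unhealthy from a cron job or a simple loop on the host would then stop containers once the configured number of retries has failed.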