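amazon-ecs-agent stops my application every 2~3 minutes

Time: 2018-04-09 14:51:19

Tags: amazon-web-services amazon-ecs

Summary

The ECS agent apparently ignores my ECS_CONTAINER_STOP_TIMEOUT setting of 1h.

Description

I have a container that needs some time to finish its task. Since I am still in the validation phase, I set the ECS_CONTAINER_STOP_TIMEOUT variable to 1h so that the agent would not interfere with my application. However, the agent still stops the application every 2~3 minutes.

As I understand it, the agent should wait the configured 1 hour before trying to stop my container, right?

Agent log (Luigi and Model are my applications):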

2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: sending task change event [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC]
2018-04-09T11:44:06Z [INFO] TaskHandler: batching container event: arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE
2018-04-09T11:44:06Z [INFO] TaskHandler: Adding event: TaskChange: [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f luigi -> RUNNING, Ports [{8082 8080 0.0.0.0 0}], Known Sent: NONE, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE] sent: false
2018-04-09T11:44:06Z [INFO] TaskHandler: Sending task change: TaskChange: [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f luigi -> RUNNING, Ports [{8082 8080 0.0.0.0 0}], Known Sent: NONE, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE] sent: false
2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: sent task change event [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC]
2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: redundant container state change. model to RUNNING, but already RUNNING
2018-04-09T11:44:14Z [INFO] Saving state! module="statemanager"
2018-04-09T11:46:42Z [INFO] Saving state! module="statemanager"
2018-04-09T11:46:42Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: Cgroup resource set up for task complete
2018-04-09T11:46:42Z [INFO] Task engine [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: stopping container [luigi]
2018-04-09T11:46:42Z [INFO] Task engine [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: stopping container [model]
2018-04-09T11:46:43Z [WARN] Error converting stats for container 823686bc5bcb5172a9f3d3fd6c0f4fd2a0fea924870f990f6a74475e6b840674: Invalid container statistics reported, no cpu core usage reported
2018-04-09T11:46:43Z [INFO] Task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: recording execution stopped time. Essential container [luigi] stopped at: 2018-04-09 11:46:43.485611165 +0000 UTC m=+786.881116035

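I edited ecs.config manually and restarted the ECS agent to apply the new configuration.

1 Answer:

Answer 0: (score: 0)

The ECS agent sends SIGTERM to the container, and the application must handle SIGTERM if it wants to avoid being shut down right away. After sending SIGTERM, the agent waits for the time configured in ECS_CONTAINER_STOP_TIMEOUT and then sends SIGKILL to the container. In other words, the timeout is the grace period between the two signals, not a delay before the agent starts stopping the container.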
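To make the order of events concrete, here is a rough local simulation of that stop sequence. It is only a sketch and not the agent's actual code: `stop_with_timeout` and `STOP_RESULT` are hypothetical names, and the 2-second grace period stands in for ECS_CONTAINER_STOP_TIMEOUT.

```shell
#!/bin/bash
# Hypothetical stand-in for the stop sequence described above; not agent code.
# Sends SIGTERM to PID $1, waits up to $2 seconds, then falls back to SIGKILL.
stop_with_timeout() {
    local pid=$1 timeout=$2 waited=0
    kill -TERM "$pid" 2>/dev/null
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Grace period is over: force the process to stop.
            kill -KILL "$pid" 2>/dev/null
            STOP_RESULT="killed after ${timeout}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    STOP_RESULT="exited within the grace period"
}

# A process that ignores SIGTERM, like an application that keeps working.
bash -c 'trap "" TERM; while true; do sleep 1; done' &
target_pid=$!
sleep 1   # give the child time to install its trap before signalling it

stop_with_timeout "$target_pid" 2
echo "$STOP_RESULT"
```

A process that does not handle SIGTERM at all would exit on the first signal, which matches the behaviour in the question: the containers stop immediately, regardless of how long the timeout is.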

Example of SIGTERM handling:

#!/bin/bash

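# Runs on SIGINT/SIGTERM. It does not exit, so the loop below keeps
# running until a second signal or SIGKILL arrives.
exit_script() {
    echo "SIGTERM captured..."
    echo "Cleaning up..."
    trap - SIGINT SIGTERM # clear the trap
}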
trap 'exit_script' SIGINT SIGTERM

while true
do
  echo "Waiting for the SIGTERM..."
  sleep 3
done
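If the goal is to let a long-running job finish instead of looping forever, a variation along these lines may help. This is a sketch under assumptions: `run_job`, `on_term`, `term_received` and `job_status` are hypothetical names, the 2-second `sleep` stands in for the real workload, and the self-signal line exists only to simulate the agent's SIGTERM.

```shell
#!/bin/bash
# Hypothetical long-running job; replace with the real workload.
run_job() {
    sleep 2
    echo "job finished"
}

term_received=0
on_term() {
    # Only record the signal; the job is allowed to keep running.
    term_received=1
    echo "SIGTERM received, letting the job finish..."
}
trap 'on_term' SIGTERM

run_job &
job_pid=$!

# Demo only: simulate the agent's SIGTERM after one second.
(sleep 1; kill -TERM $$) &

# `wait` returns early (status above 128) when a trapped signal arrives,
# so keep waiting until the job itself is gone.
while true; do
    wait "$job_pid"
    job_status=$?
    kill -0 "$job_pid" 2>/dev/null || break
done

echo "job exit status: $job_status"
```

In a real entrypoint the demo line would be removed and the script would typically end with `exit "$job_status"`. The job still has to finish within ECS_CONTAINER_STOP_TIMEOUT, because SIGKILL cannot be trapped.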

Credits for the question to richardpen.