Why is mesos-slave asked to kill tasks for more than 2 hours after a slave restart?

Time: 2016-05-12 14:13:48

Tags: mesos marathon

I am running a Mesos cluster with three masters and four slaves in a cloud environment.

  • Mesos version: 0.28
  • Marathon version: 0.15.2

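I found that if I restart a slave that is running Docker tasks, the tasks stay in the staging state on that slave for more than 2 hours after the restart. Only after those 2 hours can Marathon launch the tasks on other slaves.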

When I check the logs, I can see the slave repeating "Asked to kill task" and "Ignoring kill task" for about 2 hours.

Does anyone know why Mesos keeps trying to kill the dead tasks for more than 2 hours?

Log after the restart:

May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199795   964 slave.cpp:1891] Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: W0511 10:12:18.199831   964 slave.cpp:2018] Ignoring kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b because the executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199872   964 slave.cpp:1891] Asked to kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
2 hours later

Log:

I0511 12:15:48.200348   963 slave.cpp:1891] Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
W0511 12:15:48.200409   963 slave.cpp:2018] Ignoring kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b because the executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
I0511 12:15:48.200429   963 slave.cpp:1891] Asked to kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
W0511 12:15:48.200438   963 slave.cpp:2018] Ignoring kill task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b because the executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated
I0511 12:15:51.485391   964 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59955 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
I0511 12:15:51.509351   965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59955 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
W0511 12:15:51.656379   960 slave.cpp:4979] Failed to get resource statistics for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
W0511 12:15:51.656409   960 slave.cpp:4979] Failed to get resource statistics for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
I0511 12:15:51.663321   965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59560 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
I0511 12:15:51.671294   965 http.cpp:190] HTTP GET for /slave(1)/state from 10.145.150.124:59560 with User-Agent='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36'
W0511 12:15:52.156903   962 slave.cpp:4979] Failed to get resource statistics for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
W0511 12:15:52.156941   962 slave.cpp:4979] Failed to get resource statistics for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
E0511 12:15:52.247448   962 slave.cpp:3773] Container '1ab25a1b-79fe-430b-9751-330586a1fbef' for executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed to start:  future discarded
E0511 12:15:52.247612   962 slave.cpp:3773] Container 'b2f5b385-444b-4174-9a1c-8ccd2d3184dc' for executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed to start:  future discarded
W0511 12:15:52.247642   962 composing.cpp:541] Container '1ab25a1b-79fe-430b-9751-330586a1fbef' is already destroyed
W0511 12:15:52.247660   962 composing.cpp:541] Container 'b2f5b385-444b-4174-9a1c-8ccd2d3184dc' is already destroyed
E0511 12:15:52.247704   962 slave.cpp:3870] Termination of executor 'docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed: Unknown container: 1ab25a1b-79fe-430b-9751-330586a1fbef
I0511 12:15:52.248374   962 slave.cpp:3002] Handling status update TASK_FAILED (UUID: b399e8ce-832c-4b06-a15f-3c155536b872) for task docker-registry.d1c20255-173f-11e6-b66e-d00dacb0c46b of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 from @0.0.0.0:0
E0511 12:15:52.248458   962 slave.cpp:3870] Termination of executor 'project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b' of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 failed: Unknown container: b2f5b385-444b-4174-9a1c-8ccd2d3184dc
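
To put a number on how long the kill requests keep arriving for each task, the slave log can be summarized with a short script. This is only a rough sketch: it assumes the glog-style line layout shown above, takes the year from the post date because glog does not record it, and the function name `kill_request_spans` is made up for illustration, not part of Mesos.

```python
import re
from datetime import datetime

# Matches the glog prefix (severity letter, MMDD, time) followed by the
# "Asked to kill task" message emitted by slave.cpp in the logs above.
KILL_RE = re.compile(
    r"[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ slave\.cpp:\d+\] "
    r"Asked to kill task (\S+) of framework (\S+)"
)


def kill_request_spans(lines, year=2016):
    """Return {task_id: (first_seen, last_seen, count)} for kill requests.

    `year` is an assumption: glog timestamps carry only month and day.
    """
    spans = {}
    for line in lines:
        match = KILL_RE.search(line)
        if match is None:
            continue
        mmdd, clock, task_id, _framework = match.groups()
        ts = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
        first, last, count = spans.get(task_id, (ts, ts, 0))
        spans[task_id] = (min(first, ts), max(last, ts), count + 1)
    return spans


# Two lines copied from the logs in this question (first and last sighting).
sample = [
    "May 11 10:12:18 euca-10-254-234-236 mesos-slave[824]: I0511 10:12:18.199795   964 slave.cpp:1891] "
    "Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b "
    "of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000",
    "I0511 12:15:48.200348   963 slave.cpp:1891] "
    "Asked to kill task project-hub_project-hub-backend.e764cc0d-173f-11e6-b66e-d00dacb0c46b "
    "of framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000",
]

for task_id, (first, last, count) in kill_request_spans(sample).items():
    print(task_id, count, last - first)
```

Feeding the full slave log (for example, the lines of the mesos-slave log file) into `kill_request_spans` instead of `sample` would show, per task, how many kill requests were received and over what time span.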

0 answers:

No answers yet