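I have a Marathon task that launches a Docker container. It always starts and then exits after about 10 seconds. I can't figure out why it exits. If I run the exact same Docker command line directly, the container stays running. I'd appreciate any debugging tips. I've attached the Mesos master, Mesos slave, kernel, and Marathon logs.
Mesos version: 0.28.1, Marathon version: 1.1.1
Task details: CPU: 0.5, memory: 32 MB, disk: 32 MB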
Mesos Slave Logs
I0729 13:27:57.324440 12645 slave.cpp:1361] Got assigned task basic-3.29606d60-5562-11e6-82c4-02010a552889 for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:57.327953 12644 gc.cpp:83] Unscheduling '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000' from gc
I0729 13:27:57.328632 12644 gc.cpp:83] Unscheduling '/tmp/mesos/meta/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000' from gc
I0729 13:27:57.329421 12645 slave.cpp:1480] Launching task basic-3.29606d60-5562-11e6-82c4-02010a552889 for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:57.330699 12645 paths.cpp:528] Trying to chown '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889/runs/4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8' to user 'root'
I0729 13:27:57.339612 12645 slave.cpp:5367] Launching executor basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889/runs/4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8'
I0729 13:27:57.341789 12645 slave.cpp:1698] Queuing task 'basic-3.29606d60-5562-11e6-82c4-02010a552889' for executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:57.350003 12652 docker.cpp:1041] Starting container '4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8' for executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' and framework '09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000'
E0729 13:27:58.646698 12651 slave.cpp:3773] Container '4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8' for executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 failed to start: Container exited on error: exited with status 1
I0729 13:27:58.648016 12645 slave.cpp:3879] Executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 has terminated with unknown status
I0729 13:27:58.648771 12645 slave.cpp:3002] Handling status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 from @0.0.0.0:0
W0729 13:27:58.650879 12645 docker.cpp:1302] Ignoring updating unknown container: 4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8
I0729 13:27:58.652117 12651 status_update_manager.cpp:320] Received status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.653165 12651 status_update_manager.cpp:824] Checkpointing UPDATE for status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.658000 12643 slave.cpp:3400] Forwarding the update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 to master@xx.xx.xx.xxx:5050
I0729 13:27:58.751868 12650 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.752244 12650 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.757303 12652 slave.cpp:3990] Cleaning up executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.757876 12650 gc.cpp:55] Scheduling '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889/runs/4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8' for gc 6.99999123038222days in the future
I0729 13:27:58.758280 12650 gc.cpp:55] Scheduling '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889' for gc 6.99999122602074days in the future
I0729 13:27:58.758572 12650 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889/runs/4e7f7e96-0318-4699-9fb1-8f0a4b3eb0e8' for gc 6.99999122381926days in the future
I0729 13:27:58.758597 12652 slave.cpp:4078] Cleaning up framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.758901 12650 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000/executors/basic-3.29606d60-5562-11e6-82c4-02010a552889' for gc 6.99999122165333days in the future
I0729 13:27:58.759013 12646 status_update_manager.cpp:282] Closing status update streams for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.759237 12650 gc.cpp:55] Scheduling '/tmp/mesos/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000' for gc 6.99999121670222days in the future
I0729 13:27:58.759496 12650 gc.cpp:55] Scheduling '/tmp/mesos/meta/slaves/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7/frameworks/09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000' for gc 6.99999121438518days in the future
Marathon Logs
[2016-07-29 13:27:56,985] INFO Received offers WANTED notification (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:56,986] INFO => revive offers NOW, canceling any scheduled revives (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:56,986] INFO 2 further revives still needed. Repeating reviveOffers according to --revive_offers_repetitions 3 (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:56,986] INFO => Schedule next revive at 2016-07-29T07:58:01.983Z in 4999 milliseconds, adhering to --min_revive_offers_interval 5000 (ms) (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,191] INFO Request Launch for task 'basic-3.29606d60-5562-11e6-82c4-02010a552889', version '2016-07-29T07:57:56.450Z'. 1 tasksToLaunch, 0 in flight, 0 confirmed. not backing off (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,193] INFO No tasks left to launch. Stop receiving offers for /basic-3, 2016-07-29T07:57:56.450Z (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,194] INFO removing matcher ActorOfferMatcher(Actor[akka://marathon/user/launchQueue/2/0-basic-3#-1947978475]) (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,194] INFO Received offers NOT WANTED notification, canceling 2 revives (mesosphere.marathon.core.flow.impl.ReviveOffersActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,221] INFO Processing LaunchEphemeral(LaunchedEphemeral(task [basic-3.29606d60-5562-11e6-82c4-02010a552889],AgentInfo(xx.xx.xx.xxx,Some(09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7),Buffer()),2016-07-29T07:57:56.450Z,Status(2016-07-29T07:57:57.163Z,None,None),Vector(31623))) for task [basic-3.29606d60-5562-11e6-82c4-02010a552889] (mesosphere.marathon.core.launcher.impl.OfferProcessorImpl:ForkJoinPool-2-worker-41)
[2016-07-29 13:27:57,221] INFO Finished processing b2501f80-c821-4b1a-9f67-c008739a4e06-O1. Matched 1 ops after 2 passes. cpus(*) 11.5; mem(*) 23106.0; disk(*) 4942.0; ports(*) 31000->31622,31624->32000 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:57,293] INFO receiveTaskUpdate: updating status of task [basic-3.29606d60-5562-11e6-82c4-02010a552889] (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-19)
[2016-07-29 13:27:57,312] INFO Task launch for 'task [basic-3.29606d60-5562-11e6-82c4-02010a552889]' was accepted. 0 tasksToLaunch, 0 in flight, 1 confirmed. not backing off (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:58,667] INFO Received status update for task basic-3.29606d60-5562-11e6-82c4-02010a552889: TASK_FAILED (Failed to launch container: Container exited on error: exited with status 1) (mesosphere.marathon.MarathonScheduler$$EnhancerByGuice$$9250fa1f:Thread-37)
[2016-07-29 13:27:58,723] INFO Removed app [/basic-3] from tracker (mesosphere.marathon.core.task.tracker.TaskTracker$TasksByApp$:marathon-akka.actor.default-dispatcher-3)
[2016-07-29 13:27:58,735] INFO receiveTaskUpdate: task [basic-3.29606d60-5562-11e6-82c4-02010a552889] finished (mesosphere.marathon.core.launchqueue.impl.AppTaskLauncherActor:marathon-akka.actor.default-dispatcher-3)
[2016-07-29 13:27:58,739] INFO Sending event notification for task [basic-3.29606d60-5562-11e6-82c4-02010a552889] of app [/basic-3]: TASK_FAILED (mesosphere.marathon.core.task.update.impl.steps.PostToEventStreamStepImpl$$EnhancerByGuice$$42a76347:marathon-akka.actor.default-dispatcher-3)
[2016-07-29 13:27:58,740] INFO Increasing delay. Task launch delay for [/basic-3] changed from [0 milliseconds] to [1 seconds]. (mesosphere.marathon.core.launchqueue.impl.RateLimiter$:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:58,743] INFO initiating a scale check for app [/basic-3] after task [basic-3.29606d60-5562-11e6-82c4-02010a552889] terminated (mesosphere.marathon.core.task.update.impl.steps.ScaleAppUpdateStepImpl$$EnhancerByGuice$$78bdbf3a:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:58,743] INFO schedulerActor: Actor[akka://marathon/user/MarathonScheduler#1328539013] (mesosphere.marathon.core.task.update.impl.steps.ScaleAppUpdateStepImpl$$EnhancerByGuice$$78bdbf3a:marathon-akka.actor.default-dispatcher-8)
[2016-07-29 13:27:58,747] WARN New task [task [basic-3.29606d60-5562-11e6-82c4-02010a552889]] failed during app /basic-3 scaling, queueing another task (mesosphere.marathon.upgrade.TaskStartActor:marathon-akka.actor.default-dispatcher-8)
Mesos Master Logs
I0729 13:27:56.987308 20184 master.cpp:3720] Processing REVIVE call for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704
I0729 13:27:56.987633 20184 hierarchical.cpp:988] Removed offer filters for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:56.989745 20176 master.cpp:5324] Sending 1 offers to framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704
I0729 13:27:57.314573 20183 master.cpp:3104] Processing ACCEPT call for offers: [ b2501f80-c821-4b1a-9f67-c008739a4e06-O1 ] on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 at slave(1)@xx.xx.xx.xxx:5051 (xx.xx.xx.xxx) for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704
I0729 13:27:57.321236 20180 master.hpp:177] Adding task basic-3.29606d60-5562-11e6-82c4-02010a552889 with resources cpus(*):0.5; mem(*):32; disk(*):32; ports(*):[31623-31623] on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 (xx.xx.xx.xxx)
I0729 13:27:57.321815 20180 master.cpp:3589] Launching task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704 with resources cpus(*):0.5; mem(*):32; disk(*):32; ports(*):[31623-31623] on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 at slave(1)@xx.xx.xx.xxx:5051 (xx.xx.xx.xxx)
I0729 13:27:57.554368 20177 master.cpp:5324] Sending 1 offers to framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704
I0729 13:27:57.561040 20181 master.cpp:3641] Processing DECLINE call for offers: [ b2501f80-c821-4b1a-9f67-c008739a4e06-O2 ] for framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704
W0729 13:27:58.653374 20181 master.cpp:4859] Ignoring unknown exited executor 'basic-3.29606d60-5562-11e6-82c4-02010a552889' of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 at slave(1)@xx.xx.xx.xxx:5051 (xx.xx.xx.xxx)
I0729 13:27:58.660586 20181 master.cpp:4763] Status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 from slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 at slave(1)@xx.xx.xx.xxx:5051 (xx.xx.xx.xxx)
I0729 13:27:58.660701 20181 master.cpp:4811] Forwarding status update TASK_FAILED (UUID: 7cb5dc80-8c90-4933-83a1-c68e4a1369bf) for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000
I0729 13:27:58.661497 20181 master.cpp:6421] Updating the state of task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (latest state: TASK_FAILED, status update state: TASK_FAILED)
I0729 13:27:58.749291 20181 master.cpp:3918] Processing ACKNOWLEDGE call 7cb5dc80-8c90-4933-83a1-c68e4a1369bf for task basic-3.29606d60-5562-11e6-82c4-02010a552889 of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 (marathon) at scheduler-88ef4a70-96ae-4294-a286-ba0a955b193a@xx.xx.xx.xxx:54704 on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7
I0729 13:27:58.749658 20181 master.cpp:6487] Removing task basic-3.29606d60-5562-11e6-82c4-02010a552889 with resources cpus(*):0.5; mem(*):32; disk(*):32; ports(*):[31623-31623] of framework 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-0000 on slave 09894fa6-e1fa-4d26-aaca-0ea6c8fb06da-S7 at slave(1)
Mesos Slave Kernel Logs
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.110000] IPv6: ADDRCONF(NETDEV_UP): vethe4b2f1c: link is not ready
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.110012] docker0: port 1(vethe4b2f1c) entered forwarding state
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.110028] docker0: port 1(vethe4b2f1c) entered forwarding state
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.111114] docker0: port 1(vethe4b2f1c) entered disabled state
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.505747] IPv6: ADDRCONF(NETDEV_CHANGE): vethe4b2f1c: link becomes ready
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.505810] docker0: port 1(vethe4b2f1c) entered forwarding state
Jul 29 13:28:07 test-lab-test-infra-reserved-4001279 kernel: [175212.505824] docker0: port 1(vethe4b2f1c) entered forwarding state
Jul 29 13:28:08 test-lab-test-infra-reserved-4001279 kernel: [175212.979072] docker0: port 1(vethe4b2f1c) entered disabled state
Jul 29 13:28:08 test-lab-test-infra-reserved-4001279 kernel: [175212.980744] device vethe4b2f1c left promiscuous mode
Jul 29 13:28:08 test-lab-test-infra-reserved-4001279 kernel: [175212.980783] docker0: port 1(vethe4b2f1c) entered disabled state
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.043575] aufs au_opts_verify:1570:docker[8836]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.081090] aufs au_opts_verify:1570:docker[8836]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.128271] aufs au_opts_verify:1570:docker[8836]: dirperm1 breaks the protection by the permission bits on the lower branch
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.149756] device vethb70e796 entered promiscuous mode
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.150008] IPv6: ADDRCONF(NETDEV_UP): vethb70e796: link is not ready
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.576723] IPv6: ADDRCONF(NETDEV_CHANGE): vethb70e796: link becomes ready
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.576805] docker0: port 1(vethb70e796) entered forwarding state
Jul 29 13:28:12 test-lab-test-infra-reserved-4001279 kernel: [175217.576819] docker0: port 1(vethb70e796) entered forwarding state
Jul 29 13:28:13 test-lab-test-infra-reserved-4001279 kernel: [175218.107877] docker0: port 1(vethb70e796) entered disabled state
Jul 29 13:28:13 test-lab-test-infra-reserved-4001279 kernel: [175218.109620] device vethb70e796 left promiscuous mode
Jul 29 13:28:13 test-lab-test-infra-reserved-4001279 kernel: [175218.109662] docker0: port 1(vethb70e796) entered disabled state