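I have a 3-node setup running Marathon, mesos-master, mesos-slave and ZooKeeper with the HA configuration enabled. I tested deploying a simple hello app with mesos-execute, and it worked as expected.
Since everything looked fine so far, I connected to Marathon and deployed a simple app to test it (echo "hello" >> /tmp/output.txt), but the app is stuck in the "Waiting" state.
What could be preventing Marathon from using Mesos resources for the deployment?
Logs from mesos-master: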
I0904 11:23:27.064332 19769 master.cpp:2813] Received SUBSCRIBE call for framework 'marathon' at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.064623 19769 master.cpp:2890] Subscribing framework marathon with checkpointing enabled and capabilities [ PARTITION_AWARE ]
I0904 11:23:27.064669 19769 master.cpp:6272] Updating info for framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.064697 19769 master.cpp:2994] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 failed over
I0904 11:23:27.065032 19770 hierarchical.cpp:342] Activated framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.065465 19770 master.cpp:7305] Sending 3 offers to framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.907865 19769 http.cpp:1115] HTTP GET for /files/read?_=1504517007920&jsonp=jQuery17109098185077823333_1504516979864&length=50000&offset=352538&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
I0904 11:23:28.916651 19768 http.cpp:1115] HTTP GET for /files/read?_=1504517008930&jsonp=jQuery17109098185077823333_1504516979865&length=50000&offset=353797&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
E0904 11:23:30.071293 19775 process.cpp:2450] Failed to shutdown socket with fd 39, address 192.168.40.159:58072: Transport endpoint is not connected
I0904 11:23:30.073277 19768 master.cpp:1430] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 disconnected
I0904 11:23:30.073307 19768 master.cpp:3160] Deactivating framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073485 19768 master.cpp:3137] Disconnecting framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073496 19768 master.cpp:1445] Giving framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 1weeks to failover
I0904 11:23:30.073519 19768 hierarchical.cpp:374] Deactivated framework cb16118a-2257-4020-a907-63aa6294e11b-0000
curl -XGET 'http://mesosphere2:8098/v2/queue?pretty' | jq
{
  "queue": [
    {
      "count": 1,
      "delay": {
        "timeLeftSeconds": 0,
        "overdue": true
      },
      "since": "2017-09-04T13:12:42.024Z",
      "processedOffersSummary": {
        "processedOffersCount": 12,
        "unusedOffersCount": 12,
        "lastUnusedOfferAt": "2017-09-04T13:14:52.554Z",
        "rejectSummaryLastOffers": [
          {
            "reason": "UnfulfilledRole",
            "declined": 3,
            "processed": 3
          },
          {
            "reason": "UnfulfilledConstraint",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "NoCorrespondingReservationFound",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientCpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientMemory",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientDisk",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientGpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientPorts",
            "declined": 0,
            "processed": 0
          }
        ],
        "rejectSummaryLaunchAttempt": [
          {
            "reason": "UnfulfilledRole",
            "declined": 12,
            "processed": 12
          },
          {
            "reason": "UnfulfilledConstraint",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "NoCorrespondingReservationFound",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientCpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientMemory",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientDisk",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientGpus",
            "declined": 0,
            "processed": 0
          },
          {
            "reason": "InsufficientPorts",
            "declined": 0,
            "processed": 0
          }
        ]
      },
      "app": {
        "id": "/test03",
        "acceptedResourceRoles": [
          "slave_public"
        ],
        "backoffFactor": 1.15,
        "backoffSeconds": 1,
        "container": {
          "type": "DOCKER",
          "docker": {
            "forcePullImage": false,
            "image": "laghao/hello-marathon",
            "network": "BRIDGE",
            "parameters": [],
            "portMappings": [
              {
                "containerPort": 80,
                "hostPort": 80,
                "labels": {},
                "protocol": "tcp",
                "servicePort": 10003
              }
            ],
            "privileged": false
          },
          "volumes": []
        },
        "cpus": 0.1,
        "disk": 0,
        "executor": "",
        "instances": 1,
        "labels": {},
        "maxLaunchDelaySeconds": 3600,
        "mem": 64,
        "gpus": 0,
        "portDefinitions": [
          {
            "port": 10003,
            "name": "default",
            "protocol": "tcp"
          }
        ],
        "requirePorts": false,
        "upgradeStrategy": {
          "maximumOverCapacity": 1,
          "minimumHealthCapacity": 1
        },
        "version": "2017-09-04T13:12:41.993Z",
        "versionInfo": {
          "lastScalingAt": "2017-09-04T13:12:41.993Z",
          "lastConfigChangeAt": "2017-09-04T13:12:41.993Z"
        },
        "killSelection": "YOUNGEST_FIRST",
        "unreachableStrategy": {
          "inactiveAfterSeconds": 300,
          "expungeAfterSeconds": 600
        }
      }
    }
  ]
}
Answer 0 (score: 0)
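An app stays in "Waiting" forever. This means that Marathon does not receive "resource offers" from Mesos that would allow it to start tasks of this application. The simplest cause is that there are not enough free resources in the cluster, or that another framework is hoarding all of them. You can check the available resources in the Mesos UI. Note that the required resources (such as CPU, memory, disk) all have to be available on a single host.
If you cannot find the solution yourself and you create a GitHub issue, please attach the output of the Mesos /state endpoint to the bug report so that we can inspect the available cluster resources.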
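To make the "single host" point concrete, here is a minimal Python sketch. It assumes the Mesos master /state output has a "slaves" list whose entries carry "hostname", "resources" and "used_resources" maps with "cpus", "mem" and "disk" keys; the function name agents_that_fit and the sample data are made up for illustration and are not taken from the cluster in the question.

```python
def agents_that_fit(state, cpus, mem, disk=0.0):
    """Return hostnames of agents whose own free resources cover the request."""
    fitting = []
    for agent in state.get("slaves", []):
        total = agent.get("resources", {})
        used = agent.get("used_resources", {})
        # Free resources are computed per agent, never summed across agents.
        free = {k: total.get(k, 0) - used.get(k, 0) for k in ("cpus", "mem", "disk")}
        if free["cpus"] >= cpus and free["mem"] >= mem and free["disk"] >= disk:
            fitting.append(agent.get("hostname"))
    return fitting


# Hypothetical, heavily trimmed /state sample (assumed field names, invented values).
sample_state = {
    "slaves": [
        {"hostname": "node1",
         "resources": {"cpus": 2, "mem": 1024, "disk": 5000},
         "used_resources": {"cpus": 2, "mem": 100, "disk": 0}},
        {"hostname": "node2",
         "resources": {"cpus": 2, "mem": 1024, "disk": 5000},
         "used_resources": {"cpus": 0.5, "mem": 1000, "disk": 0}},
    ]
}

# The cluster as a whole has spare CPU and memory, but no single agent has both.
print(agents_that_fit(sample_state, cpus=0.1, mem=64))
```

In this invented sample, node1 has memory but no CPU left and node2 has CPU but too little memory, so the app would stay in "Waiting" even though the cluster totals look sufficient.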
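In your case, there is a mismatch between the app's role requirements and the agents' roles. You can infer this from the UnfulfilledRole reason in the queue output.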
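The role check can be pictured with a small Python sketch. This is a simplified model rather than Marathon's actual matching code: the helper usable_roles is hypothetical, and the offered role "*" (unreserved resources) is an assumption, since the question does not show how the agents are configured.

```python
def usable_roles(offer_roles, accepted_roles):
    """Roles present in the offer that the app is willing to use (simplified model)."""
    return sorted(set(offer_roles) & set(accepted_roles))


# Taken from the app definition in the question.
app_accepted_roles = ["slave_public"]

# Assumption: the agents offer only unreserved resources, i.e. role "*".
offered_roles = ["*"]

# An empty result corresponds to the offer being declined with UnfulfilledRole.
print(usable_roles(offered_roles, app_accepted_roles))

# For comparison, an app that accepts "*" could use the same offer.
print(usable_roles(offered_roles, ["*"]))
```

Under that assumption, the intersection is empty for every offer, which is consistent with all 12 processed offers being declined for UnfulfilledRole while every Insufficient* counter stays at 0.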
Marathon 1.4 introduced information about stuck deployments. You can query /v2/queue to get statistics about declined offers.
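As a rough illustration of reading those statistics, the following Python sketch pulls the non-zero decline reasons out of a /v2/queue response. The function name declined_reasons is made up, and the embedded sample is a trimmed copy of the output in the question rather than a live API call.

```python
import json


def declined_reasons(queue_response):
    """Map app id -> {reason: declined count}, keeping only reasons that declined offers."""
    result = {}
    for item in queue_response.get("queue", []):
        summary = item.get("processedOffersSummary", {})
        reasons = {
            entry["reason"]: entry["declined"]
            for entry in summary.get("rejectSummaryLaunchAttempt", [])
            if entry["declined"] > 0
        }
        result[item["app"]["id"]] = reasons
    return result


# Trimmed copy of the /v2/queue output shown in the question.
sample = json.loads("""
{
  "queue": [
    {
      "processedOffersSummary": {
        "processedOffersCount": 12,
        "unusedOffersCount": 12,
        "rejectSummaryLaunchAttempt": [
          {"reason": "UnfulfilledRole", "declined": 12, "processed": 12},
          {"reason": "InsufficientCpus", "declined": 0, "processed": 0},
          {"reason": "InsufficientMemory", "declined": 0, "processed": 0}
        ]
      },
      "app": {"id": "/test03", "acceptedResourceRoles": ["slave_public"]}
    }
  ]
}
""")

print(declined_reasons(sample))  # {'/test03': {'UnfulfilledRole': 12}}
```

Filtering out the zero counters leaves only the reason that actually blocked the launch, which here is the role mismatch and not a lack of CPU, memory, disk or ports.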