MIT StarCluster error with too many nodes (more than 200)

Date: 2016-08-19 14:52:20

Tags: amazon-ec2 starcluster

Has anyone run into problems with cluster sizes larger than 200 nodes? Whenever I try, I get the following error:

7/dist-packages/boto/ec2/connection.py", line 585, in get_all_instances
   max_results=max_results)
 File "/usr/local/lib/python2.7/dist-packages/boto/ec2/connection.py", line 681, in get_all_reservations
   [('item', Reservation)], verb='POST')
 File "/usr/local/lib/python2.7/dist-packages/boto/connection.py", line 1186, in get_list
   raise self.ResponseError(response.status, response.reason, body)
EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>FilterLimitExceeded</Code><Message>The maximum number of filter values specified on a single call is 200</Message></Error></Errors><RequestID>290b6e93-22b4-4450-b487-64d9174d166e</RequestID></Response>

I am using the StarCluster development branch, 0.95.6, because it supports the newer c4 EC2 instances.

Below is my StarCluster config, with XXXX in place of private information:

####################################
## StarCluster Configuration File ##
####################################
[global]

DEFAULT_TEMPLATE=cluster

#############################################
## AWS Credentials and Connection Settings ##
#############################################
[aws info]

AWS_ACCESS_KEY_ID = XXXXXXX
AWS_SECRET_ACCESS_KEY = XXXXXXXX
AWS_USER_ID=XXXXXX

AWS_REGION_NAME = eu-east-1


###########################
## Defining EC2 Keypairs ##
###########################

[key mykey]
KEY_LOCATION= XXXXXX


################################
## Defining Cluster Templates ##
################################


[cluster cluster]
KEYNAME = mykey
CLUSTER_SIZE = 400
CLUSTER_USER = sgeadmin

CLUSTER_SHELL = bash

NODE_IMAGE_ID = ami-52a0c53b

NODE_INSTANCE_TYPE = c4.large

AVAILABILITY_ZONE = us-east-1a

VOLUMES = cluster, datastore

PERMISSIONS = ssh

SPOT_BID = 0.07



#############################
## Configuring EBS Volumes ##
#############################

[volume cluster]
VOLUME_ID = xxxxx
MOUNT_PATH = /home

[volume datastore]
VOLUME_ID = xxxxx
MOUNT_PATH = /data/


############################################
## Configuring Security Group Permissions ##
############################################

[permission ssh]
IP_PROTOCOL = tcp
FROM_PORT = 22
TO_PORT = 22

1 Answer:

Answer 0 (score: 1)

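This looks like a StarCluster bug triggered by a limit in the AWS API. StarCluster sends a filter list with more than 200 values, and AWS rejects the request. StarCluster needs a patch so that it splits the list across multiple requests.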
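
To illustrate the kind of patch I mean, here is a minimal sketch of the batching idea. It is not StarCluster's actual code: `chunked`, `get_all_instances_batched`, and the `fetch` callable are hypothetical names I made up for this example, and the only fact taken from the question is the limit of 200 filter values per call quoted in the error message.

```python
from itertools import islice

# Limit quoted in the FilterLimitExceeded message above.
MAX_FILTER_VALUES = 200


def chunked(items, size=MAX_FILTER_VALUES):
    """Yield successive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def get_all_instances_batched(fetch, filter_values, size=MAX_FILTER_VALUES):
    """Call `fetch` once per batch and concatenate the results.

    `fetch` is a hypothetical callable that takes a list of filter values
    (for example instance IDs) and returns a list of results. In a real
    patch it would wrap whichever EC2 call StarCluster makes.
    """
    results = []
    for batch in chunked(filter_values, size):
        results.extend(fetch(batch))
    return results
```

With boto 2, `fetch` could be something like `lambda ids: conn.get_all_instances(filters={'instance-id': ids})`, assuming `conn` is an existing EC2 connection and that `instance-id` is the filter StarCluster is using; I have not checked which filter it actually sends. Each request then stays at or under 200 filter values, at the cost of a few extra API calls for large clusters.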