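I'm having trouble deploying to an ECS cluster: the build succeeds, but the problem appears when the task is updated through CloudFormation. ECSService starts 6 new tasks in PENDING
while the 6 old tasks are still RUNNING.
Sometimes it starts draining the old tasks and the deployment works, but sometimes none of the old tasks drain and ECSService just stays in UPDATE_IN_PROGRESS.
How can I troubleshoot something like this?
Below is my stack template.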
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ElasticLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      SecurityGroups:
        - !Ref 'ELBSecurityGroup'
      Subnets:
        - !Ref 'InstanceSubnet'
        - !Ref 'SecondarySubnet'
      Scheme: internet-facing
  RedirectLoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn: ECSServiceRole
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref 'ECSTG'
      LoadBalancerArn: !Ref 'ElasticLoadBalancer'
      Port: '80'
      Protocol: HTTP
  RedirectLoadBalancerListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    DependsOn: RedirectLoadBalancerListener
    Properties:
      Actions:
        - Type: forward
          TargetGroupArn: !Ref 'ECSTG'
      Conditions:
        - Field: path-pattern
          Values:
            - /
      ListenerArn: !Ref 'RedirectLoadBalancerListener'
      Priority: '1'
  LoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn: ECSServiceRole
    Properties:
      Certificates:
        - CertificateArn: !Ref 'SSLCertificateId'
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref 'ECSTG'
      LoadBalancerArn: !Ref 'ElasticLoadBalancer'
      Port: '443'
      Protocol: HTTPS
  LoadBalancerListenerRule:
    Type: AWS::ElasticLoadBalancingV2::ListenerRule
    DependsOn: LoadBalancerListener
    Properties:
      Actions:
        - Type: forward
          TargetGroupArn: !Ref 'ECSTG'
      Conditions:
        - Field: path-pattern
          Values:
            - /
      ListenerArn: !Ref 'LoadBalancerListener'
      Priority: '1'
  ECSTG:
    DependsOn: ElasticLoadBalancer
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      HealthCheckIntervalSeconds: 6
      HealthCheckPath: /api/ping
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 5
      HealthyThresholdCount: 2
      Port: 80
      Protocol: HTTP
      UnhealthyThresholdCount: 5
      VpcId: !Ref 'VPCId'
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: '20'
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: AppSecurityGroup
      SecurityGroupIngress:
        - IpProtocol: '-1'
          FromPort: '-1'
          ToPort: '-1'
          SourceSecurityGroupId: !Ref 'ELBSecurityGroup'
      VpcId: !Ref 'VPCId'
  Route53Entry:
    Type: AWS::Route53::RecordSetGroup
    Properties:
      HostedZoneName: !Join ['', [!Ref 'Route53HostedZone', .]]
      Comment: Zone apex alias targeted to myELB LoadBalancer.
      RecordSets:
        - Name: !Join [., [!Ref 'ApplicationHost', !Ref 'Route53HostedZone']]
          Type: A
          AliasTarget:
            HostedZoneId: !GetAtt [ElasticLoadBalancer, CanonicalHostedZoneID]
            DNSName: !GetAtt [ElasticLoadBalancer, DNSName]
  ELBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: ELBSecurityGroup
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '443'
          ToPort: '443'
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: 0.0.0.0/0
      VpcId: !Ref 'VPCId'
  CloudWatchAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      ActionsEnabled: true
      AlarmActions:
        - arn:aws:sns:us-east-1:6xxxxxxx:instance-alarm
      ComparisonOperator: LessThanOrEqualToThreshold
      Dimensions:
        - Name: LoadBalancer
          Value: !GetAtt [ElasticLoadBalancer, LoadBalancerFullName]
        - Name: TargetGroup
          Value: !GetAtt [ECSTG, TargetGroupFullName]
      EvaluationPeriods: 5
      MetricName: HealthyHostCount
      Namespace: AWS/ApplicationELB
      Period: 60
      Statistic: Maximum
      Threshold: 0
  LowOnCreditAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      ActionsEnabled: true
      AlarmActions:
        - arn:aws:sns:us-east-1:6xxxxxx:instance-alarm
      ComparisonOperator: LessThanThreshold
      Dimensions:
        - Name: AutoScalingGroupName
          Value: !Ref 'AutoScalingGroup'
      EvaluationPeriods: 1
      MetricName: CPUCreditBalance
      Namespace: AWS/EC2
      Period: 300
      Statistic: Average
      Threshold: 15
  Database:
    Type: AWS::RDS::DBInstance
    Properties:
      AllocatedStorage: '5'
      DBInstanceClass: db.t2.micro
      Engine: postgres
      BackupRetentionPeriod: 35
      EngineVersion: 9.5.2
      DBName: !If [RestoreDB, '', ekdb]
      MasterUsername: !Ref 'DBUser'
      MasterUserPassword: !Ref 'DBPassword'
      DBSecurityGroups:
        - !Ref 'DatabaseSecurityGroup'
      DBSubnetGroupName: !Ref 'DatabaseSubnetGroup'
      DBSnapshotIdentifier: !Ref 'DBSnapshot'
    DeletionPolicy: Snapshot
  DatabaseSecurityGroup:
    Type: AWS::RDS::DBSecurityGroup
    Properties:
      GroupDescription: DatabaseSecurityGroup
      DBSecurityGroupIngress:
        - EC2SecurityGroupId: !Ref 'AppSecurityGroup'
      EC2VpcId: !Ref 'VPCId'
  Redis:
    Type: AWS::ElastiCache::CacheCluster
    Properties:
      CacheNodeType: cache.t2.micro
      Engine: redis
      EngineVersion: 2.8.24
      NumCacheNodes: 1
      VpcSecurityGroupIds:
        - !Ref 'RedisSecurityGroup'
      CacheSubnetGroupName: !Ref 'RedisSubnetGroup'
  RedisSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: RedisSecurityGroup
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '6379'
          ToPort: '6379'
          SourceSecurityGroupId: !Ref 'AppSecurityGroup'
      VpcId: !Ref 'VPCId'
  FrontendUser:
    Type: AWS::IAM::User
    Properties:
      Groups:
        - SynapseAppUsers
  BackendUser:
    Type: AWS::IAM::User
    Properties:
      Groups:
        - SynapseAppUsers
  FrontendUserAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref 'FrontendUser'
  BackendUserAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref 'BackendUser'
  S3BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref 'S3Bucket'
      PolicyDocument:
        Statement:
          - Action: s3:GetObject
            Effect: Allow
            Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
            Principal:
              AWS:
                - !GetAtt 'FrontendUser.Arn'
                - !GetAtt 'BackendUser.Arn'
          - Action: s3:PutObject
            Effect: Allow
            Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
            Principal:
              AWS:
                - !GetAtt 'BackendUser.Arn'
          - Action: s3:PutObjectAcl
            Effect: Allow
            Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
            Principal:
              AWS:
                - !GetAtt 'BackendUser.Arn'
          - Action:
              - s3:PutObjectAcl
              - s3:PutObject
              - s3:GetObject
              - s3:DeleteObject
            Effect: Allow
            Resource: !Sub 'arn:aws:s3:::${S3Bucket}/*'
            Principal:
              AWS:
                - arn:aws:iam::6xxxxxxx:user/filestack-v3-policy
  S3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: AuthenticatedRead
      CorsConfiguration:
        CorsRules:
          - AllowedHeaders:
              - '*'
            AllowedMethods:
              - GET
              - PUT
              - POST
            AllowedOrigins:
              - '*'
            ExposedHeaders:
              - ETag
            MaxAge: 3000
    DeletionPolicy: Retain
  AppIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ec2.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: app-iam-role
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - ecs:*
                  - ecr:*
                  - sns:*
                  - logs:*
                Resource: '*'
              - Effect: Allow
                Action:
                  - s3:PutObject
                  - s3:GetObject
                  - s3:PutObjectAcl
                  - s3:DeleteObject
                Resource: !GetAtt [S3Bucket, Arn]
  AppInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: /
      Roles:
        - !Ref 'AppIamRole'
  LaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      AssociatePublicIpAddress: true
      ImageId: !FindInMap [AWSRegionToAMI, !Ref 'AWS::Region', AMIID]
      InstanceType: !If [IsExclusive, t2.medium, m4.large]
      IamInstanceProfile: !Ref 'AppInstanceProfile'
      SecurityGroups:
        - !Ref 'AppSecurityGroup'
      UserData: !Base64
        Fn::Join:
          - ''
          - - "#!/bin/bash -xe\n"
            - echo ECS_CLUSTER=
            - !Ref 'ECSCluster'
            - " >> /etc/ecs/ecs.config\n"
            - "yum install -y aws-cfn-bootstrap\n"
            - '/opt/aws/bin/cfn-signal -e $? '
            - ' --stack '
            - !Ref 'AWS::StackName'
            - ' --resource AutoScalingGroup '
            - ' --region '
            - !Ref 'AWS::Region'
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: !Ref 'LaunchConfig'
      MinSize: 1
      MaxSize: 2
      DesiredCapacity: !If [IsExclusive, 1, 2]
      VPCZoneIdentifier:
        - !Ref 'InstanceSubnet'
      HealthCheckGracePeriod: 600
      HealthCheckType: ELB
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M
    UpdatePolicy:
      AutoScalingReplacingUpdate:
        WillReplace: 'true'
  DatabaseSubnetGroup:
    Type: AWS::RDS::DBSubnetGroup
    Properties:
      DBSubnetGroupDescription: Subnet Group for database
      SubnetIds:
        - !Ref 'SecondarySubnet'
        - !Ref 'InstanceSubnet'
  RedisSubnetGroup:
    Type: AWS::ElastiCache::SubnetGroup
    Properties:
      Description: Subnet Group for Redis
      SubnetIds:
        - !Ref 'SecondarySubnet'
        - !Ref 'InstanceSubnet'
  ECSCluster:
    Type: AWS::ECS::Cluster
  ECSService:
    DependsOn:
      - RedirectLoadBalancerListener
      - LoadBalancerListener
      - AutoScalingGroup
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref 'ECSCluster'
      DesiredCount: !If [IsExclusive, 2, 6]
      Role: !Ref 'ECSServiceRole'
      TaskDefinition: !Ref 'TaskDefinition'
      LoadBalancers:
        - ContainerName: nginx
          ContainerPort: '80'
          TargetGroupArn: !Ref 'ECSTG'
  ECSServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ecs.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - elasticloadbalancing:DeregisterInstancesFromLoadBalancer
                  - elasticloadbalancing:DeregisterTargets
                  - elasticloadbalancing:Describe*
                  - elasticloadbalancing:RegisterInstancesWithLoadBalancer
                  - elasticloadbalancing:RegisterTargets
                  - ec2:Describe*
                  - ec2:AuthorizeSecurityGroupIngress
                Resource: '*'
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ContainerDefinitions:
        - Name: frontend
          Memory: '256'
          MemoryReservation: '32'
          Image: !Sub '6xxxxxxx0.dkr.ecr.us-east-1.amazonaws.com/frontend:${ImageTag}'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[frontend]'
        - Name: backend
          Memory: '1024'
          MemoryReservation: '256'
          Links:
            - xray-daemon
          Environment:
            - Name: NODE_ENV
              Value: prod
            - Name: AWS_XRAY_DAEMON_ADDRESS
              Value: "xray-daemon:2000"
            - Name: APPLICATION_URL
              Value: !Sub 'https://${ApplicationHost}.${Route53HostedZone}'
            - Name: ACCOUNTS_TOKEN
              Value: !Ref AccountsToken
            - Name: ACCOUNTS_URL
              Value: !Ref 'AccountsUrl'
            - Name: HEAP_APPLICATION_ID
              Value: '3901275559'
            - Name: HUBSPOT_API_KEY
              Value: !Ref 'HubspotApiKey'
            - Name: USER_POOL
              Value: !Ref 'UserPool'
            - Name: POOL_CLIENTS
              Value: !Ref 'PoolClients'
            - Name: JWKS
              Value: !Ref 'JWKS'
            - Name: DATABASE_URL
              Value: !Sub
                - 'postgresql://${DBUser}:${DBPassword}@${Address}:${Port}/ekdb'
                - Address: !GetAtt [Database, Endpoint.Address]
                  Port: !GetAtt [Database, Endpoint.Port]
            - Name: REDIS_URL
              Value: !Sub
                - 'redis://${Address}:${Port}/'
                - Address: !GetAtt [Redis, RedisEndpoint.Address]
                  Port: !GetAtt [Redis, RedisEndpoint.Port]
            - Name: S3_FRONTEND_USER_ACCESS_KEY_ID
              Value: !Ref 'FrontendUserAccessKey'
            - Name: S3_FRONTEND_USER_SECRET
              Value: !GetAtt [FrontendUserAccessKey, SecretAccessKey]
            - Name: S3_BACKEND_USER_ACCESS_KEY_ID
              Value: !Ref 'BackendUserAccessKey'
            - Name: S3_BACKEND_USER_SECRET
              Value: !GetAtt [BackendUserAccessKey, SecretAccessKey]
            - Name: S3_BUCKET_NAME
              Value: !Ref 'S3Bucket'
            - Name: UPLOAD_STRATEGY
              Value: S3
            - Name: ACCOUNT_ID
              Value: !Ref 'AccountId'
            - Name: CHECK_ACCOUNT_ID
              Value: !Ref 'CheckAccountId'
            - Name: SNS_TOPIC_ARN
              Value: !Ref 'SNSTopicArn'
          Image: !Sub '6xxxxxx.dkr.ecr.us-east-1.amazonaws.com/backend:${ImageTag}'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[backend]'
        - Name: nginx
          Memory: '256'
          MemoryReservation: '32'
          Links:
            - frontend
            - backend
            - pdf_viewer
            - preview
          Image: !Sub '67xxxxxx.dkr.ecr.us-east-1.amazonaws.com/nginx:${ImageTag}'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[nginx]'
          PortMappings:
            - ContainerPort: 80
        - Name: pdf_viewer
          Memory: '256'
          MemoryReservation: '32'
          Image: !Sub '6xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/pdf_viewer:${ImageTag}'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[pdf_viewer]'
        - Name: preview
          Memory: '256'
          MemoryReservation: '32'
          Image: !Sub '6xxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/preview:${ImageTag}'
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[preview]'
        - Name: xray-daemon
          Memory: '256'
          MemoryReservation: '32'
          Image: 'amazon/aws-xray-daemon'
          PortMappings:
            - ContainerPort: 2000
              HostPort: 0
              Protocol: "udp"
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref 'ECSLogGroup'
              awslogs-region: !Ref 'AWS::Region'
              awslogs-stream-prefix: '[xray-daemon]'
  ECSLogGroup:
    Type: AWS::Logs::LogGroup
Parameters:
  CheckAccountId:
    Type: String
    Description: Should user's account id be checked while logging in to the instance?
    Default: 'yes'
  Route53HostedZone:
    Type: String
  SSLCertificateId:
    Type: String
    Description: Pass SSL id from AWS Certificate Manager to pass to ELB
  ApplicationHost:
    Type: String
    Description: 'Host to be applied as follows: {host}.{Route53HostedZone}'
  DBUser:
    Type: String
    Description: Username that the database should be accessible with
  DBPassword:
    Type: String
    Description: Password that the database user should have
  HtpasswdEntry:
    Type: String
    Description: This is the file that should be htpasswd entry file
  DBSnapshot:
    Type: String
    Description: Database Snapshot ID if you want to restore DB from snapshot
    Default: ''
  VPCId:
    Type: String
    Description: VPC Id to assosiate instance to. Pass this if you want to hide the instances behind pre-existing VPC
    Default: vpc-355a6b51
  InstanceSubnet:
    Type: String
    Description: Subnet on which the instance should be set up. Required if VPCId is set
    Default: subnet-beb826c8
  SecondarySubnet:
    Type: String
    Description: Subnet on which the RDS and ElastiCache group will be set up as well. Required if VPCId is set
    Default: subnet-04e39239
  AccountId:
    Type: String
    Description: AccountId. used to filter out users from Auth0
  AccountsUrl:
    Type: String
    Description: Accounts url eg. https://app.getsynapse.com/
  SNSTopicArn:
    Type: String
    Description: ARN of SNS Topic that will be use to communicate between different parts of the infrastructure
  HubspotApiKey:
    Type: String
    Description: Hubspot api key
  UserPool:
    Type: String
    Description: Cognito UserPool
  PoolClients:
    Type: String
    Description: Cognito PoolClients
  JWKS:
    Type: String
    Description: Cognito JWKS
  ImageTag:
    Type: String
    Description: Tag of docker images
  AccountsToken:
    Type: String
    Description: Token used for authenticating with Accounts
Conditions:
  RestoreDB: !Not [!Equals [!Ref 'DBSnapshot', '']]
  IsExclusive: !Not [!Equals [!Ref 'AccountId', N/a]]
Outputs:
  InstanceURL:
    Value: !Join ['', ["https://", !Ref 'ApplicationHost', ., !Ref 'Route53HostedZone']]
Mappings:
  AWSRegionToAMI:
    us-east-1:
      AMIID: ami-a7a242da
    us-east-2:
      AMIID: ami-b86a5ddd
    us-west-1:
      AMIID: none
    us-west-2:
      AMIID: none
    eu-west-1:
      AMIID: none
    eu-central-1:
      AMIID: none
    ap-northeast-1:
      AMIID: none
    ap-southeast-1:
      AMIID: none
    ap-southeast-2:
      AMIID: none
Answer 0 (score: 1)
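Based on the comments, the issue appears to be related to the MaximumPercent and MinimumHealthyPercent parameters and their default values of 200 and 100:
MaximumPercent: If a service is using the rolling update (ECS) deployment type, the maximum percent parameter represents an upper limit on the number of tasks in a service that are allowed in the RUNNING or PENDING state during a deployment.
MinimumHealthyPercent: If a service is using the rolling update (ECS) deployment type, the minimum healthy percent represents a lower limit on the number of tasks in a service that must remain in the RUNNING state during a deployment.
With the defaults of 200 and 100, a service of 6 tasks must keep all 6 old tasks RUNNING while the 6 new ones start, so up to 12 tasks run during the deployment. That seems to be too much for the container instances, which would explain why the new tasks sit in PENDING.
The suggested solution was to change the values to 150 and 50. For 6 tasks, that requires only 3 tasks to stay RUNNING and caps the service at 9 tasks, so ECS can stop 3 old tasks first and the deployment can proceed with about 6 tasks running (3 new, 3 old) until it completes.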
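As a minimal sketch of that change, the two values can be set through the DeploymentConfiguration property of the AWS::ECS::Service resource. The resource below repeats the ECSService definition from the question and adds only the DeploymentConfiguration block; the 150/50 values are the ones suggested above and are an assumption about what fits this cluster's capacity, not a verified setting.

```yaml
ECSService:
  DependsOn:
    - RedirectLoadBalancerListener
    - LoadBalancerListener
    - AutoScalingGroup
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref 'ECSCluster'
    DesiredCount: !If [IsExclusive, 2, 6]
    # Added: lets ECS stop up to half of the old tasks before the new ones are
    # running, and caps RUNNING + PENDING tasks at 150% of DesiredCount.
    DeploymentConfiguration:
      MaximumPercent: 150
      MinimumHealthyPercent: 50
    Role: !Ref 'ECSServiceRole'
    TaskDefinition: !Ref 'TaskDefinition'
    LoadBalancers:
      - ContainerName: nginx
        ContainerPort: '80'
        TargetGroupArn: !Ref 'ECSTG'
```

If deployments still stall with these values, the service's event list in the ECS console usually reports why tasks cannot be placed (for example, insufficient memory on the container instances), which is a reasonable first place to look.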