Question

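I'm having trouble deploying a Fargate cluster: the task fails on docker pull of the image with the error "CannotPullContainerError". I'm creating the stack with CloudFormation, which is not optional. The full stack is created, but it fails with the above error when it tries to start the task.

I've attached the CloudFormation stack file, which may highlight the problem, and I have double-checked that the subnets have a route to the NAT (below). I also SSH'd into an instance in the same subnet, and it was able to route externally. I'm wondering whether I have placed the required pieces incorrectly, i.e. the service and load balancer in the private subnets, or whether I should not be putting the internal LB in the same subnet?

This subnet is the one that currently has the placement, but all 3 subnets in the file have the same NAT setup.

Routable subnet (subnet-34b92250) * 0.0.0.0/0 -> nat-05a00385366da527a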
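
For reference, the route described above corresponds roughly to the CloudFormation resources below. This is only a sketch: the logical names and the route table ID `rtb-EXAMPLE` are hypothetical placeholders, since the actual route table is not shown here.

```yaml
Resources:
  # Hypothetical default route for the private subnet's route table
  PrivateDefaultRoute:
    Type: 'AWS::EC2::Route'
    Properties:
      RouteTableId: rtb-EXAMPLE            # placeholder, not a real ID
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: nat-05a00385366da527a
  # Associates that route table with the subnet mentioned above
  PrivateSubnetAssociation:
    Type: 'AWS::EC2::SubnetRouteTableAssociation'
    Properties:
      SubnetId: subnet-34b92250
      RouteTableId: rtb-EXAMPLE            # placeholder, not a real ID
```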

Cheers in advance.

YAML CloudFormation script:

AWSTemplateFormatVersion: 2010-09-09
Description: Cloudformation stack for the new GRPC endpoints within existing vpc/subnets and using fargate
Parameters:
  StackName:
    Type: String
    Default: cf-core-ci-grpc
    Description: The name of the parent Fargate networking stack that you created. Necessary
  vpcId:
    Type: String
    Default: vpc-0d499a68
    Description: The ID of the existing VPC to deploy the stack into. Necessary
Resources:
  CoreGrcpInstanceSecurityGroupOpenWeb:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupName: sgg-core-ci-grpc-ingress
      GroupDescription: Allow http to client host
      VpcId: !Ref vpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: 0.0.0.0/0
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: 0.0.0.0/0
  LoadBalancer:
    Type: 'AWS::ElasticLoadBalancingV2::LoadBalancer'
    DependsOn:
      - CoreGrcpInstanceSecurityGroupOpenWeb
    Properties:
      Name: lb-core-ci-int-grpc
      Scheme: internal
      Subnets:
      # # pub
      #   - subnet-f13995a8
      #   - subnet-f13995a8
      #   - subnet-f13995a8
      # pri
        - subnet-34b92250
        - subnet-82d85af4
        - subnet-ca379b93
      LoadBalancerAttributes:
        - Key: idle_timeout.timeout_seconds
          Value: '50'
      SecurityGroups:
        - !Ref CoreGrcpInstanceSecurityGroupOpenWeb
  TargetGroup:
    Type: 'AWS::ElasticLoadBalancingV2::TargetGroup'
    DependsOn:
      - LoadBalancer
    Properties:
      Name: tg-core-ci-grpc
      Port: 3000
      TargetType: ip
      Protocol: HTTP
      HealthCheckIntervalSeconds: 30
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 10
      HealthyThresholdCount: 4
      Matcher:
        HttpCode: '200'
      TargetGroupAttributes:
        - Key: deregistration_delay.timeout_seconds
          Value: '20'
      UnhealthyThresholdCount: 3
      VpcId: !Ref vpcId
  LoadBalancerListener:
    Type: 'AWS::ElasticLoadBalancingV2::Listener'
    DependsOn:
      - TargetGroup
    Properties:
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref TargetGroup
      LoadBalancerArn: !Ref LoadBalancer
      Port: 80
      Protocol: HTTP
  EcsCluster:
    Type: 'AWS::ECS::Cluster'
    DependsOn:
      - LoadBalancerListener
    Properties:
      ClusterName: ecs-core-ci-grpc
  EcsTaskRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                # - ecs.amazonaws.com
                - ecs-tasks.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: iam-policy-ecs-task-core-ci-grpc
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - 'ecr:*'
                Resource: '*'
  CoreGrcpTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    DependsOn:
      - EcsCluster
      - EcsTaskRole
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      ExecutionRoleArn: !Ref EcsTaskRole
      Cpu: '1024'
      Memory: '2048'
      ContainerDefinitions:
        - Name: container-core-ci-grpc
          Image: 'nginx:latest'
          Cpu: '256'
          Memory: '1024'
          PortMappings:
            - ContainerPort: '80'
              HostPort: '80'
          Essential: 'true'
  EcsService:
    Type: 'AWS::ECS::Service'
    DependsOn:
      - CoreGrcpTaskDefinition
    Properties:
      Cluster: !Ref EcsCluster
      LaunchType: FARGATE
      DesiredCount: '1'
      DeploymentConfiguration:
        MaximumPercent: 150
        MinimumHealthyPercent: 0
      LoadBalancers:
        - ContainerName: container-core-ci-grpc
          ContainerPort: '80'
          TargetGroupArn: !Ref TargetGroup
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          SecurityGroups:
            - !Ref CoreGrcpInstanceSecurityGroupOpenWeb
          Subnets:
            - subnet-34b92250
            - subnet-82d85af4
            - subnet-ca379b93
      TaskDefinition: !Ref CoreGrcpTaskDefinition
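
One detail in the template that may be worth checking: `CoreGrcpInstanceSecurityGroupOpenWeb` is also attached to the Fargate tasks, and it only allows outbound TCP on port 80, whereas image pulls from Docker Hub or ECR go over HTTPS on port 443. The fragment below is a sketch of the same security group with an extra egress rule for 443. It assumes the security group is what blocks the pull, which has not been verified against this stack.

```yaml
Resources:
  CoreGrcpInstanceSecurityGroupOpenWeb:
    Type: 'AWS::EC2::SecurityGroup'
    Properties:
      GroupName: sgg-core-ci-grpc-ingress
      GroupDescription: Allow http to client host
      VpcId: !Ref vpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
      SecurityGroupEgress:
        # Existing rule from the template: outbound HTTP
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        # Assumed addition: outbound HTTPS, used for registry image pulls
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
```

If the load balancer and the tasks need different rules, splitting them into two security groups may be cleaner than widening the shared one.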

Answer 1

Please define this policy on your ECR registry and attach the IAM role to your task.

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "new statement",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::99999999999:role/ecsEventsRole"
            },
            "Action": [
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "ecr:BatchCheckLayerAvailability",
                "ecr:PutImage",
                "ecr:InitiateLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:CompleteLayerUpload"
            ]
        }
    ]
}
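
Since the stack in the question is built with CloudFormation, a repository policy like the one above could probably be declared in the same template. The following is a minimal sketch, not a tested configuration: `ExampleRepository` and `example-repo` are hypothetical names, and it assumes the `EcsTaskRole` resource from the question's template is the role that pulls the image. Only the pull actions are kept here, on the assumption that the task does not need to push images.

```yaml
Resources:
  # Hypothetical repository; the name is a placeholder
  ExampleRepository:
    Type: 'AWS::ECR::Repository'
    Properties:
      RepositoryName: example-repo
      RepositoryPolicyText:
        Version: '2008-10-17'
        Statement:
          - Sid: AllowPull
            Effect: Allow
            Principal:
              # Assumes EcsTaskRole is defined in the same template
              AWS: !GetAtt EcsTaskRole.Arn
            Action:
              - 'ecr:GetDownloadUrlForLayer'
              - 'ecr:BatchGetImage'
              - 'ecr:BatchCheckLayerAvailability'
```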

Answer 2

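Unfortunately, at the time of writing, AWS Fargate only supports images hosted in ECR or in public repositories on Docker Hub, and does not support private repositories hosted on Docker Hub.
For more information: https://forums.aws.amazon.com/thread.jspa?threadID=268415

I ran into the same problem with AWS Fargate a few months ago. You have only two options right now:

1. Migrate your images to Amazon ECR.
2. Use AWS Batch with a custom AMI, where the custom AMI is built with the Docker Hub credentials in the ECS config (this is what we are using now).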
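
For the first option, the task definition would point at an ECR image URI instead of a Docker Hub image name. The sketch below shows roughly what that might look like. `ExampleTaskDefinition`, `example-repo`, and the `latest` tag are hypothetical, and `EcsTaskRole` is assumed to be the execution role from the question's template, with permission to pull from ECR.

```yaml
Resources:
  ExampleTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      # Assumes EcsTaskRole is defined in the same template
      ExecutionRoleArn: !GetAtt EcsTaskRole.Arn
      Cpu: '1024'
      Memory: '2048'
      ContainerDefinitions:
        - Name: container-core-ci-grpc
          # Placeholder ECR image URI; repository name and tag are hypothetical
          Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/example-repo:latest'
          PortMappings:
            - ContainerPort: 80
          Essential: true
```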

Fargate fails on docker pull in private subnet
