Terraform GKE issue pulling from private GCR

Date: 2020-05-01 22:39:24

Tags: terraform google-kubernetes-engine service-accounts terraform-provider-gcp gcr.io

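Since I could not get a standard GKE cluster to work through Terraform (see GKE permission issue on gcr.io with service account based on terraform), I have now created a cluster with a separate node pool. However, I still cannot pull a basic container from my private eu.gcr.io repository.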

My Terraform configuration looks as follows.

    resource "google_container_cluster" "primary" {
      name     = "gke-cluster"
      location = "${var.region}-a"

      node_locations = [
        "${var.region}-b",
        "${var.region}-c",
      ]

      network     = var.vpc_name
      subnetwork  = var.subnet_name

      remove_default_node_pool = true
      initial_node_count       = 1
      # minimum kubernetes version for master
      min_master_version = var.min_master_version

      master_auth {
        username = var.gke_master_user
        password = var.gke_master_pass
      }

    }

    resource "google_container_node_pool" "primary_preemptible_nodes" {
      name     = "gke-node-pool"
      location = "${var.region}-a"

      cluster     = google_container_cluster.primary.name
      version     = var.node_version
      node_count  = 3

      node_config {
        preemptible  = true

        metadata = {
          disable-legacy-endpoints = "true"
        }

        # based on project number
        service_account = "328126791642-compute@developer.gserviceaccount.com"

        oauth_scopes = [
          "https://www.googleapis.com/auth/compute",
          "https://www.googleapis.com/auth/devstorage.read_only",
          "https://www.googleapis.com/auth/logging.write",
          "https://www.googleapis.com/auth/monitoring",
        ]
      }
    }

Everything is created without problems. I then want to deploy to the cluster, and I create the deployment with the following yml file (deployment.yml):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: api-deployment
    spec:
      replicas: 1
      selector:
        matchLabels:
          component: api
      template:
        metadata:
          labels:
            component: api
        spec:
          containers:
          - name: api
            image: eu.gcr.io/project-dev/api:latest
            imagePullPolicy: Always
            ports:
            - containerPort: 5060

However, it keeps giving:

    Failed to pull image "eu.gcr.io/project-dev/api:latest": rpc error: code =
    Unknown desc = Error response from daemon: pull access denied for eu.gcr.io/project-dev/api,
    repository does not exist or may require 'docker login': denied: Permission denied for
    "latest" from request "/v2/project-dev/lcm_api/manifests/latest".

    Warning  Failed  94s (x2 over 111s)  kubelet, gke-cluster-dev-node-pool-90efd247-7vl4  Error: ErrImagePull

I have a Cloud Shell open in the Kubernetes cluster, and there

    docker pull eu.gcr.io/project-dev/api:latest

works fine.

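I am seriously running out of ideas here (and considering moving back to AWS). Could it have to do with the permissions used when pushing the container to eu.gcr.io?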

I use:

    docker login -u _json_key --password-stdin https://eu.gcr.io < /home/jeroen/.config/gcloud/tf_admin.json

locally, where tf_admin.json is the service account key of my admin project, which creates the infrastructure project. Then I push:

    docker push eu.gcr.io/project-dev/api:latest

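Another thought. From the documentation and other Stack Overflow questions (e.g. GKE - ErrImagePull pulling from Google Container Registry), having the right service account and OAuth scopes seems to be key. How can I check that the correct service account is used when pulling, and that the scopes are assigned correctly?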

1 answer:

Answer 0 (score: 3)

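It looks like the official Terraform example with OAuth scopes is outdated and should not be used. My solution was to grant all permissions through the OAuth scope and use IAM roles to manage access instead: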

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform",
    ]
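
With the broad cloud-platform scope in place, what the nodes may do is decided by the IAM roles held by the node service account. As a minimal sketch (not taken from the original configuration), the binding below grants that account read access to the storage behind the registry, and the two outputs expose the service account and scopes that Terraform has recorded for the node pool. The label node_gcr_reader, the output names, and var.project_id are hypothetical; var.project_id is assumed to be the project that hosts the eu.gcr.io images.

```hcl
# Hypothetical variable: the project that hosts the eu.gcr.io images.
variable "project_id" {
  type = string
}

# Container Registry keeps image layers in Cloud Storage, so read access
# to storage objects is what lets the nodes pull private images.
resource "google_project_iam_member" "node_gcr_reader" {
  project = var.project_id
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:328126791642-compute@developer.gserviceaccount.com"
}

# Expose the service account and scopes recorded for the node pool,
# so they can be checked after an apply.
output "node_service_account" {
  value = google_container_node_pool.primary_preemptible_nodes.node_config[0].service_account
}

output "node_oauth_scopes" {
  value = google_container_node_pool.primary_preemptible_nodes.node_config[0].oauth_scopes
}
```

After terraform apply, running terraform output node_oauth_scopes should list the scopes in effect, which is one way to confirm that the change reached the node pool. Granting the role at project level is the simplest option; a narrower binding on the registry bucket alone may be preferable where least privilege matters.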

You can also check this similar issue.