在GKE中使用cloudsql-proxy时无法获取令牌错误

时间:2019-01-04 22:53:50

标签: kubernetes google-cloud-platform google-cloud-sql google-kubernetes-engine cloud-sql-proxy

我正在使用启用了istio附加组件的GKE。使用websocket时,Myapp以某种方式给出503错误。我开始认为也许websocket在起作用,但是数据库连接却无法正常工作,并导致503的错误,因为cloudsql-proxy日志给出了错误:

$ kubectl logs myapp-54d6696fb4-bmp5m cloudsql-proxy
2019/01/04 21:56:47 using credential file for authentication; email=proxy-user@myproject.iam.gserviceaccount.com
2019/01/04 21:56:47 Listening on 127.0.0.1:5432 for myproject:europe-west4:mydatabase
2019/01/04 21:56:47 Ready for new connections
2019/01/04 21:56:51 New connection for "myproject:europe-west4:mydatabase"
2019/01/04 21:56:51 couldn't connect to "myproject:europe-west4:mydatabase": Post https://www.googleapis.com/sql/v1beta4/projects/myproject/instances/mydatabase/createEphemeral?alt=json: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: read tcp 10.44.11.21:60728->108.177.126.95:443: read: connection reset by peer
2019/01/04 22:14:56 New connection for "myproject:europe-west4:mydatabase"
2019/01/04 22:14:56 couldn't connect to "myproject:europe-west4:mydatabase": Post https://www.googleapis.com/sql/v1beta4/projects/myproject/instances/mydatabase/createEphemeral?alt=json: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: read tcp 10.44.11.21:36734->108.177.127.95:443: read: connection reset by peer

所需的身份验证详细信息应该在我创建的代理服务帐户的凭据中,并因此提供给:

{
  "type": "service_account",
  "project_id": "myproject",
  "private_key_id": "myprivekeyid",
  "private_key": "-----BEGIN PRIVATE KEY-----\MYPRIVATEKEY-----END PRIVATE KEY-----\n",
  "client_email": "proxy-user@myproject.iam.gserviceaccount.com",
  "client_id": "myclientid",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/proxy-user%40myproject.iam.gserviceaccount.com"
}

我的问题: 如何摆脱错误/从GKE获得正确的Google sql配置?

在创建集群时,我选择了mTLS的“ permissive”选项。

我的配置: myapp_and_router.yaml:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  ports:
  - port: 8089
    # 'name: http' apparently does not work
    name: db
  selector:
    app: myapp    
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: gcr.io/myproject/firstapp:v1
          imagePullPolicy: Always
          ports:
            - containerPort: 8089
          env:
            - name: POSTGRES_DB_HOST
              value: 127.0.0.1:5432
            - name: POSTGRES_DB_USER
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: username
            - name: POSTGRES_DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: password
          ## Custom healthcheck for Ingress
          readinessProbe:
            httpGet:
              path: /healthz
              scheme: HTTP
              port: 8089
            initialDelaySeconds: 5
            timeoutSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              scheme: HTTP
              port: 8089
            initialDelaySeconds: 5
            timeoutSeconds: 20             
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.11
          command: ["/cloud_sql_proxy",
                    "-instances=myproject:europe-west4:mydatabase=tcp:5432",
                    "-credential_file=/secrets/cloudsql/credentials.json"]
          securityContext:
            runAsUser: 2
            allowPrivilegeEscalation: false
          volumeMounts:
            - name: cloudsql-instance-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
      volumes:
        - name: cloudsql-instance-credentials
          secret:
            secretName: cloudsql-instance-credentials
---
###########################################################################
# Ingress resource (gateway)
##########################################################################
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: myapp-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      # 'name: http' apparently does not work
      name: db 
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - "*"
  gateways:
  - myapp-gateway
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: myapp
      weight: 100
    websocketUpgrade: true
---

编辑1:创建集群时,我没有为各种Google服务启用权限(范围),请参见here。在创建具有权限的新集群之后,我现在得到一个新的错误消息:

kubectl logs mypod cloudsql-proxy
2019/01/11 20:39:58 using credential file for authentication; email=proxy-user@myproject.iam.gserviceaccount.com
2019/01/11 20:39:58 Listening on 127.0.0.1:5432 for myproject:europe-west4:mydatabase
2019/01/11 20:39:58 Ready for new connections
2019/01/11 20:40:12 New connection for "myproject:europe-west4:mydatabase"
2019/01/11 20:40:12 couldn't connect to "myproject:europe-west4:mydatabase": Post https://www.googleapis.com/sql/v1beta4/projects/myproject/instances/mydatabase/createEphemeral?alt=json: oauth2: cannot fetch token: 400 Bad Request
Response: {
  "error": "invalid_grant",
  "error_description": "Invalid JWT Signature." 
}

编辑2:看起来新错误是由于服务帐户密钥不再有效引起的。制作完新文件后,我可以连接到数据库!

2 个答案:

答案 0 :(得分:2)

由于MySQL和PostgreSQL基于TCP / IP协议(或在特定情况下为unix套接字),而Postgres不使用HTTP,因此问题出在服务的端口名称上。

首先,尝试更改端口名称,例如,可以将其更改为“ db”。 另一个解决方法是使用jdbc Socket Factory连接到CloudSQL Mysql并稍有不同:

  • cloud-sql-instance服务条目中没有“地址字段” / CIDR块
  • 解决方案:允许连接到CloudSQL实例的服务条目的DNS

答案 1 :(得分:2)

我看到了类似的错误,但是通过创建以下服务条目(在https://github.com/istio/istio/issues/6593#issuecomment-420591213的帮助下),能够在我的GKE的istio群集中使cloudsql-proxy工作:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: google-apis
spec:
  hosts:
  - "*.googleapis.com"
  ports:
  - name: https
    number: 443
    protocol: HTTPS
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: cloudsql-instances
spec:
  hosts:
  # Use `gcloud sql instances list` to get the addresses of instances
  - 35.226.125.82
  ports:
  - name: tcp
    number: 3307
    protocol: TCP

此外,我仍然在初始化期间看到这些连接错误,直到我在应用程序启动中添加了延迟(运行服务器之前的sleep 10),以便istio-proxy和cloudsql-proxy容器有时间先设置。

编辑1:这里是有错误的日志,然后在一切正常的情况下成功显示“新连接/客户端关闭”行:

2019/01/10 21:54:38 New connection for "my-project:us-central1:my-db"
2019/01/10 21:54:38 Throttling refreshCfg(my-project:us-central1:my-db): it was only called 44.445553175s ago
2019/01/10 21:54:38 couldn't connect to "my-project:us-central1:my-db": Post https://www.googleapis.com/sql/v1beta4/projects/my-project/instances/my-db/createEphemeral?alt=json: oauth2: cannot fetch token: Post https://accounts.google.com/o/oauth2/token: dial tcp 108.177.112.84:443: getsockopt: connection refused
2019/01/10 21:54:38 New connection for "my-project:us-central1:my-db"
2019/01/10 21:54:38 Throttling refreshCfg(my-project:us-central1:my-db): it was only called 44.574562959s ago
2019/01/10 21:54:38 couldn't connect to "my-project:us-central1:my-db": Post https://www.googleapis.com/sql/v1beta4/projects/my-project/instances/my-db/createEphemeral?alt=json: oauth2: cannot fetch token: Post https://accounts.google.com/o/oauth2/token: dial tcp 108.177.112.84:443: getsockopt: connection refused
2019/01/10 21:55:15 New connection for "my-project:us-central1:my-db"
2019/01/10 21:55:16 Client closed local connection on 127.0.0.1:5432
2019/01/10 21:55:17 New connection for "my-project:us-central1:my-db"
2019/01/10 21:55:17 New connection for "my-project:us-central1:my-db"
2019/01/10 21:55:27 Client closed local connection on 127.0.0.1:5432
2019/01/10 21:55:28 New connection for "my-project:us-central1:my-db"
2019/01/10 21:55:30 Client closed local connection on 127.0.0.1:5432
2019/01/10 21:55:37 Client closed local connection on 127.0.0.1:5432
2019/01/10 21:55:38 New connection for "my-project:us-central1:my-db"
2019/01/10 21:55:40 Client closed local connection on 127.0.0.1:5432

编辑2:确保Cloud SQL api在您的集群范围之内。