Vault initialization fails

Time: 2020-10-28 03:57:55

Tags: kubernetes-helm consul hashicorp-vault

I am trying to initialize Vault on Kubernetes (EKS). It is deployed via Helm in HA mode, with TLS enabled and Consul as the storage backend.

Helm 3.0.2
Kubernetes 1.17
Vault 1.5.4
Vault Helm Chart Version 0.8.0
Consul 1.8.4
Consul Helm Chart Version 0.25.0

The cluster is deployed on AWS EKS via eksctl.

The error I am seeing

helm install vault hashicorp/vault -f values.yaml
kubectl exec -it vault-0 -- vault operator init
Error initializing: Put "https://127.0.0.1:8200/v1/sys/init": dial tcp 127.0.0.1:8200: connect: connection refused

My values.yaml

global:
  tlsDisable: false
injector:
  metrics:
    enabled: true
server:
  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: eks-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: eks-creds
      secretKey: AWS_SECRET_ACCESS_KEY
  extraVolumes:
    - type: secret
      name: vault-tls
    - type: secret
      name: eks-creds
  standalone:
    enable: false
  ha:
    enable: true
    config: |
      ui = true
      api_addr = "[::]:8200"
      cluster_addr = "[::]:8201"
      listener "tcp" {
        tls_disable = 0
        tls_cert_file = "/vault/userconfig/vault-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-tls/vault.ca"
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "consul" {
        path = "vault"
        address = "HOST_IP:8501"
      }
      disable_mlock = true
      service_registration "kubernetes" {}
ui:
  enabled: true

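I have tried changing api_addr, cluster_addr, address, and cluster_address to POD_IP, https://127.0.0.1:8200, vault.default.svc.cluster.local, https://0.0.0.0:8200, and a number of other variations, too many to keep track of. All of them produce the same problem. I exec'd into the container and checked the VAULT_ADDR env var; it always reflects what I set in the config. I even updated VAULT_ADDR in the container to the ClusterIP of the Vault Kubernetes service. So I doubt that this is the cause.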
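For reference, a minimal sketch of how the advertised addresses and the listener bind addresses usually differ, assuming the chart's placeholder substitution shown in the pod description further down (POD_IP and HOSTNAME are replaced by sed at startup). In Vault, api_addr and cluster_addr are full URLs that other nodes and clients are told to use, while the listener's address and cluster_address are bind addresses. This is unlikely to explain a refused connection on its own, since it does not change whether the listener starts.

```hcl
# Sketch only, not a confirmed fix for the error in the question.

# Advertised addresses: full URLs, including the scheme.
# POD_IP and HOSTNAME are placeholders that the chart's startup script
# substitutes with the pod's real values.
api_addr     = "https://POD_IP:8200"
cluster_addr = "https://HOSTNAME.vault-internal:8201"

listener "tcp" {
  # Bind addresses: host:port only, no scheme.
  address         = "[::]:8200"
  cluster_address = "[::]:8201"

  tls_disable   = 0
  tls_cert_file = "/vault/userconfig/vault-tls/vault.crt"
  tls_key_file  = "/vault/userconfig/vault-tls/vault.key"
}
```

The chart already sets VAULT_API_ADDR and VAULT_CLUSTER_ADDR in the pod environment, so leaving api_addr and cluster_addr out of the config entirely may be the simpler option.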

Some output from the Vault container

/ $ env | grep VAULT_ADDR
/ $ VAULT_ADDR=https://127.0.0.1:8200
/ $ env | grep VAULT_CACERT
/ $ export VAULT_CACERT="/vault/userconfig/vault-tls/vault.ca"
/ $ ls -lAh /vault/userconfig/vault-tls/
total 0
drwxr-sr-x    2 root     vault        100 Oct 27 18:21 ..2020_10_27_18_21_40.431431606
lrwxrwxrwx    1 root     root          31 Oct 27 18:21 ..data -> ..2020_10_27_18_21_40.431431606
lrwxrwxrwx    1 root     root          15 Oct 27 18:21 vault.ca -> ..data/vault.ca
lrwxrwxrwx    1 root     root          16 Oct 27 18:21 vault.crt -> ..data/vault.crt
lrwxrwxrwx    1 root     root          16 Oct 27 18:21 vault.key -> ..data/vault.key

I tried checking the open/listening ports inside the container

/ $ netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
/ $ 

This seems odd. But how would any of my override values cause this? So I suspect it is somehow intentional. But maybe not.

/ $ wget https://127.0.0.1:8200
Connecting to 127.0.0.1:8200 (127.0.0.1:8200)
wget: can't connect to remote host (127.0.0.1): Connection refused

kubectl port-forward vault-0 8200

curl -v https://localhost:8200
*   Trying ::1:8200...
* Connected to localhost (::1) port 8200 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to localhost:8200
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to localhost:8200

Here is some Kubernetes output

kubectl get pods
vault-0                                          0/1     Running   0          4s
vault-1                                          0/1     Running   0          4s
vault-2                                          0/1     Running   0          4s
vault-agent-injector-7bd76b66dd-8gjf4            1/1     Running   0          4s

kubectl get service
vault                                 ClusterIP   10.100.165.241   <none>        8200/TCP,8201/TCP                                                         2m1s
vault-active                          ClusterIP   10.100.22.197    <none>        8200/TCP,8201/TCP                                                         2m1s
vault-agent-injector-svc              ClusterIP   10.100.93.216    <none>        443/TCP                                                                   2m1s
vault-internal                        ClusterIP   None             <none>        8200/TCP,8201/TCP                                                         2m1s
vault-standby                         ClusterIP   10.100.96.141    <none>        8200/TCP,8201/TCP                                                         2m1s
vault-ui                              ClusterIP   10.100.219.196   <none>        8200/TCP                                                                  2m1s

kubectl logs vault-0

WARNING! Unable to read storage migration status.
2020-10-28T03:31:47.639Z [INFO]  proxy environment: http_proxy= https_proxy= no_proxy=
2020-10-28T03:31:47.640Z [WARN]  storage.consul: appending trailing forward slash to path
2020-10-28T03:31:47.640Z [WARN]  storage migration check error: error="Unexpected response code: 400"

kubectl describe pod vault-0
Name:         vault-0
Namespace:    default
Priority:     0
Node:         ip-172-16-3-151.ec2.internal/172.16.3.151
Start Time:   Tue, 27 Oct 2020 20:31:46 -0700
Labels:       app.kubernetes.io/instance=vault
              app.kubernetes.io/name=vault
              component=server
              controller-revision-hash=vault-5dd945bc6c
              helm.sh/chart=vault-0.8.0
              statefulset.kubernetes.io/pod-name=vault-0
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Running
IP:           172.16.3.19
IPs:
  IP:           172.16.3.19
Controlled By:  StatefulSet/vault
Containers:
  vault:
    Container ID:  docker://1c70a35dbb2c870d21616e825200bb766fc10ab8076cd70931305f56386891fb
    Image:         vault:1.5.4
    Image ID:      docker-pullable://vault@sha256:121c1eb16a474f5a4c1d92256184dae333ab7284f8c744d4e2754300f84f68f0
    Ports:         8200/TCP, 8201/TCP, 8202/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/sh
      -ec
    Args:
      cp /vault/config/extraconfig-from-values.hcl /tmp/storageconfig.hcl;
      [ -n "${HOST_IP}" ] && sed -Ei "s|HOST_IP|${HOST_IP?}|g" /tmp/storageconfig.hcl;
      [ -n "${POD_IP}" ] && sed -Ei "s|POD_IP|${POD_IP?}|g" /tmp/storageconfig.hcl;
      [ -n "${HOSTNAME}" ] && sed -Ei "s|HOSTNAME|${HOSTNAME?}|g" /tmp/storageconfig.hcl;
      [ -n "${API_ADDR}" ] && sed -Ei "s|API_ADDR|${API_ADDR?}|g" /tmp/storageconfig.hcl;
      [ -n "${TRANSIT_ADDR}" ] && sed -Ei "s|TRANSIT_ADDR|${TRANSIT_ADDR?}|g" /tmp/storageconfig.hcl;
      [ -n "${RAFT_ADDR}" ] && sed -Ei "s|RAFT_ADDR|${RAFT_ADDR?}|g" /tmp/storageconfig.hcl;
      /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl

    State:          Running
      Started:      Tue, 27 Oct 2020 20:31:47 -0700
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/bin/sh -ec vault status -tls-skip-verify] delay=5s timeout=3s period=5s #success=1 #failure=2
    Environment:
      HOST_IP:                 (v1:status.hostIP)
      POD_IP:                  (v1:status.podIP)
      VAULT_K8S_POD_NAME:     vault-0 (v1:metadata.name)
      VAULT_K8S_NAMESPACE:    default (v1:metadata.namespace)
      VAULT_ADDR:             https://127.0.0.1:8200
      VAULT_API_ADDR:         https://$(POD_IP):8200
      SKIP_CHOWN:             true
      SKIP_SETCAP:            true
      HOSTNAME:               vault-0 (v1:metadata.name)
      VAULT_CLUSTER_ADDR:     https://$(HOSTNAME).vault-internal:8201
      HOME:                   /home/vault
      AWS_ACCESS_KEY_ID:      <set to the key 'AWS_ACCESS_KEY_ID' in secret 'eks-creds'>      Optional: false
      AWS_SECRET_ACCESS_KEY:  <set to the key 'AWS_SECRET_ACCESS_KEY' in secret 'eks-creds'>  Optional: false
    Mounts:
      /home/vault from home (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from vault-token-kfmn5 (ro)
      /vault/config from config (rw)
      /vault/userconfig/eks-creds from userconfig-eks-creds (ro)
      /vault/userconfig/vault-tls from userconfig-vault-tls (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      vault-config
    Optional:  false
  userconfig-vault-tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vault-tls
    Optional:    false
  userconfig-eks-creds:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  eks-creds
    Optional:    false
  home:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  vault-token-kfmn5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vault-token-kfmn5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                   Message
  ----     ------     ----                 ----                                   -------
  Normal   Scheduled  2m16s                default-scheduler                      Successfully assigned default/vault-0 to ip-172-16-3-151.ec2.internal
  Normal   Pulled     2m15s                kubelet, ip-172-16-3-151.ec2.internal  Container image "vault:1.5.4" already present on machine
  Normal   Created    2m15s                kubelet, ip-172-16-3-151.ec2.internal  Created container vault
  Normal   Started    2m15s                kubelet, ip-172-16-3-151.ec2.internal  Started container vault
  Warning  Unhealthy  21s (x22 over 2m5s)  kubelet, ip-172-16-3-151.ec2.internal  Readiness probe failed: Error checking seal status: Get "https://127.0.0.1:8200/v1/sys/seal-status": dial tcp 127.0.0.1:8200: connect: connection refused

I port-forwarded to localhost and got a PR_END_OF_FILE_ERROR error in the browser.

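And I always get the same error. I understand why vault operator init is complaining. What I don't understand is why, given the Vault configuration, the port refuses connections. I have not found enough documentation, or reports from others with a similar setup and a similar error, to go on. In most cases where I found others hitting this problem, they only needed to export VAULT_ADDR to fix it. Any ideas?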
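One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the vault-0 log shows `storage migration check error: error="Unexpected response code: 400"`, and the storage stanza points at port 8501, which is conventionally Consul's HTTPS port. The Consul storage backend defaults to plain HTTP, and an HTTPS server typically answers a plain HTTP request with a 400. If Vault never gets past the storage check, it would not open its listener, which would be consistent with the empty netstat output. A sketch of a storage stanza that speaks HTTPS to Consul follows; the consul-ca secret and its mount path are hypothetical and do not appear in the original values.yaml.

```hcl
# Sketch only: Consul storage over HTTPS on port 8501.
storage "consul" {
  path    = "vault/"          # trailing slash avoids the warning seen in the log
  address = "HOST_IP:8501"    # HOST_IP is substituted by the chart's startup script
  scheme  = "https"           # the backend defaults to "http"

  # Hypothetical path: assumes the CA that signed the Consul agent's
  # certificate is mounted from a secret named consul-ca, added to
  # server.extraVolumes in the same way as vault-tls.
  tls_ca_file = "/vault/userconfig/consul-ca/ca.crt"
}
```

If the Consul agent also requires client certificates, the same stanza accepts tls_cert_file and tls_key_file.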

1 answer:

Answer 0 (score: 0)

Are you using awskms to unseal Vault? Check whether the IAM user/role on that pod can reach KMS.
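For context, the values.yaml in the question passes AWS credentials into the pod but does not show a seal stanza, so whether awskms is in use is an open question. If it were, the stanza would look roughly like the sketch below. The region and key ID are placeholders, not values from the question.

```hcl
# Sketch only: AWS KMS auto-unseal. Both values below are placeholders.
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "REPLACE_WITH_KMS_KEY_ID"
}
```

With this seal type, Vault picks up credentials from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the environment, and the IAM identity needs kms:Encrypt, kms:Decrypt, and kms:DescribeKey on the key.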