Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead

Date: 2019-05-09 17:42:38

Tags: kubernetes jupyterhub

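I have a manually built Kubernetes 1.11.4 cluster running on CentOS AWS EC2 instances, with 1 master and 1 minion. The cluster is very stable. I want to deploy JupyterHub to the cluster. The docs here and here call out some details for provisioning EFS. I chose to go with EBS instead.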

The PVC fails with:

Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l

Here is the StorageClass definition:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4

PV YAML:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: jupyterhub-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    volumeID: vol-0ddb700735db435c7
    fsType: ext4

PVC YAML:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: jupyterhub-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

$ kubectl -n jhub describe pvc hub-db-dir

returns:

Name:          hub-db-dir
Namespace:     jhub
StorageClass:  standard  <========from an earlier try
Status:        Pending
Volume:
Labels:        app=jupyterhub
               chart=jupyterhub-0.8.2
               component=hub
               heritage=Tiller
               release=jhub
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason              Age                     From                         Message
  ----       ------              ----                    ----                         -------
  Warning    ProvisioningFailed  110s (x106 over 3h43m)  persistentvolume-controller  Failed to provision volume with StorageClass "standard": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l

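To me this looks like the pod trying to mount the storage and failing. Isolating this error has been a challenge. I tried patching the PVC to change its storage class to gp2, which is now marked as the default class but was not when the PVC was deployed. The patch failed: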

$ kubectl -n jhub patch pvc hub-db-dir -p '{"spec":{"StorageClass":"gp2"}}'
persistentvolumeclaim/hub-db-dir patched (no change)
$ kubectl -n jhub describe pvc hub-db-dir
Name:          hub-db-dir
Namespace:     jhub
StorageClass:  standard  <====== Not changed
Status:        Pending
Volume:
Labels:        app=jupyterhub
               chart=jupyterhub-0.8.2
               component=hub
               heritage=Tiller
               release=jhub
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason              Age                      From                         Message
  ----       ------              ----                     ----                         -------
  Warning    ProvisioningFailed  2m26s (x108 over 3h48m)  persistentvolume-controller  Failed to provision volume with StorageClass "standard": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l
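
A note on the patch above: the PVC field is named `storageClassName`, not `StorageClass`, which would explain the `(no change)` result, since the key in the patch does not match any field on the claim. As far as I know, the storage class of an existing claim cannot be changed in place anyway, so the claim would likely have to be deleted and recreated. A minimal sketch for reading the real field and building a patch body with the correct name; the helper names `pvc_storage_class` and `storage_class_patch` are made up for illustration:

```shell
# Print the storage class recorded on a claim.
# Arguments: namespace, claim name.
pvc_storage_class() {
    kubectl -n "$1" get pvc "$2" -o jsonpath='{.spec.storageClassName}'
}

# Build a patch body that uses the actual field name.
# Argument: storage class name.
storage_class_patch() {
    printf '{"spec":{"storageClassName":"%s"}}' "$1"
}

# Example, using the names from the question:
#   pvc_storage_class jhub hub-db-dir
#   kubectl -n jhub patch pvc hub-db-dir -p "$(storage_class_patch gp2)"
```

If the second command is rejected as an immutable-field change, that confirms the claim has to be recreated rather than patched.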

The JupyterHub deployment is managed by Helm/Tiller, so whenever I make a change I update the pods with:

$ helm upgrade jhub jupyterhub/jupyterhub --version=0.8.2 -f config.yaml

The relevant sections of config.yaml for allocating user storage are:

proxy:
  secretToken: "<random value>"
# A YAML mapping may contain each key only once, so all singleuser
# settings are collected under a single key.
singleuser:
  cloudMetadata:
    enabled: true
  storage:
    dynamic:
      storageClass: gp2
    extraVolumes:
      - name: jupyterhub-pv
        persistentVolumeClaim:
          claimName: jupyterhub-pvc
    extraVolumeMounts:
      - name: jupyterhub-pv
        mountPath: /home/shared

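Part of the troubleshooting also focused on letting the cluster know that its resources are provided by AWS. To that end, in the Kubernetes configuration file: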

/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

I have the line:

Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws --cloud-config=/etc/kubernetes/cloud-config.conf"

where /etc/kubernetes/cloud-config.conf contains:

[Global]
KubernetesClusterTag=kubernetes
KubernetesClusterID=kubernetes

In the files kube-controller-manager.yaml and kube-apiserver.yaml, I added the following line:

- --cloud-provider=aws
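
The `ProvisioningFailed` event is reported by the persistentvolume-controller, which runs inside kube-controller-manager, so it is worth confirming that the running controller manager actually picked up the flag. A rough sketch, assuming a kubeadm-style cluster where the static pod carries the label `component=kube-controller-manager` (an assumption, as is the helper name `has_aws_cloud_provider`):

```shell
# Succeed when the text on stdin contains --cloud-provider=aws.
has_aws_cloud_provider() {
    grep -q -e '--cloud-provider=aws'
}

# Print the command line of the running controller manager pod(s).
controller_manager_command() {
    kubectl -n kube-system get pods -l component=kube-controller-manager \
        -o jsonpath='{.items[*].spec.containers[*].command}'
}

# Example:
#   controller_manager_command | has_aws_cloud_provider && echo "flag is set"
```

If the flag is missing from the running pod, the manifest edit may not have been picked up yet.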

I have not tagged any AWS resources yet, but will start doing so based on this.
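
For reference, the AWS cloud provider recognizes cluster resources by tag. A hedged sketch of tagging an instance with the AWS CLI, assuming the cluster ID `kubernetes` from cloud-config.conf above and the commonly used tag key `kubernetes.io/cluster/<cluster-id>`; the instance ID is a placeholder and `cluster_tag` is a hypothetical helper:

```shell
# Build the --tags argument for a given cluster ID.
# "owned" marks the resource as belonging to this cluster only.
cluster_tag() {
    printf 'Key=kubernetes.io/cluster/%s,Value=owned' "$1"
}

# Example (the instance ID is a placeholder):
#   aws ec2 create-tags --resources i-0123456789abcdef0 \
#       --tags "$(cluster_tag kubernetes)"
```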

What should the next troubleshooting step be?

Thanks!

1 Answer:

Answer 0 (score: 0)

Maybe this link can help?

  

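You must have the --cloud-provider=aws flag added to the Kubelet before adding the node to the cluster. Key to the AWS integration is a particular field on the Node object (the .spec.providerID field), and that field will only get populated if the flag is present when the node is added to the cluster. If you add a node to the cluster and then add the command-line flag afterward, this field/value won't get populated and the integration won't work as expected. No error is surfaced in this situation (at least, not that I've been able to find).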

     

If you do find yourself with a missing .spec.providerID field on the Node object, you can add it with a kubectl edit node command. The format of the value for this field is aws:///<az-of-instance>/<instance-id>.
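
The check and fix described in the quote can be sketched as follows. The node name, availability zone, and instance ID are placeholders, and `list_provider_ids` and `provider_id` are hypothetical helpers; `kubectl patch` is used here as a non-interactive stand-in for `kubectl edit node`:

```shell
# List each node with its providerID (blank when the field is missing).
list_provider_ids() {
    kubectl get nodes \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'
}

# Format a providerID value.
# Arguments: availability zone, instance ID.
provider_id() {
    printf 'aws:///%s/%s' "$1" "$2"
}

# Example (placeholders throughout):
#   list_provider_ids
#   kubectl patch node <node-name> \
#       -p "{\"spec\":{\"providerID\":\"$(provider_id us-east-1a i-0123456789abcdef0)\"}}"
```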