Question

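I use Helm to deploy charts on my Kubernetes cluster. Starting one day, I could no longer deploy a new chart or upgrade an existing one.

Every time I use Helm, I get an error message telling me that the resources cannot be installed or upgraded.

If I run helm install --name foo . -f values.yaml --namespace foo-namespace, I get this output:

Error: release foo failed: the server could not find the requested resource

If I run helm upgrade --install foo . -f values.yaml --namespace foo-namespace or helm upgrade foo . -f values.yaml --namespace foo-namespace, I get this error:

Error: UPGRADE FAILED: "foo" has no deployed releases

I don't really understand why.

This is my Helm version: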

Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}

On my Kubernetes cluster, Tiller is deployed with the same version, as shown by kubectl describe pods tiller-deploy-84b... -n kube-system:

Name:               tiller-deploy-84b8...
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               k8s-worker-1/167.114.249.216
Start Time:         Tue, 26 Feb 2019 10:50:21 +0100
Labels:             app=helm
                    name=tiller
                    pod-template-hash=84b...
Annotations:        <none>
Status:             Running
IP:                 <IP_NUMBER>
Controlled By:      ReplicaSet/tiller-deploy-84b8...
Containers:
  tiller:
    Container ID:   docker://0302f9957d5d83db22...
    Image:          gcr.io/kubernetes-helm/tiller:v2.12.3
    Image ID:       docker-pullable://gcr.io/kubernetes-helm/tiller@sha256:cab750b402d24d...
    Ports:          44134/TCP, 44135/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Tue, 26 Feb 2019 10:50:28 +0100
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:44135/liveness delay=1s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:44135/readiness delay=1s timeout=1s period=10s #success=1 #failure=3
    Environment:
      TILLER_NAMESPACE:    kube-system
      TILLER_HISTORY_MAX:  0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from helm-token-... (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  helm-token-...:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  helm-token-...
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                   Message
  ----    ------     ----  ----                   -------
  Normal  Scheduled  26m   default-scheduler      Successfully assigned kube-system/tiller-deploy-84b86cbc59-kxjqv to worker-1
  Normal  Pulling    26m   kubelet, k8s-worker-1  pulling image "gcr.io/kubernetes-helm/tiller:v2.12.3"
  Normal  Pulled     26m   kubelet, k8s-worker-1  Successfully pulled image "gcr.io/kubernetes-helm/tiller:v2.12.3"
  Normal  Created    26m   kubelet, k8s-worker-1  Created container
  Normal  Started    26m   kubelet, k8s-worker-1  Started container

Has anyone faced the same issue?

Update:

This is the folder structure of my chart foo:

> templates/
  > deployment.yaml 
  > ingress.yaml
  > service.yaml
> .helmignore
> Chart.yaml 
> values.yaml

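I have already tried to delete the chart with helm del --purge foo, but the same error occurred.

To be more precise, the chart foo is a custom chart that uses my own private registry. The ImagePullSecret is set up as usual.

I have run both of these commands, helm upgrade foo . -f values.yaml --namespace foo-namespace --force and helm upgrade --install foo . -f values.yaml --namespace foo-namespace --force, and I still get an error: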

UPGRADE FAILED
ROLLING BACK
Error: failed to create resource: the server could not find the requested resource
Error: UPGRADE FAILED: failed to create resource: the server could not find the requested resource

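Note that the namespace foo-namespace already exists, so the error does not come from the namespace name or from the namespace itself. Indeed, if I run helm list, I can see that the foo chart is in a FAILED status.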

Answer 1

I had the same issue, but cleaning up did not help, and neither did trying the same Helm chart on a brand-new k8s cluster.

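It turned out that a missing apiVersion caused the problem. I found it by running

helm install xyz --dry-run

copying the output to a new test.yaml file, and then running

kubectl apply -f test.yaml

There I could see the error (the apiVersion line had ended up inside a comment line).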
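The same check can be run offline, without applying anything to the cluster. The sketch below is a minimal example, not part of Helm: find_missing_api_version is a hypothetical helper that scans a file of rendered manifests and reports every YAML document that has content but no uncommented apiVersion line. It assumes the manifests are separated by --- lines and carry the "# Source:" comments that Helm writes when rendering templates. As far as I know, Helm 2 only prints the rendered manifests for --dry-run when --debug is added as well, so the usage comment relies on helm template instead.

```shell
#!/bin/sh
# find_missing_api_version FILE
# Hypothetical helper (not a Helm or kubectl command).
# Prints one line per YAML document in FILE that has content but no
# uncommented top-level "apiVersion:" key.
find_missing_api_version() {
  awk '
    function report() {
      if (has_content) {
        doc++
        if (!has_api) {
          printf "document %d (%s): no apiVersion\n", doc, (src != "" ? src : "unknown source")
        }
      }
      has_content = 0; has_api = 0; src = ""
    }
    /^---[ \t]*$/   { report(); next }             # document separator
    /^# Source: /   { src = substr($0, 11); next } # template path written by Helm
    /^[ \t]*(#|$)/  { next }                       # other comments and blank lines
    /^apiVersion:/  { has_api = 1 }
                    { has_content = 1 }
    END             { report() }
  ' "$1"
}

# Possible usage, assuming the chart is in the current directory:
#   helm template . -f values.yaml > test.yaml
#   find_missing_api_version test.yaml
```

An empty result only means that every document declares an apiVersion. It does not prove that the cluster actually serves that API version, so kubectl apply -f test.yaml is still the more complete test.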

Answer 2

Tiller stores all releases as ConfigMaps in Tiller's namespace (kube-system in your case). Try to find the broken release and delete its ConfigMap using the following commands:

$ kubectl get cm --all-namespaces -l OWNER=TILLER
NAMESPACE     NAME               DATA   AGE
kube-system   nginx-ingress.v1   1      22h

$ kubectl delete cm  nginx-ingress.v1 -n kube-system

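Next, manually delete all of the release's objects (Deployments, Services, Ingresses, etc.) and reinstall the release with Helm.

If that doesn't help, you can try downloading a newer version of Helm (2.13.1 at the time of writing) and updating or reinstalling Tiller.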
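The ConfigMap lookup above can be narrowed to a single release. This is a sketch under assumptions: to my knowledge Tiller labels each release ConfigMap with NAME and STATUS in addition to OWNER, but verify this on your cluster with kubectl get cm -n kube-system --show-labels before deleting anything. The function names and the TILLER_NAMESPACE default are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; they assume Tiller's ConfigMap storage backend and
# the NAME/STATUS labels described in the text above.
TILLER_NAMESPACE="${TILLER_NAMESPACE:-kube-system}"

# List the ConfigMaps Tiller keeps for one release (one per revision).
list_release_records() {
  kubectl get configmaps -n "$TILLER_NAMESPACE" -l "OWNER=TILLER,NAME=$1"
}

# Delete only the revisions of that release that are marked FAILED.
delete_failed_records() {
  kubectl delete configmaps -n "$TILLER_NAMESPACE" -l "OWNER=TILLER,NAME=$1,STATUS=FAILED"
}

# Possible usage for the release from the question:
#   list_release_records foo
#   delete_failed_records foo
```

Listing first and deleting second keeps the destructive step explicit. Deleting by label selector removes every matching revision at once.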

Answer 3

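I had the same problem, but not because of a broken release. It started after I upgraded Helm: the newer version of Helm seems to behave badly with the --wait parameter. So for anyone facing the same issue: simply removing --wait from the helm upgrade parameters, while keeping --debug, solved it for me.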

Answer 4

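I ran into this issue when I tried to deploy a custom chart that uses a CronJob instead of a Deployment. The error occurred at this step of the deployment script. To work around it, add the environment variable ROLLOUT_STATUS_DISABLED=true, as described in this issue.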

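Helm install or upgrade release failed on Kubernetes cluster: the server could not find the requested resource or UPGRADE FAILED: no deployed releases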
