Pods cannot reach the API at https://api_service_cluster_ip:443, and some of my pods cannot reach the API at https://api_service_cluster_ip:6443

Date: 2019-03-26 17:35:13

Tags: kubernetes

I am deploying the following interactive pod:

kubectl run -i -t centos7interactive2 --restart=Never --image=centos:7 /bin/bash

I then try to curl the API server from inside the pod:

curl -k https://10.96.0.1:6443/api/v1

This fails (hangs) from the pod scheduled on chad:

[root@togo ~]# kubectl describe pod centos7interactive2
Name:               centos7interactive2
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               chad.corp.sensis.com/10.93.98.23
Start Time:         Tue, 26 Mar 2019 13:29:15 -0400
Labels:             run=centos7interactive2
Annotations:        <none>
Status:             Running
IP:                 10.96.2.7
Containers:
  centos7interactive2:
    Container ID:  docker://8b7e301b8e8e2d091bdce641be81cc4dc1413ebab47889fec8102175d399e038
    Image:         centos:7
    Image ID:      docker-pullable://centos@sha256:8d487d68857f5bc9595793279b33d082b03713341ddec91054382641d14db861
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/bash
    State:          Running
      Started:      Tue, 26 Mar 2019 13:29:16 -0400
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-k2vv5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-k2vv5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-k2vv5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                           Message
  ----    ------     ----  ----                           -------
  Normal  Scheduled  56s   default-scheduler              Successfully assigned default/centos7interactive2 to chad.corp.sensis.com
  Normal  Pulled     55s   kubelet, chad.corp.sensis.com  Container image "centos:7" already present on machine
  Normal  Created    55s   kubelet, chad.corp.sensis.com  Created container
  Normal  Started    55s   kubelet, chad.corp.sensis.com  Started container

This pod also cannot ping 10.96.0.1.

If I create the interactive centos pod again, it gets scheduled on qatar:

[root@togo ~]# kubectl describe pod centos7interactive2
Name:               centos7interactive2
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               qatar.corp.sensis.com/10.93.98.36
Start Time:         Tue, 26 Mar 2019 13:36:23 -0400
Labels:             run=centos7interactive2
Annotations:        <none>
Status:             Running
IP:                 10.96.1.11
Containers:
  centos7interactive2:
    Container ID:  docker://cfc95172944dcd4d643e68ff761f73d32ff1435d674769ddc38da44847a4af88
    Image:         centos:7
    Image ID:      docker-pullable://centos@sha256:8d487d68857f5bc9595793279b33d082b03713341ddec91054382641d14db861
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/bash
    State:          Running
      Started:      Tue, 26 Mar 2019 13:36:24 -0400
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-k2vv5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-k2vv5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-k2vv5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                            Message
  ----    ------     ----  ----                            -------
  Normal  Scheduled  8s    default-scheduler               Successfully assigned default/centos7interactive2 to qatar.corp.sensis.com
  Normal  Pulled     7s    kubelet, qatar.corp.sensis.com  Container image "centos:7" already present on machine
  Normal  Created    7s    kubelet, qatar.corp.sensis.com  Created container
  Normal  Started    7s    kubelet, qatar.corp.sensis.com  Started container

In this case the pod has no problem pinging or curling 10.96.0.1:

[root@centos7interactive2 /]# curl -k https://10.96.0.1:6443/api/v1/
{
  "kind": "APIResourceList",
  "groupVersion": "v1",
  "resources": [
    {
      "name": "bindings",
      "singularName": "",
      "namespaced": true,
      "kind": "Binding",
      "verbs": [
        "create"
      ]
    },
    {
      "name": "componentstatuses",
      "singularName": "",
      "namespaced": false,
      "kind": "ComponentStatus",
      "verbs": [
        "get",
        "list"
      ],
      "shortNames": [
        "cs"
      ]
    },
    {
      "name": "configmaps",
      "singularName": "",
      "namespaced": true,
      "kind": "ConfigMap",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "cm"
      ]
    },
    {
      "name": "endpoints",
      "singularName": "",
      "namespaced": true,
      "kind": "Endpoints",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "ep"
      ]
    },
    {
      "name": "events",
      "singularName": "",
      "namespaced": true,
      "kind": "Event",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "ev"
      ]
    },
    {
      "name": "limitranges",
      "singularName": "",
      "namespaced": true,
      "kind": "LimitRange",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "limits"
      ]
    },
    {
      "name": "namespaces",
      "singularName": "",
      "namespaced": false,
      "kind": "Namespace",
      "verbs": [
        "create",
        "delete",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "ns"
      ]
    },
    {
      "name": "namespaces/finalize",
      "singularName": "",
      "namespaced": false,
      "kind": "Namespace",
      "verbs": [
        "update"
      ]
    },
    {
      "name": "namespaces/status",
      "singularName": "",
      "namespaced": false,
      "kind": "Namespace",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "nodes",
      "singularName": "",
      "namespaced": false,
      "kind": "Node",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "no"
      ]
    },
    {
      "name": "nodes/proxy",
      "singularName": "",
      "namespaced": false,
      "kind": "NodeProxyOptions",
      "verbs": [
        "create",
        "delete",
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "nodes/status",
      "singularName": "",
      "namespaced": false,
      "kind": "Node",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "persistentvolumeclaims",
      "singularName": "",
      "namespaced": true,
      "kind": "PersistentVolumeClaim",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "pvc"
      ]
    },
    {
      "name": "persistentvolumeclaims/status",
      "singularName": "",
      "namespaced": true,
      "kind": "PersistentVolumeClaim",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "persistentvolumes",
      "singularName": "",
      "namespaced": false,
      "kind": "PersistentVolume",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "pv"
      ]
    },
    {
      "name": "persistentvolumes/status",
      "singularName": "",
      "namespaced": false,
      "kind": "PersistentVolume",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "pods",
      "singularName": "",
      "namespaced": true,
      "kind": "Pod",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "po"
      ],
      "categories": [
        "all"
      ]
    },
    {
      "name": "pods/attach",
      "singularName": "",
      "namespaced": true,
      "kind": "PodAttachOptions",
      "verbs": [
        "create",
        "get"
      ]
    },
    {
      "name": "pods/binding",
      "singularName": "",
      "namespaced": true,
      "kind": "Binding",
      "verbs": [
        "create"
      ]
    },
    {
      "name": "pods/eviction",
      "singularName": "",
      "namespaced": true,
      "group": "policy",
      "version": "v1beta1",
      "kind": "Eviction",
      "verbs": [
        "create"
      ]
    },
    {
      "name": "pods/exec",
      "singularName": "",
      "namespaced": true,
      "kind": "PodExecOptions",
      "verbs": [
        "create",
        "get"
      ]
    },
    {
      "name": "pods/log",
      "singularName": "",
      "namespaced": true,
      "kind": "Pod",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "pods/portforward",
      "singularName": "",
      "namespaced": true,
      "kind": "PodPortForwardOptions",
      "verbs": [
        "create",
        "get"
      ]
    },
    {
      "name": "pods/proxy",
      "singularName": "",
      "namespaced": true,
      "kind": "PodProxyOptions",
      "verbs": [
        "create",
        "delete",
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "pods/status",
      "singularName": "",
      "namespaced": true,
      "kind": "Pod",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "podtemplates",
      "singularName": "",
      "namespaced": true,
      "kind": "PodTemplate",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ]
    },
    {
      "name": "replicationcontrollers",
      "singularName": "",
      "namespaced": true,
      "kind": "ReplicationController",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "rc"
      ],
      "categories": [
        "all"
      ]
    },
    {
      "name": "replicationcontrollers/scale",
      "singularName": "",
      "namespaced": true,
      "group": "autoscaling",
      "version": "v1",
      "kind": "Scale",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "replicationcontrollers/status",
      "singularName": "",
      "namespaced": true,
      "kind": "ReplicationController",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "resourcequotas",
      "singularName": "",
      "namespaced": true,
      "kind": "ResourceQuota",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "quota"
      ]
    },
    {
      "name": "resourcequotas/status",
      "singularName": "",
      "namespaced": true,
      "kind": "ResourceQuota",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "secrets",
      "singularName": "",
      "namespaced": true,
      "kind": "Secret",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ]
    },
    {
      "name": "serviceaccounts",
      "singularName": "",
      "namespaced": true,
      "kind": "ServiceAccount",
      "verbs": [
        "create",
        "delete",
        "deletecollection",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "sa"
      ]
    },
    {
      "name": "services",
      "singularName": "",
      "namespaced": true,
      "kind": "Service",
      "verbs": [
        "create",
        "delete",
        "get",
        "list",
        "patch",
        "update",
        "watch"
      ],
      "shortNames": [
        "svc"
      ],
      "categories": [
        "all"
      ]
    },
    {
      "name": "services/proxy",
      "singularName": "",
      "namespaced": true,
      "kind": "ServiceProxyOptions",
      "verbs": [
        "create",
        "delete",
        "get",
        "patch",
        "update"
      ]
    },
    {
      "name": "services/status",
      "singularName": "",
      "namespaced": true,
      "kind": "Service",
      "verbs": [
        "get",
        "patch",
        "update"
      ]
    }
  ]
}

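In this case I can reach 10.96.0.1 without trouble. Both nodes look healthy, yet one of them consistently prevents my pods from reaching the master via its ClusterIP address.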

[root@togo work]# kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://10.93.98.204:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

My cluster looks healthy.

[root@togo work]# kubectl get all --all-namespaces
NAMESPACE     NAME                                               READY   STATUS    RESTARTS   AGE
kube-system   pod/coredns-86c58d9df4-jjgpn                       1/1     Running   1          5d22h
kube-system   pod/coredns-86c58d9df4-n6lcv                       1/1     Running   1          5d22h
kube-system   pod/etcd-togo.corp.sensis.com                      1/1     Running   1          5d22h
kube-system   pod/kube-apiserver-togo.corp.sensis.com            1/1     Running   1          5d22h
kube-system   pod/kube-controller-manager-togo.corp.sensis.com   1/1     Running   1          5d22h
kube-system   pod/kube-flannel-ds-amd64-6759k                    1/1     Running   0          26h
kube-system   pod/kube-flannel-ds-amd64-fxpv9                    1/1     Running   1          5d22h
kube-system   pod/kube-flannel-ds-amd64-n6zk9                    1/1     Running   0          5d22h
kube-system   pod/kube-flannel-ds-amd64-rbbms                    1/1     Running   0          26h
kube-system   pod/kube-flannel-ds-amd64-shqnr                    1/1     Running   1          5d22h
kube-system   pod/kube-flannel-ds-amd64-tqkgw                    1/1     Running   0          26h
kube-system   pod/kube-proxy-h9jpr                               1/1     Running   1          5d22h
kube-system   pod/kube-proxy-m567z                               1/1     Running   0          26h
kube-system   pod/kube-proxy-t6swp                               1/1     Running   0          26h
kube-system   pod/kube-proxy-tlfjd                               1/1     Running   0          26h
kube-system   pod/kube-proxy-vzdpl                               1/1     Running   1          5d22h
kube-system   pod/kube-proxy-xn5dv                               1/1     Running   0          5d22h
kube-system   pod/kube-scheduler-togo.corp.sensis.com            1/1     Running   1          5d22h
kube-system   pod/tiller-deploy-5b7c66d59c-k9xkv                 1/1     Running   1          5d22h

NAMESPACE     NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
default       service/kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP         5d22h
kube-system   service/kube-dns        ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP   5d22h
kube-system   service/tiller-deploy   ClusterIP   10.105.40.102   <none>        44134/TCP       5d22h

NAMESPACE     NAME                                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
kube-system   daemonset.apps/kube-flannel-ds-amd64     6         6         6       6            6           beta.kubernetes.io/arch=amd64     5d22h
kube-system   daemonset.apps/kube-flannel-ds-arm       0         0         0       0            0           beta.kubernetes.io/arch=arm       5d22h
kube-system   daemonset.apps/kube-flannel-ds-arm64     0         0         0       0            0           beta.kubernetes.io/arch=arm64     5d22h
kube-system   daemonset.apps/kube-flannel-ds-ppc64le   0         0         0       0            0           beta.kubernetes.io/arch=ppc64le   5d22h
kube-system   daemonset.apps/kube-flannel-ds-s390x     0         0         0       0            0           beta.kubernetes.io/arch=s390x     5d22h
kube-system   daemonset.apps/kube-proxy                6         6         6       6            6           <none>                            5d22h

NAMESPACE     NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns         2/2     2            2           5d22h
kube-system   deployment.apps/tiller-deploy   1/1     1            1           5d22h

NAMESPACE     NAME                                       DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/coredns-86c58d9df4         2         2         2       5d22h
kube-system   replicaset.apps/tiller-deploy-5b7c66d59c   1         1         1       5d22h
[root@togo work]# kubectl get nodes -o wide
NAME                    STATUS   ROLES    AGE     VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
benin.corp.sensis.com   Ready    <none>   26h     v1.13.4   10.93.97.123    <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64        docker://18.9.3
chad.corp.sensis.com    Ready    <none>   5d22h   v1.13.4   10.93.98.23     <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.3
qatar.corp.sensis.com   Ready    <none>   5d22h   v1.13.4   10.93.98.36     <none>        CentOS Linux 7 (Core)   3.10.0-957.10.1.el7.x86_64   docker://18.9.3
spain.corp.sensis.com   Ready    <none>   26h     v1.13.4   10.93.103.236   <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64        docker://18.9.3
togo.corp.sensis.com    Ready    master   5d22h   v1.13.4   10.93.98.204    <none>        CentOS Linux 7 (Core)   3.10.0-957.5.1.el7.x86_64    docker://18.9.3
tonga.corp.sensis.com   Ready    <none>   26h     v1.13.4   10.93.97.202    <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64        docker://18.9.3

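A separate problem is that none of my pods can request the API at https://10.96.0.1:443 (although I can curl port 6443 directly), even though the output above lists the following service: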

default       service/kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP

Can someone help me isolate these two problems?

  1. Why can the pod on chad not reach https://10.96.0.1:6443?
  2. Why can neither chad nor qatar reach https://10.96.0.1:443?

2 answers:

Answer 0: (score: 0)

First, ping does not work with cluster IPs (such as 10.96.0.1), because only specific TCP or UDP ports are forwarded, never ICMP traffic.
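Since ICMP is not forwarded, a TCP connection attempt is a more meaningful reachability test than ping. The following is a minimal sketch in Python, assuming a Python 3 interpreter is available wherever you run it (the stock centos:7 image may not ship one). The helper name tcp_reachable is made up for illustration; the addresses and ports are the ones from the question.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection performs the full TCP handshake, which is the
        # traffic a ClusterIP actually forwards (unlike ICMP echo).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Targets taken from the question: the kubernetes service and the master.
    for host, port in [("10.96.0.1", 443), ("10.93.98.204", 6443)]:
        print(host, port, tcp_reachable(host, port))
```

A hang like the one seen on chad would show up here as False once the timeout expires, rather than blocking indefinitely.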

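To help your debugging a little, I can confirm that https://10.96.0.1:443 should work from any of your pods (and in fact from any of your nodes too). If you run kubectl get ep kubernetes, it should show 10.93.98.204:6443 as the target. And, as you tested, you should also be able to reach https://10.93.98.204:6443 from pods and nodes. If you cannot, it may be a firewall issue somewhere.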

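Second, there may be a problem with your overlay network setup. I noticed that the pods you started have IPs such as 10.96.2.7 and 10.96.1.11, which suggests that the overlay network (flannel) may be configured as 10.96.0.0/16. Judging by the IP address of the kubernetes service (its cluster IP), 10.96.0.1, the service network also appears to be configured as 10.96.0.0/X. This is most likely wrong: the overlay network and the service network (also called the cluster network) should not overlap. Of course this is only a guess, since your question does not contain enough information to be sure (although it is otherwise very detailed and well formatted!).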

I recommend starting from scratch, since reconfiguring these network ranges is not trivial.
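The suspected overlap can be checked with a few lines of Python using the standard ipaddress module. This is only a sketch under assumptions: the service range 10.96.0.0/12 is the kubeadm default and is assumed here because the question does not state it, and 10.96.0.0/16 is the pod range guessed above from the pod IPs. The helper names are made up for illustration.

```python
import ipaddress

# Assumption: kubeadm's default service range; the question does not state it.
SERVICE_CIDR = "10.96.0.0/12"
# Guess from the pod IPs 10.96.2.7 and 10.96.1.11 seen in the question.
SUSPECTED_POD_CIDR = "10.96.0.0/16"
# Pod range used by the default flannel manifest.
FLANNEL_DEFAULT_POD_CIDR = "10.244.0.0/16"


def networks_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR ranges share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def ip_in_network(ip: str, cidr: str) -> bool:
    """Return True if the address falls inside the CIDR range."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    print("suspected pod range overlaps services:",
          networks_overlap(SUSPECTED_POD_CIDR, SERVICE_CIDR))
    print("flannel default overlaps services:",
          networks_overlap(FLANNEL_DEFAULT_POD_CIDR, SERVICE_CIDR))
    print("pod IP 10.96.2.7 inside service range:",
          ip_in_network("10.96.2.7", SERVICE_CIDR))
```

If the first check reports an overlap, a pod address and a service address can collide, which would be consistent with traffic to 10.96.0.1 being routed into the overlay instead of to the API server.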

Answer 1: (score: 0)

Thanks to @Janus Lenart, I have made progress.

I followed his advice and reset the cluster with a pod-network-cidr of 10.244.0.0/16. I can now reach the API service using both its public address and its cluster address.

The fix was mainly to use the correct pod-network-cidr. As Janus pointed out, the default flannel YAML specifies it as 10.244.0.0/16.

kubeadm init --apiserver-advertise-address=10.93.98.204 --pod-network-cidr=10.244.0.0/16

Cluster configuration after adding a single node:

[root@togo dsargrad]# kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://10.93.98.204:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

Default cluster services in all namespaces:

[root@togo dsargrad]# kubectl get svc --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP         23m
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP   22m

I then run centos7 in an interactive shell:

kubectl run -i -t centos7interactive2 --restart=Never --image=centos:7 /bin/bash

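I then try to curl the API server on its cluster IP (10.96.0.1:443) and on its public address (10.93.98.204:6443).

These connections succeed, but I do see certificate errors.

On its public address: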

[root@centos7interactive2 /]#  curl https://10.93.98.204:6443
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

And on its cluster address:

[root@centos7interactive2 /]# curl https://10.96.0.1:443
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

Is this certificate error expected? Or did I miss a step?
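For context on the error above: curl's default CA bundle inside the container does not normally include the cluster's own certificate authority, so a verification failure of this kind is likely expected rather than a sign of a missed step. The pod mounts a service account volume at /var/run/secrets/kubernetes.io/serviceaccount (visible in the describe output), which normally contains ca.crt and token files. The following Python sketch verifies the API server against that CA instead of disabling verification. It assumes those two files exist in the mount and that Python 3 is available in the pod; the function names are made up for illustration.

```python
import ssl
import urllib.request
from pathlib import Path

# Mount path shown in the `kubectl describe pod` output; the ca.crt and
# token file names are assumed to follow the usual service account layout.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def build_request(host: str, port: int, token: str,
                  path: str = "/api/v1") -> urllib.request.Request:
    """Build an HTTPS request carrying the service account bearer token."""
    url = f"https://{host}:{port}{path}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})


def fetch(host: str, port: int, path: str = "/api/v1") -> int:
    """Call the API server, verifying its certificate against the cluster CA."""
    token = (SA_DIR / "token").read_text().strip()
    # Trust only the cluster CA instead of the system bundle.
    context = ssl.create_default_context(cafile=str(SA_DIR / "ca.crt"))
    request = build_request(host, port, token, path)
    with urllib.request.urlopen(request, context=context, timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    # Cluster address from the question; run this inside a pod.
    print(fetch("10.96.0.1", 443))
```

The curl equivalent of the verification part is the --cacert option that the error message itself mentions, pointed at the ca.crt file in the same mount.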