Readiness probe prevents access to an internal Kubernetes service while the Pod is not ready

Time: 2019-08-26 23:11:16

Tags: kubernetes hazelcast readinessprobe

The readiness probe keeps the application in the not-ready state. While in that state, the application cannot connect to any Kubernetes service.

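I am using Ubuntu 18 for both the master and the worker nodes of my Kubernetes cluster. (The problem still occurs when the cluster consists of the master only, so I don't think this is a master/worker node issue.)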

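I set up the Kubernetes cluster with a Spring application that uses Hazelcast to manage its cache. With the readiness probe in place, the application cannot reach the Kubernetes service I created to connect the application instances through Hazelcast using the hazelcast-kubernetes plugin.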

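When I remove the readiness probe, the application connects to the service as soon as it can, the Hazelcast cluster is created successfully, and everything works correctly.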

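The readiness probe calls a REST API whose only response is a 200 status code. However, partway through startup the application starts the Hazelcast cluster, so it tries to connect to the Kubernetes Hazelcast service that links the application's cache with the other Pods. At that point the readiness probe has not yet passed, so the Pod is still in the not-ready state. The application therefore cannot connect to the Kubernetes service, and it fails or gets stuck, depending on the configuration I added.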

service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: my-app-cluster-hazelcast
spec:
  selector:
    app: my-app
  ports:
  - name: hazelcast
    port: 5701

deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
  labels:
    app: my-app-deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 2
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 180
      containers:
      - name: my-app
        image: my-repo:5000/my-app-container
        imagePullPolicy: Always
        ports:
        - containerPort: 5701
        - containerPort: 9080
        readinessProbe:
          httpGet:
            path: /app/api/excluded/sample
            port: 9080
          initialDelaySeconds: 120
          periodSeconds: 15
        securityContext:
          capabilities:
            add:
              - SYS_ADMIN
        env:
          - name: container
            value: docker

hazelcast.xml:

<?xml version="1.0" encoding="UTF-8"?>

<hazelcast
        xsi:schemaLocation="http://www.hazelcast.com/schema/config http://www.hazelcast.com/schema/config/hazelcast-config-3.11.xsd"
        xmlns="http://www.hazelcast.com/schema/config"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <properties>
        <property name="hazelcast.jmx">false</property>
        <property name="hazelcast.logging.type">slf4j</property>
    </properties>

    <network>
        <port auto-increment="false">5701</port>
        <outbound-ports>
            <ports>49000,49001,49002,49003</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false"/>
            <kubernetes enabled="true">
                <namespace>default</namespace>
                <service-name>my-app-cluster-hazelcast</service-name>
            </kubernetes>
        </join>
    </network>
</hazelcast>

hazelcast-client.xml:

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast-client
        xsi:schemaLocation="http://www.hazelcast.com/schema/client-config http://www.hazelcast.com/schema/client-config/hazelcast-client-config-3.11.xsd"
        xmlns="http://www.hazelcast.com/schema/client-config"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <properties>
        <property name="hazelcast.logging.type">slf4j</property>
    </properties>

    <connection-strategy async-start="false" reconnect-mode="ON">
        <connection-retry enabled="true">
            <initial-backoff-millis>1000</initial-backoff-millis>
            <max-backoff-millis>60000</max-backoff-millis>
        </connection-retry>
    </connection-strategy>

    <network>
        <kubernetes enabled="true">
            <namespace>default</namespace>
            <service-name>my-app-cluster-hazelcast</service-name>
        </kubernetes>
    </network>
</hazelcast-client>

Expected result:

The service is able to connect to the Pods, and its description lists their endpoints.

$ kubectl describe service my-app-cluster-hazelcast

Name:              my-app-cluster-hazelcast
Namespace:         default
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-app-cluster-hazelcast","namespace":"default"},"spec":{"ports...
Selector:          app=my-app
Type:              ClusterIP
IP:                10.244.28.132
Port:              hazelcast  5701/TCP
TargetPort:        5701/TCP
Endpoints:         10.244.4.10:5701,10.244.4.9:5701
Session Affinity:  None
Events:            <none>

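The application runs correctly and shows two members in its Hazelcast cluster, and the deployment is shown as ready, with the application fully accessible:

Logs: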

2019-08-26 23:07:36,614 TRACE [hz._hzInstance_1_dev.InvocationMonitorThread] (com.hazelcast.spi.impl.operationservice.impl.InvocationMonitor): [10.244.4.10]:5701 [dev] [3.11] Broadcasting operation control packets to: 2 members

$ kubectl get deployment

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
my-app-deployment     2/2     2            2           2m27s

Actual result:

The service has no endpoints.

$ kubectl describe service my-app-cluster-hazelcast

Name:              my-app-cluster-hazelcast
Namespace:         default
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-app-cluster-hazelcast","namespace":"default"},"spec":{"ports...
Selector:          app=my-app
Type:              ClusterIP
IP:                10.244.28.132
Port:              hazelcast  5701/TCP
TargetPort:        5701/TCP
Endpoints:
Session Affinity:  None
Events:            <none>

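The application gets stuck in the connection strategy enabled in hazelcast-client.xml, producing the logs below. Its own cluster stays isolated without any communication, and the deployment remains not ready forever:

Logs: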

22:54:11.236 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 57686 ms later, attempt 52 , cap retrytimeout millis 60000
22:55:02.036 [hz._hzInstance_1_dev.cached.thread-4] DEBUG com.hazelcast.internal.cluster.impl.MembershipManager - [10.244.4.8]:5701 [dev] [3.11] Sending member list to the non-master nodes:

Members {size:1, ver:1} [
        Member [10.244.4.8]:5701 - 6a4c7184-8003-4d24-8023-6087d68e9709 this
]

22:55:08.968 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 51173 ms later, attempt 53 , cap retrytimeout millis 60000
22:56:00.184 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 55583 ms later, attempt 54 , cap retrytimeout millis 60000

$ kubectl get deployment

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
my-app-deployment     0/2     2            0           45m

2 answers:

Answer 0 (score: 1)

Just to clarify:

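On the readiness probe the OP mentions, the Kubernetes documentation states:

The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.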
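
Given that behavior, one possible way around the chicken-and-egg situation in the question is to let the discovery Service list Pod addresses even before the Pods pass their readiness probe. The sketch below reuses the question's service.yaml and adds the `publishNotReadyAddresses` field of the Service spec. This is an assumption about what may help, not something confirmed by the OP: it has not been verified against this plugin version, and it requires a Kubernetes version that supports the field.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-cluster-hazelcast
spec:
  # Assumption: publish Pod addresses in the Endpoints even while the
  # readiness probe has not succeeded yet, so that Hazelcast members
  # can discover each other during application startup.
  publishNotReadyAddresses: true
  selector:
    app: my-app
  ports:
  - name: hazelcast
    port: 5701
```

If this approach is used, the Service should probably be reserved for Hazelcast discovery only, since anything else connecting through it could be routed to Pods that are not ready yet.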

Answer 1 (score: 0)

In the service YAML you have

spec:
  selector:
    app: my-app

But in the deployment YAML the label value is different

metadata:
  name: my-app-deployment
  labels:
    app: my-app-deployment

Is there any reason for that?