Question

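I have a Redis K8s Deployment that is linked to a separate Service. The manifest, heavily abbreviated, is shown below (let me know if more information is needed):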

apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cache
      environment: dev
  template:
    metadata:
      labels:
        app: cache
        environment: dev
    spec:
      containers:
        - name: cache
          image: marketplace.gcr.io/google/redis5
          imagePullPolicy: IfNotPresent
          livenessProbe:
            exec:
              command:
              - redis-cli
              - ping
            initialDelaySeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            exec:
              command:
              - redis-cli
              - ping
            initialDelaySeconds: 30
            timeoutSeconds: 5
      volumes:
        - name: data
          nfs:
            server: "nfs-server.recs-api.svc.cluster.local"
            path: "/data"

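I want to periodically redeploy Redis with a new dataset rather than update the existing cache. When I run kubectl rollout restart deployment/cache, the old Redis pods are terminated before the new Redis pods are ready to accept traffic. The new pods are marked READY and the old ones are, predictably, terminated, but redis-cli ping on the new pods returns (error) LOADING Redis is loading the dataset in memory. Redis currently takes 5 to 10 minutes to finish loading the dataset and become ready to accept connections, yet the pods have been marked READY for that whole time, and traffic is being directed to them because the old pods have already been terminated.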

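My suspicion is that, because the status code of this response is 0, the readinessProbe reports READY 1/1 and the old pods get killed, but I have not been able to find a suitable probe command that avoids this problem.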

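The output of redis-cli info contains a line of the form loading:0|1, so I tested:

readinessProbe: exec: command: ["redis-cli", "info", "|", "grep loading:", "|", "grep 0"]

hoping that for a non-0 loading value grep would return a non-zero status code and fail the readinessProbe. This does not seem to work, and it behaves the same as redis-cli ping: the old pods are terminated prematurely and the service is down until loading completes.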

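What I want

When deploying new Redis cache containers, I would like there to be a container ready to accept connections the entire time, while the new Redis cache containers are loading the dataset into memory
- Ideally in the form of a readiness probe check, but I am completely open to suggestions!
- I may also be misunderstanding the purpose of readiness probes, so please let me know if that is the case
If possible, I would also like to better understand why redis-cli ping or other readinessProbes trigger a READY state for the new pods, despite a non-zero status code on redis-cli ping
Thanks!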

Answer 1

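I looked into the bitnami/redis chart and found how they implement the liveness/readiness probes.

In their chart, they create a health-configmap containing shell scripts that use redis-cli ping to health-check the Redis server and handle the response.

This is the ConfigMap as defined: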

data:
  ping_readiness_local.sh: |-
    #!/bin/bash
{{- if .Values.usePasswordFile }}
    password_aux=`cat ${REDIS_PASSWORD_FILE}`
    export REDIS_PASSWORD=$password_aux
{{- end }}
{{- if .Values.usePassword }}
    no_auth_warning=$([[ "$(redis-cli --version)" =~ (redis-cli 5.*) ]] && echo --no-auth-warning)
{{- end }}
    response=$(
      timeout -s 3 $1 \
      redis-cli \
{{- if .Values.usePassword }}
        -a $REDIS_PASSWORD $no_auth_warning \
{{- end }}
        -h localhost \
{{- if .Values.tls.enabled }}
        -p $REDIS_TLS_PORT \
        --tls \
        --cacert {{ template "redis.tlsCACert" . }} \
        {{- if .Values.tls.authClients }}
          --cert {{ template "redis.tlsCert" . }} \
          --key {{ template "redis.tlsCertKey" . }} \
        {{- end }}
{{- else }}
        -p $REDIS_PORT \
{{- end }}
        ping
    )
    if [ "$response" != "PONG" ]; then
      echo "$response"
      exit 1
    fi

In the Deployment/StatefulSet, simply set the probe to execute this shell script:

readinessProbe:
    initialDelaySeconds: {{ .Values.redis.readinessProbe.initialDelaySeconds }}
    periodSeconds: {{ .Values.redis.readinessProbe.periodSeconds }}
    timeoutSeconds: {{ .Values.redis.readinessProbe.timeoutSeconds }}
    successThreshold: {{ .Values.redis.readinessProbe.successThreshold }}
    failureThreshold: {{ .Values.redis.readinessProbe.failureThreshold }}
    exec:
      command:
        - sh
        - -c
        - /scripts/ping_readiness_local.sh {{ .Values.redis.readinessProbe.timeoutSeconds }}
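
Stripped of the Helm templating, the core of that script is a comparison of the reply text against PONG. A minimal sketch, assuming no password or TLS and a REDIS_PORT variable that falls back to 6379 (the function names is_ready and probe are made up for illustration, they are not part of the chart):

```shell
#!/bin/sh
# Sketch only: decide readiness from the text of the reply,
# not from the exit status of redis-cli.

# is_ready REPLY: succeeds only when the reply is exactly PONG.
is_ready() {
  [ "$1" = "PONG" ]
}

# probe: query the local server and apply is_ready to its reply.
# REDIS_PORT is an assumption; it defaults to 6379 when unset.
probe() {
  reply=$(redis-cli -h localhost -p "${REDIS_PORT:-6379}" ping 2>&1)
  if ! is_ready "$reply"; then
    # Print the unexpected reply so it shows up in the probe's failure message.
    echo "$reply"
    return 1
  fi
}
```

A probe script built this way would end by calling probe, so that its exit status becomes the script's exit status. With this check, a LOADING reply fails the probe regardless of what exit status redis-cli itself returns.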

Answer 2

The following should work fine

The key part is

tcpSocket:
        port: client # named port

The whole snippet

       - name: redis
         image: ${DOCKER_PATH_AND_IMAGE}
         resources:
           limits:
             memory: "1.5Gi"
           requests:
             memory: "1.5Gi"
         ports:
         - name: client
           containerPort: 6379
         - name: gossip
           containerPort: 16379
         command: ["/conf/update-node.sh", "redis-server", "/conf/redis.conf"]
         livenessProbe:
          tcpSocket:
            port: client # named port
          initialDelaySeconds: 30
          timeoutSeconds: 5
          periodSeconds: 5
          failureThreshold: 5
          successThreshold: 1
         readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 20
          timeoutSeconds: 5
          periodSeconds: 3
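
If redis-cli ping on its own turns out not to be enough for the readinessProbe (the question reports pods going READY while still loading), one option is to check the loading field of redis-cli info instead. Note that an exec probe runs its command directly, without a shell, so a pipeline only works when it is wrapped in sh -c or placed in a script. A rough sketch, where loading_done and readiness are hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: loading_done and readiness are made-up helpers,
# not part of Redis or Kubernetes.

# loading_done: read INFO output on stdin and succeed only if it
# contains the line "loading:0". INFO lines end in CRLF, so trailing
# whitespace (including the carriage return) is tolerated.
loading_done() {
  grep -Eq '^loading:0[[:space:]]*$'
}

# readiness: ask the local server for its persistence section, which
# is where the loading field is reported, and check it. An empty
# reply (for example, connection refused) also counts as not ready.
readiness() {
  redis-cli -h localhost info persistence | loading_done
}
```

The script would end by calling readiness, and the probe would then run it through a shell, for example with command: ["sh", "-c", "/path/to/script.sh"], where the path is wherever the script happens to be mounted.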

Redis readiness probe with a large dataset
