What causes a Kubernetes Jenkins slave pod to start and then be suspended?

Time: 2020-08-16 08:31:26

Tags: jenkins kubernetes

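I am using Jenkins on Kubernetes to build projects, but sometimes when Jenkins starts a pod, the agent shows as launching and is then suspended. When I check the agent's log output, it shows a 404: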

HTTP ERROR 404 Not Found
URI:    /computer/default-j07v7/log
STATUS: 404
MESSAGE:    Not Found
SERVLET:    Stapler
Powered by Jetty:// 9.4.27.v20200227

The error looks like this:

[Image: screenshot of the error]

The pod is suspended and then relaunched, again and again. The pod creation events look normal:

Normal  Scheduled               default-scheduler   Successfully assigned infrastructure/default-v7m44 to k8sslave3
Normal  Pulled  1   2020-08-16T08:29:36Z    2020-08-16T08:29:36Z    kubelet Container image "jenkins/jnlp-slave:3.27-1" already present on machine
Normal  Created 1   2020-08-16T08:29:36Z    2020-08-16T08:29:36Z    kubelet Created container jnlp
Normal  Started 1   2020-08-16T08:29:36Z    2020-08-16T08:29:36Z    kubelet Started container jnlp

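How can I fix this? After several days of trying, I found that if I adjust any parameter of the pod template, the agent immediately becomes suspended. If I leave the template at its defaults, the agent starts normally. This is a weird problem and it confuses me. Here is my Jenkins master deployment YAML: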

kind: Deployment
apiVersion: apps/v1
metadata:
  name: jenkins
  namespace: infrastructure
  selfLink: /apis/apps/v1/namespaces/infrastructure/deployments/jenkins
  uid: 3df24fd6-ffaf-4f17-8b02-a2904cabbf95
  resourceVersion: '1707498'
  generation: 38
  creationTimestamp: '2020-07-18T14:48:47Z'
  labels:
    app.kubernetes.io/component: jenkins-master
    app.kubernetes.io/instance: jenkins
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: jenkins
    helm.sh/chart: jenkins-2.4.1
  annotations:
    deployment.kubernetes.io/revision: '10'
    meta.helm.sh/release-name: jenkins
    meta.helm.sh/release-namespace: infrastructure
  managedFields:
    - manager: Go-http-client
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-02T10:08:04Z'
      fieldsType: FieldsV1
      
    - manager: dashboard
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-17T14:27:59Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:template':
            'f:spec':
              'f:containers':
                'k:{"name":"jenkins"}':
                  'f:volumeMounts':
                    'k:{"mountPath":"/usr/bin/docker"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
                    'k:{"mountPath":"/var/run/docker.sock"}':
                      .: {}
                      'f:mountPath': {}
                      'f:name': {}
              'f:securityContext':
                'f:runAsUser': {}
              'f:volumes':
                'k:{"name":"docker"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
                'k:{"name":"dockersock"}':
                  .: {}
                  'f:hostPath':
                    .: {}
                    'f:path': {}
                    'f:type': {}
                  'f:name': {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: apps/v1
      time: '2020-08-18T16:14:00Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:deployment.kubernetes.io/revision': {}
        'f:status':
          'f:availableReplicas': {}
          'f:conditions':
            .: {}
            'k:{"type":"Available"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
            'k:{"type":"Progressing"}':
              .: {}
              'f:lastTransitionTime': {}
              'f:lastUpdateTime': {}
              'f:message': {}
              'f:reason': {}
              'f:status': {}
              'f:type': {}
          'f:observedGeneration': {}
          'f:readyReplicas': {}
          'f:replicas': {}
          'f:updatedReplicas': {}
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: jenkins-master
      app.kubernetes.io/instance: jenkins
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: jenkins-master
        app.kubernetes.io/instance: jenkins
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: jenkins
        helm.sh/chart: jenkins-2.4.1
      annotations:
        checksum/config: 60990c68bb90ec59c79d56498da29d250d8da13cfbb9c35cad53f0cd789f318b
    spec:
      volumes:
        - name: plugins
          emptyDir: {}
        - name: tmp
          emptyDir: {}
        - name: jenkins-config
          configMap:
            name: jenkins
            defaultMode: 420
        - name: secrets-dir
          emptyDir: {}
        - name: plugin-dir
          emptyDir: {}
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins
        - name: sc-config-volume
          emptyDir: {}
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
            type: ''
        - name: docker
          hostPath:
            path: /usr/bin/docker
            type: ''
      initContainers:
        - name: copy-default-config
          image: 'jenkins/jenkins:lts'
          command:
            - sh
            - /var/jenkins_config/apply_config.sh
          env:
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugins
              mountPath: /usr/share/jenkins/ref/plugins
            - name: plugin-dir
              mountPath: /var/jenkins_plugins
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
      containers:
        - name: jenkins
          image: 'jenkins/jenkins:lts'
          args:
            - '--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)'
            - '--argumentsRealm.roles.$(ADMIN_USER)=admin'
            - '--httpPort=8080'
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: slavelistener
              containerPort: 50000
              protocol: TCP
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: JAVA_OPTS
              value: |

                -Dcasc.reload.token=$(POD_NAME) 
            - name: JENKINS_OPTS
            - name: JENKINS_SLAVE_AGENT_PORT
              value: '50000'
            - name: ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-password
            - name: ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: jenkins
                  key: jenkins-admin-user
            - name: CASC_JENKINS_CONFIG
              value: /var/jenkins_home/casc_configs
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: 50m
              memory: 256Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: jenkins-config
              readOnly: true
              mountPath: /var/jenkins_config
            - name: secrets-dir
              mountPath: /usr/share/jenkins/ref/secrets/
            - name: plugin-dir
              mountPath: /usr/share/jenkins/ref/plugins/
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: dockersock
              mountPath: /var/run/docker.sock
            - name: docker
              mountPath: /usr/bin/docker
          livenessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 90
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /login
              port: http
              scheme: HTTP
            initialDelaySeconds: 60
            timeoutSeconds: 5
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
        - name: jenkins-sc-config
          image: 'kiwigrid/k8s-sidecar:0.1.144'
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: LABEL
              value: jenkins-jenkins-config
            - name: FOLDER
              value: /var/jenkins_home/casc_configs
            - name: NAMESPACE
              value: infrastructure
            - name: REQ_URL
              value: >-
                http://localhost:8080/reload-configuration-as-code/?casc-reload-token=$(POD_NAME)
            - name: REQ_METHOD
              value: POST
            - name: REQ_RETRY_CONNECT
              value: '10'
          resources: {}
          volumeMounts:
            - name: sc-config-volume
              mountPath: /var/jenkins_home/casc_configs
            - name: jenkins-home
              mountPath: /var/jenkins_home
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: jenkins
      serviceAccount: jenkins
      securityContext:
        runAsUser: 0
        fsGroup: 976
      schedulerName: default-scheduler
  strategy:
    type: Recreate
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
status:
  observedGeneration: 38
  replicas: 1
  updatedReplicas: 1
  readyReplicas: 1
  availableReplicas: 1
  conditions:
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2020-08-17T14:45:20Z'
      lastTransitionTime: '2020-08-17T14:45:20Z'
      reason: NewReplicaSetAvailable
      message: ReplicaSet "jenkins-7454db64f6" has successfully progressed.
    - type: Available
      status: 'True'
      lastUpdateTime: '2020-08-18T16:14:00Z'
      lastTransitionTime: '2020-08-18T16:14:00Z'
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.

Here is part of the log output from the master container:

2020-08-21 16:44:40.381+0000 [id=955]   WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@2fb3e877 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
    at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
    at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
    at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
    at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
    at okhttp3.RealCall.enqueue(RealCall.java:127)
    at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
    at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:45.239+0000 [id=33]    INFO    hudson.slaves.NodeProvisioner#lambda$update$6: default-3393d provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:44:45.241+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-3393d
2020-08-21 16:44:45.302+0000 [id=2826]  INFO    o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:45.350+0000 [id=2765]  INFO    o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:44:55.363+0000 [id=2765]  WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-3393d, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-3393d (jnlp)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
    at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:44:55.363+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-3393d
Terminated Kubernetes instance for agent infrastructure/default-3393d
Disconnected computer default-3393d
2020-08-21 16:44:55.383+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesSlave#deleteSlavePod: Terminated Kubernetes instance for agent infrastructure/default-3393d
2020-08-21 16:44:55.383+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesSlave#_terminate: Disconnected computer default-3393d
2020-08-21 16:45:05.198+0000 [id=42]    INFO    o.c.j.p.k.KubernetesCloud#provision: Excess workload after pending Kubernetes agents: 1
2020-08-21 16:45:05.198+0000 [id=42]    INFO    o.c.j.p.k.KubernetesCloud#provision: Template for label null: default
2020-08-21 16:45:12.383+0000 [id=955]   WARNING i.f.k.c.d.i.WatchConnectionManager$1#onFailure: Exec Failure
java.util.concurrent.RejectedExecutionException: Task okhttp3.RealCall$AsyncCall@6c6c7a45 rejected from java.util.concurrent.ThreadPoolExecutor@9ce8b47[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 18]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
    at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:183)
Caused: java.io.InterruptedIOException: executor rejected
    at okhttp3.RealCall$AsyncCall.executeOn(RealCall.java:186)
    at okhttp3.Dispatcher.promoteAndExecute(Dispatcher.java:186)
    at okhttp3.Dispatcher.enqueue(Dispatcher.java:137)
    at okhttp3.RealCall.enqueue(RealCall.java:127)
    at okhttp3.internal.ws.RealWebSocket.connect(RealWebSocket.java:193)
    at okhttp3.OkHttpClient.newWebSocket(OkHttpClient.java:435)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.runWatch(WatchConnectionManager.java:158)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$1200(WatchConnectionManager.java:50)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2$1.execute(WatchConnectionManager.java:321)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$NamedRunnable.run(WatchConnectionManager.java:410)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:15.236+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesLauncher#launch: Created Pod: infrastructure/default-03q6x
2020-08-21 16:45:15.252+0000 [id=36]    INFO    hudson.slaves.NodeProvisioner#lambda$update$6: default-03q6x provisioning successfully completed. We have now 3 computer(s)
2020-08-21 16:45:15.314+0000 [id=2824]  INFO    o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:15.381+0000 [id=2765]  INFO    o.internal.platform.Platform#log: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
2020-08-21 16:45:25.390+0000 [id=2765]  WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: default-03q6x, template=PodTemplate{inheritFrom='', name='default', namespace='', hostNetwork=false, activeDeadlineSeconds=10, label='jenkins-jenkins-slave ', serviceAccount='default', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], containers=[ContainerTemplate{name='jnlp', image='jenkins/jnlp-slave:3.27-1', workingDir='/home/jenkins', command='/bin/sh -c', args='${computer.jnlpmac} ${computer.name}', resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceLimitCpu='512m', resourceLimitMemory='512Mi', envVars=[ContainerEnvVar [getValue()=http://jenkins.infrastructure.svc.cluster.local:8080, getKey()=JENKINS_URL]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@5187faf3}]}
java.lang.IllegalStateException: Pod has terminated containers: infrastructure/default-03q6x (jnlp)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:133)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.periodicAwait(AllContainersRunningPodWatcher.java:154)
    at org.csanchez.jenkins.plugins.kubernetes.AllContainersRunningPodWatcher.await(AllContainersRunningPodWatcher.java:94)
    at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:140)
    at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:296)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-08-21 16:45:25.391+0000 [id=2765]  INFO    o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent default-03q6x
Terminated Kubernetes instance for agent infrastructure/default-03q6x

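Here is a snapshot of my Kubernetes cloud configuration:

[Image: Kubernetes cloud configuration screenshot]

And here is the pod template configuration:

[Image: pod template configuration screenshot]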

1 Answer:

Answer 0 (score: 2)

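I would suggest a couple of small changes:

  1. Leave the Jenkins tunnel field blank. Jenkins will pick it up automatically.

  2. If this Jenkins instance is deployed inside the Kubernetes cluster, use the internal address, such as http://jenkins.infrastructure.svc, as the Jenkins URL. I am assuming your Jenkins service is named jenkins and that it is {{1 }}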
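
The two suggestions above could be expressed as a Configuration as Code fragment, since the deployment in the question already loads configuration from /var/jenkins_home/casc_configs. This is only a sketch under assumptions: the cloud name "kubernetes" is hypothetical, the port 8080 is taken from the container port and the JENKINS_URL value shown in the question rather than from a confirmed Service definition, and the key names should be checked against the Kubernetes plugin version that is actually installed.

```yaml
# Hypothetical JCasC fragment for the Kubernetes cloud (a sketch, not the
# configuration from the question).
jenkins:
  clouds:
    - kubernetes:
        name: "kubernetes"            # assumed cloud name
        namespace: "infrastructure"   # namespace used in the question
        # In-cluster address of the Jenkins master: service "jenkins" in
        # namespace "infrastructure"; the port is an assumption.
        jenkinsUrl: "http://jenkins.infrastructure.svc:8080"
        # jenkinsTunnel is intentionally omitted so the field stays blank,
        # as suggested in point 1.
```

The same two fields can also be set by hand under the cloud settings in the Jenkins UI; the fragment only shows which values change.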
