卡夫卡无法与Zookeeper连接,错误为“状态为CONNECTING时等待连接超时

时间:2018-09-03 08:35:01

标签: docker kubernetes apache-kafka apache-zookeeper google-kubernetes-engine

我试图在kubernetes容器中运行我的kafka和zookeeper。

这是我的zookeeper-service.yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: zookeeper-svc
  name: zookeeper-svc
spec:
  ports:
  - name: "2181"
    port: 2181
    targetPort: 2181
  selector:
    io.kompose.service: zookeeper
status:
  loadBalancer: {}

下面是zookeeper-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: zookeeper
  name: zookeeper
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: zookeeper
    spec:
      containers:
      - image: wurstmeister/zookeeper
        name: zookeeper
        ports:
        - containerPort: 2181
        resources: {}
      restartPolicy: Always
status: {}

kafka-deployment.yaml如下:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert -f docker-compose.yml
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: kafka
  name: kafka
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: kafka
    spec:
      containers:
      - env:
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: kafka
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zookeeper:2181
        - name: KAFKA_PORT
          value: "9092"
        - name: KAFKA_ZOOKEEPER_CONNECT_TIMEOUT_MS
          value: "60000"
        image: wurstmeister/kafka
        name: kafka
        ports:
        - containerPort: 9092
        resources: {}
      restartPolicy: Always
status: {}

我首先启动Zookeeper服务和部署。 Zookeeper启动后,kubectl get pods将其显示为运行状态后,便开始kafka部署。与往常一样,由于restartPolicy,Kafka部署开始失败并一次又一次重新启动。当我检查来自kafka docker的日志时,我发现它无法连接到zookeeper服务并且连接超时。这是来自kafka容器的日志。

[2018-09-03 07:06:06,670] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
atkafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$ waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:230)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:95)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1580)
at kafka.server.KafkaServer.kafka$server$KafkaServer$$createZkClient$1(KafkaServer.scala:348)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:372)
at kafka.server.KafkaServer.startup(KafkaServer.scala:202)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
[2018-09-03 07:06:06,671] INFO shutting down (kafka.server.KafkaServer)
[2018-09-03 07:06:06,673] WARN  (kafka.utils.CoreUtils$)
java.lang.NullPointerException
atkafka.server.KafkaServer$$anonfun$shutdown$5.apply$mcV$sp(KafkaServer.scala:579)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:86)
at kafka.server.KafkaServer.shutdown(KafkaServer.scala:579)
at kafka.server.KafkaServer.startup(KafkaServer.scala:329)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
[2018-09-03 07:06:06,676] INFO shut down completed 
(kafka.server.KafkaServer)
[2018-09-03 07:06:06,677] ERROR Exiting Kafka. 
(kafka.server.KafkaServerStartable)
[2018-09-03 07:06:06,678] INFO shutting down 
(kafka.server.KafkaServer)

这可能是什么原因?和解决方案?

编辑:Zookeeper容器中的日志:

2018-09-03 10:32:39,562 [myid:] - INFO  
[main:ZooKeeperServerMain@96] - Starting server
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:zookeeper.version=3.4.9-1757313, built on 
08/23/2016 06:50 GMT
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:host.name=zookeeper-7594d99b-sgm6p
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.version=1.7.0_65
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.vendor=Oracle Corporation
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.class.path=/opt/zookeeper- 
3.4.9/bin/../build/classes:/opt/zookeeper- 
3.4.9/bin/../build/lib/*.jar:/opt/zookeeper-3.4.9/bin/../lib/slf4j- 
log4j12-1.6.1.jar:/opt/zookeeper-3.4.9/bin/../lib/slf4j-api-1.6. 
1.ja r:/opt/zookeeper-3.4.9/bin/../lib/netty- 
3.10.5.Final.jar:/opt/zookeeper-3.4.9/bin/../lib/log4j- 
1.2.16.jar:/opt/zookeeper-3.4.9/bin/../lib/jline- 
0.9.94.jar:/opt/zookeeper-3.4.9/bin/../zookeeper- 
3.4.9.jar:/opt/zookeeper- 
3.4.9/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.9/bin/../conf:

2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.io.tmpdir=/tmp
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:java.compiler=<NA>
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:os.name=Linux
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:os.arch=amd64 
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:os.version=4.15.0-20-generic
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] -     
Server environment:user.name=root
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:user.home=/root
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - 
Server environment:user.dir=/opt/zookeeper-3.4.9
2018-09-03 10:32:39,570 [myid:] - INFO  [main:ZooKeeperServer@815] 
- 
tickTime set to 2000
2018-09-03 10:32:39,571 [myid:] - INFO  [main:ZooKeeperServer@824] 
- 
minSessionTimeout set to -1
2018-09-03 10:32:39,571 [myid:] - INFO  [main:ZooKeeperServer@833] 
- 
maxSessionTimeout set to -1
2018-09-03 10:32:39,578 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] 
- binding to port 0.0.0.0/0.0.0.0:2181 

编辑: 从kafka容器启动日志:

Excluding KAFKA_HOME from broker config
[Configuring] 'advertised.host.name' in 
'/opt/kafka/config/server.properties'
[Configuring] 'port' in '/opt/kafka/config/server.properties'
[Configuring] 'broker.id' in '/opt/kafka/config/server.properties'
Excluding KAFKA_VERSION from broker config
[Configuring] 'zookeeper.connect' in 
'/opt/kafka/config/server.properties'
[Configuring] 'log.dirs' in '/opt/kafka/config/server.properties'
[Configuring] 'zookeeper.connect.timeout.ms' in 
'/opt/kafka/config/server.properties'
 [2018-09-05 10:47:22,036] INFO Registered 
kafka:type=kafka.Log4jController MBean 
(kafka.utils.Log4jControllerRegistration$) 
[2018-09-05 10:47:23,145] INFO starting (kafka.server.KafkaServer)
[2018-09-05 10:47:23,148] INFO Connecting to zookeeper on 
zookeeper:2181 (kafka.server.KafkaServer)
[2018-09-05 10:47:23,288] INFO [ZooKeeperClient] Initializing a new 
session to zookeeper:2181. (kafka.zookeeper.ZooKeeperClient)
[2018-09-05 10:47:23,300] INFO Client 
environment:zookeeper.version=3.4.13- 
2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 
GMT (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,300] INFO Client environment:host.name=kafka 
-757dc6c47b-zpzfz (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,300] INFO Client 
environment:java.version=1.8.0_171 (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client 
environment:java.vendor=Oracle Corporation 
(org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client 
environment:java.home=/usr/lib/jvm/java-1.8-openjdk/jre 
(org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client 
environment:java.class.path=/opt/kafka/bin/../libs/activation- 
1.1.1.jar:/opt/kafka/bin/../libs/aopalliance-repackaged-2.5.0- 
b42.jar:/opt/kafka/bin/../libs/argparse4j- 
0.7.0.jar:/opt/kafka/bin/../libs/audience-annotations- 
0.5.0.jar:/opt/kafka/bin/../libs/commons-lang3- 
3.5.jar:/opt/kafka/bin/../libs/connect-api- 
2.0.0.jar:/opt/kafka/bin/../libs/connect-basic-auth-extension- 
2.0.0.jar:/opt/kafka/bin/../libs/connect-file- 
2.0.0.jar:/opt/kafka/bin/../libs/connect-json- 
2.0.0.jar:/opt/kafka/bin/../libs/connect-runtime- 
2.0.0.jar:/opt/kafka/bin/../libs/connect-transforms- 
2.0.0.jar:/opt/kafka/bin/../libs/guava- 
20.0.jar:/opt/kafka/bin/../libs/hk2-api-2.5.0- 
b42.jar:/opt/kafka/bin/../libs/hk2-locator-2.5.0- 
b42.jar:/opt/kafka/bin/../libs/hk2-utils-2.5.0- 
b42.jar:/opt/kafka/bin/../libs/jackson-annotations- 
2.9.6.jar:/opt/kafka/bin/../libs/jackson-core- 
2.9.6.jar:/opt/kafka/bin/../libs/jackson-databind- 
2.9.6.jar:/opt/kafka/bin/../libs/jackson-jaxrs-json-provider- 
2.9.6.jar:/opt/kafka/bin/../libs/jackson-module-jaxb-annotations- 
CR2.jar:/opt/kafka/bin/../libs/javax.annotation-api- 
1.2.jar:/opt/kafka/bin/../libs/javax.inject- 
1.jar:/opt/kafka/bin/../libs/javax.inject-2.5.0- 
b42.jar:/opt/kafka/bin/../libs/javax.servlet-api- 
3.1.0.jar:/opt/kafka/bin/../libs/javax.ws.rs-api- 
2.1.jar:/opt/kafka/bin/../libs/jaxb-api- 
2.3.0.jar:/opt/kafka/bin/../libs/jersey-client- 
2.27.jar:/opt/kafka/bin/../libs/jersey-common- 
2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet 
-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-core- 
2.27.jar:/opt/kafka/bin/../libs/jersey-hk2- 
2.27.jar:/opt/kafka/bin/../libs/jersey-media-jaxb- 
2.27.jar:/opt/kafka/bin/../libs/jersey-server 
-2.27.jar:/opt/kafka/bin/../libs/jetty-client 
-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-continuation- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-http- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-io- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-security- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-server- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-servlet- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-servlets- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-util- 
9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jopt-simple- 
5.0.4.jar:/opt/kafka/bin/../libs/kafka-clients- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-log4j-appender- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-examples- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-scala_2.11- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-test-utils- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka-tools- 
2.0.0.jar:/opt/kafka/bin/../libs/kafka_2.11-2.0.0- 
sources.jar:/opt/kafka/bin/../libs/kafka_2.11-2 
 .0.0.jar:/opt/kafka/bin/../libs/log4j 
1.2.17.jar:/opt/kafka/bin/../libs/lz4-java- 
1.4.1.jar:/opt/kafka/bin/../libs/maven-artifact- 
3.5.3.jar:/opt/kafka/bin/../libs/metrics-core- 
2.2.0.jar:/opt/kafka/bin/../libs/osgi-resource-locator- 
1.0.1.jar:/opt/kafka/bin/../libs/plexus-utils- 
3.1.0.jar:/opt/kafka/bin/../libs/reflections- 
0.9.11.jar:/opt/kafka/bin/../libs/rocksdbjni- 
5.7.3.jar:/opt/kafka/bin/../libs/scala-library- 
2.11.12.jar:/opt/kafka/bin/../libs/scala-logging_2.11- 
3.9.0.jar:/opt/kafka/bin/../libs/scala-reflect- 
2.11.12.jar:/opt/kafka/bin/../libs/slf4j-api- 
1.7.25.jar:/opt/kafka/bin/../libs/slf4j-log4j12- 
1.7.25.jar:/opt/kafka/bin/../libs/snappy-java- 
1.1.7.1.jar:/opt/kafka/bin/../libs/validation-api- 
1.1.0.Final.jar:/opt/kafka/bin/../libs/zkclient- 
0.10.jar:/opt/kafka/bin/../libs/zookeeper-3.4.13.jar 
(org.apache.zookeeper.ZooKeeper) 
kubectl get svc -o wide

输出如下:

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE       SELECTOR
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    50m       <none>
zookeeper    ClusterIP   10.98.180.138   <none>        2181/TCP   48m       io.kompose.service=zookeeper

kubectl get pods -o wide的输出:

NAME                       READY     STATUS             RESTARTS   AGE       IP           NODE
kafka-757dc6c47b-zpzfz     0/1       CrashLoopBackOff   15         1h        10.32.0.17   administrator-thinkpad-l480
zookeeper-7594d99b-784n9   1/1       Running            0          1h        10.32.0.19   administrator-thinkpad-l480

编辑: kubectl describe pod kafka-757dc6c47b-zpzfz的输出:

Name:           kafka-757dc6c47b-zpzfz
Namespace:      default
Node:           administrator-thinkpad-l480/10.11.17.86
Start Time:     Wed, 05 Sep 2018 16:17:06 +0530
Labels:         io.kompose.service=kafka
            pod-template-hash=3138727036
Annotations:    <none>
Status:         Running
IP:             10.32.0.17
Controlled By:  ReplicaSet/kafka-757dc6c47b
Containers:
  kafka:
   Container ID:docker://2bdc06d876ae23437c61f4e95539a67903cdb61e88fd9c68377b47c7705293a3
    Image:          wurstmeister/kafka
    Image ID:       docker-pullable://wurstmeister/kafka@sha256:2e3ff64e70ea983530f590282f36991c0a1b105350510f53cc3d1a0279b83c28
    Port:           9092/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 05 Sep 2018 17:29:06 +0530
      Finished:     Wed, 05 Sep 2018 17:29:14 +0530
    Ready:          False
    Restart Count:  18
    Environment:
      KAFKA_ADVERTISED_HOST_NAME:          kafka
      KAFKA_ZOOKEEPER_CONNECT:             zookeeper:2181
      KAFKA_PORT:                          9092
      KAFKA_ZOOKEEPER_CONNECT_TIMEOUT_MS:  160000
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nhb9z (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-nhb9z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nhb9z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
             node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                From                                  Message
  ----     ------   ----               ----                                  -------
  Warning  BackOff  3m (x293 over 1h)  kubelet, administrator-thinkpad-l480  Back-off restarting failed container

2 个答案:

答案 0 :(得分:1)

  

这可能是什么原因?和解决方案?

原因隐藏在以下日志行的后面:

INFO Connecting to zookeeper on zookeeper:2181 (kafka.server.KafkaServer)

Kafka正在搜索zookeeper,而它应该搜索您的服务名称zookeeper-svc

解决方案非常简单:将您的Zookeeper服务从zookeeper-svc中的zookeeper重命名为zookeeper-service.yaml,如下所示:

apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: zookeeper
  name: zookeeper
spec:
  ports:
  - name: "2181"
    port: 2181
    targetPort: 2181
  selector:
    io.kompose.service: zookeeper
status:
  loadBalancer: {}

或更干净一点,就是保持服务名称不变,然后将Kafka重新配置为搜索zookeeper-svc而不是zookeeper

注意:由于您的最小示例足以重现该问题(谢谢,很高兴看到最小示例!),尽管pod已启动并正在运行(未处于错误状态),仍然显示:java.io.IOException: Can't resolve address: kafka:9092与另一个问题(服务不涉及Kafka部署)有关,并且不在此问题的范围内,只是为了告诉您。

答案 1 :(得分:0)

我正在使用带有此警告的microk8s

  

警告:IPtables FORWARD策略为DROP

考虑通过以下方式启用流量转发:

sudo iptables -P FORWARD ACCEPT

修复它对我有用