I'm working on a Docker Swarm infrastructure where each node runs one (unique) Kafka broker. Everything comes up without any error/warning logs, but when I try to reach the second broker (worker2) I get an error. Look at the output:
```
kafkacat -L -b worker2:9094
% ERROR: Failed to acquire metadata: Local: Timed out
```
The expected output, as seen from worker1, is:
```
kafkacat -L -b worker1:9094
Metadata for all topics (from broker 1001: worker1:9094/1001):
 2 brokers:
  broker 1001 at worker1:9094
  broker 1002 at worker2:9094
 1 topics:
  topic "logging_application_access" with 1 partitions:
    partition 0, leader 1001, replicas: 1001, isrs: 1001
```
The output of listing my nodes is:
```
ID                          HOSTNAME   STATUS  AVAILABILITY  MANAGER STATUS  ENGINE VERSION
c6msb82zav2p1lepd13phmeei * manager1   Ready   Active        Leader          18.06.1-ce
n3sfqz4rgewtulz43q5qobmr1   worker1    Ready   Active                        18.06.1-ce
xgkibsp0kx29bhmjkwysapa6h   worker2    Ready   Active                        18.06.1-ce
```
For a better understanding, here is my docker-compose.yml file:
```yaml
version: '3.6'

x-proxy: &proxy
  http_proxy: ${http_proxy}
  https_proxy: ${https_proxy}
  no_proxy: ${no_proxy}

services:
  zookeeper:
    image: zookeeper:3.4.13
    hostname: zookeeper
    volumes:
      - type: volume
        source: zookeeper-data
        target: /data
    environment:
      <<: *proxy
    ports:
      - target: 2181
        published: 2181
        protocol: tcp
    networks:
      - workshop
    restart: always
    deploy:
      replicas: 2
      placement:
        constraints:
          - node.role == worker

  kafka:
    image: wurstmeister/kafka:2.12-2.1.0
    hostname: kafka
    volumes:
      - type: volume
        source: kafka-data
        target: /kafka
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
        read_only: true
    env_file: ./services/kafka/.env
    environment:
      <<: *proxy
    ports:
      - target: 9094
        published: 9094
        protocol: tcp
        mode: host
    networks:
      - workshop
    restart: always
    depends_on:
      - zookeeper
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == worker

volumes:
  zookeeper-data:
    driver: local
  kafka-data:
    driver: local

networks:
  workshop:
    name: workshop
    external: true
```
And finally, the environment file:
```
HOSTNAME_COMMAND=docker info | grep ^Name: | cut -d ' ' -f 2
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS=INSIDE://:9092,OUTSIDE://_{HOSTNAME_COMMAND}:9094
KAFKA_LISTENERS=INSIDE://:9092,OUTSIDE://:9094
KAFKA_INTER_BROKER_LISTENER_NAME=INSIDE
KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS=36000
KAFKA_ZOOKEEPER_SESSION_TIMEOUT_MS=36000
KAFKA_LOG_RETENTION_HOURS=24
KAFKA_AUTO_CREATE_TOPICS_ENABLE=false
KAFKA_CREATE_TOPICS=logging_application_access:1:1
KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.rmi.port=1099
JMX_PORT=1099
```
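For context, the `HOSTNAME_COMMAND` line is the pipeline the wurstmeister/kafka image runs to substitute `_{HOSTNAME_COMMAND}` in the advertised listener. A minimal sketch of what it produces, using a canned `docker info` output as a stand-in (on a real node you would pipe the actual `docker info`):

```shell
# Simulated `docker info` output; on a real node this comes from the daemon.
docker_info_output="Containers: 3
Name: worker2
Server Version: 18.06.1-ce"

# Same pipeline as HOSTNAME_COMMAND in the .env file:
hostname=$(printf '%s\n' "$docker_info_output" | grep ^Name: | cut -d ' ' -f 2)
echo "$hostname"                      # worker2
echo "OUTSIDE://${hostname}:9094"     # OUTSIDE://worker2:9094
```

So each broker should advertise its own node's hostname on the OUTSIDE listener, which matches the metadata shown above.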
I'm still looking for a solution to this.
Answer 0 (score: 1)
Does it start working if you scale ZooKeeper down to a single instance? In my experience, if you want to run ZooKeeper in high-availability mode, you need to list the entire quorum explicitly in the ZooKeeper connection string, which doesn't work with replicated services in Docker. So either run only a single ZooKeeper node, or define a separate service for each node in the compose file (i.e. zookeeper1, zookeeper2, zookeeper3) and list them all in the ZooKeeper connection variable, i.e. KAFKA_ZOOKEEPER_CONNECT=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
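A sketch of that layout, assuming the official `zookeeper` image's `ZOO_MY_ID`/`ZOO_SERVERS` variables (the service names `zookeeper1`..`zookeeper3` are illustrative, not from the original compose file):

```yaml
# Hypothetical fragment: one service per ZooKeeper node instead of
# a single replicated service, so each member has a stable DNS name.
services:
  zookeeper1:
    image: zookeeper:3.4.13
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zookeeper1:2888:3888 server.2=zookeeper2:2888:3888 server.3=zookeeper3:2888:3888
    networks:
      - workshop
  # zookeeper2 and zookeeper3 are identical except for ZOO_MY_ID (2 and 3)
```

With that in place, the Kafka env file would point at the full ensemble: `KAFKA_ZOOKEEPER_CONNECT=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181`.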
You could try the tasks.zookeeper DNSRR address, but my experience is that it does not reliably resolve to the list of containers behind the service.
FYI, running two ZooKeeper nodes buys you nothing: ZooKeeper needs more than half of the ensemble to be up, so you need at least three nodes before you get any fault tolerance.
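The quorum arithmetic behind that remark can be sketched as follows (standard majority-quorum math, not specific to this setup): an ensemble of n nodes needs floor(n/2)+1 members up, and tolerates the rest failing.

```shell
# Majority (quorum) size for an ensemble of n nodes is n/2 + 1 (integer
# division); fault tolerance is whatever is left over.
for n in 1 2 3 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  echo "ensemble=$n quorum=$quorum tolerates=$tolerated failures"
done
```

Note that n=2 tolerates zero failures, the same as a single node, which is why two ZooKeeper replicas gain you nothing over one.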