Storm - Supervisors start but do not connect to Nimbus

Date: 2017-01-25 19:03:03

Tags: apache-storm apache-zookeeper

I have a Storm cluster with 1 Nimbus, 4 Supervisors, and 2 Zookeeper nodes. My storm.yaml is as follows:

storm.zookeeper.servers:
    - "storage14"
    - "storage15"

nimbus.seeds: ["storage01"]

#storm.local.hostname: "storage05"
supervisor.supervisors:
    - "storage02"
    - "storage03"
    - "storage04"
    - "storage05"

storm.local.dir: "/tmp/storm"

worker.childopts: "-Xmx%HEAP-MEM%m -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"

This storm.yaml file is used by both Nimbus and the Supervisors. When Nimbus is started, I leave storm.local.hostname commented out, as shown above.

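However, when starting a Supervisor on an individual node, I uncomment storm.local.hostname and set it to the hostname of the node on which the Supervisor is being started. For example, if I start a Supervisor on storage05, the storm.yaml file has the following additional configuration parameter: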

storm.local.hostname: "storage05"

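The problem is that even though Nimbus starts successfully and I can see it in the Storm UI, some of the Supervisors do not seem to be able to connect to Nimbus. For example, of the 4 nodes on which I start Supervisors, the Storm UI usually shows only 2 as connected. However, if I ssh into those nodes and run jps, I can see that the Supervisor process is running on all of them.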

The Supervisors that end up connected are not always on the same nodes, so the problem is certainly not tied to specific nodes.

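Another thing to note is that if I try to submit a topology from any of the connected nodes, it is not registered by the cluster, and I cannot see the topology in the UI either.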

What do you think could be causing this erratic behavior?

UPDATE: The tail of nimbus.log contains the following lines:

2017-01-25 00:04:25.216 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.317 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage15/192.168.140.195:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.317 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.686 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage15/192.168.140.195:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.686 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-01-25 00:04:25.787 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server storage14/192.168.140.194:2181. Will not attempt to authenticate using SASL (unknown error)
2017-01-25 00:04:25.787 o.a.s.s.o.a.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:701)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

1 Answer:

Answer 0 (score: 0)

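Your UPDATE (the Nimbus log) indicates that your Nimbus cannot connect to the Zookeeper cluster. Please check whether the Zookeeper cluster (storage14 / storage15) is reachable from storage01. It is not enough that the nodes themselves are reachable; you should also be able to open a connection to the Zookeeper server port with "telnet storage14 2181" (and/or the same for storage15).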
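If telnet is not installed on the node, the same port check can be sketched in a few lines of Python. This is only a minimal illustration, not part of Storm or Zookeeper: the helper names `can_connect` and `check_servers` are hypothetical, and the host names and port 2181 are taken from the storm.yaml and the nimbus.log excerpt in the question.

```python
import socket

# Assumed values: host names from the question's storm.yaml,
# port 2181 as seen in the nimbus.log lines.
ZOOKEEPER_SERVERS = ["storage14", "storage15"]
ZOOKEEPER_PORT = 2181


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_servers(servers, port, timeout=3.0):
    """Map each host name to whether its port accepted a connection."""
    return {host: can_connect(host, port, timeout) for host in servers}


if __name__ == "__main__":
    for host, ok in check_servers(ZOOKEEPER_SERVERS, ZOOKEEPER_PORT).items():
        status = "reachable" if ok else "NOT reachable"
        print(f"{host}:{ZOOKEEPER_PORT} -> {status}")
```

Running this on storage01 and on each Supervisor node shows which machines can open the port. A "Connection refused" error, as in the log above, usually means the host answered but nothing accepted the connection on that port, so it is also worth confirming that the Zookeeper process is actually running on storage14 and storage15.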

Once the ZK connection problem is gone, try starting the Supervisors again.