Apache Storm supervisor dies when trying to run a topology in a multi-node environment

Time: 2017-11-12 04:44:18

Tags: submit apache-zookeeper apache-storm supervisor topology

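I am running the simplest possible ZooKeeper / Storm / topology / multi-node test.

Everything works, but the supervisors die at the final stage (when the Storm supervisor tries to run the topology).

I have 3 VM hosts (ubuntu-16.04.2).

Each host has the same environment (including dependencies):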

- zookeeper-3.4.10

- apache-storm-1.1.1

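The names of the three hosts:

storm-nimbus: this is the nimbus host.

storm-sv-1: this is the first supervisor.

storm-sv-2: this is the second supervisor.

All three hosts have the same section in /etc/hosts: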

192.168.3.132 zk1.nf.dev st1.nf.dev
192.168.3.130 zk2.nf.dev st2.nf.dev
192.168.3.131 zk3.nf.dev st3.nf.dev
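
To see which names this section actually covers, the entries can be parsed and compared against the names the cluster uses. The following is a minimal sketch, not part of the original setup: `parse_hosts` and `missing_names` are hypothetical helpers, and the text being parsed is only the portion of /etc/hosts shown above (the real file may contain more entries).

```python
def parse_hosts(text):
    """Map every hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def missing_names(text, wanted):
    """Return the names from `wanted` that the hosts text does not map."""
    known = parse_hosts(text)
    return [name for name in wanted if name not in known]


# Only the /etc/hosts section quoted in the question.
HOSTS_SECTION = """\
192.168.3.132 zk1.nf.dev st1.nf.dev
192.168.3.130 zk2.nf.dev st2.nf.dev
192.168.3.131 zk3.nf.dev st3.nf.dev
"""

if __name__ == "__main__":
    # The nimbus.seeds alias plus the three machine hostnames listed earlier.
    wanted = ["st1.nf.dev", "storm-nimbus", "storm-sv-1", "storm-sv-2"]
    print(missing_names(HOSTS_SECTION, wanted))
```

With the section shown, the machine hostnames storm-nimbus, storm-sv-1 and storm-sv-2 come back as unmapped, while the zkN/stN aliases are all covered.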

zoo.cfg

...
dataDir=/home/test/1/zookeeper/data/

server.1=zk1.nf.dev:2888:3888
server.2=zk2.nf.dev:2888:3888
server.3=zk3.nf.dev:2888:3888
...

storm.yaml

...
storm.zookeeper.servers:
     - "zk2.nf.dev"
     - "zk3.nf.dev"

nimbus.seeds: ["st1.nf.dev"]

storm.local.dir: "/home/test/1/storm-local"
...
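
As a side check on the two configuration fragments, the ensemble members in zoo.cfg can be compared with the hosts listed under storm.zookeeper.servers. This is only a sketch under assumptions: `ensemble_hosts` and `not_listed` are hypothetical helper names, the elided parts of both files are ignored, and the storm.yaml list is copied by hand rather than parsed. Listing a subset of the ensemble is generally workable for a client, so a difference here is something to be aware of rather than a confirmed cause of the failure.

```python
import re

# Matches lines such as "server.1=zk1.nf.dev:2888:3888".
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):\d+:\d+", re.MULTILINE)


def ensemble_hosts(zoo_cfg_text):
    """Return the hostnames from the server.N lines of zoo.cfg text."""
    return [host for _server_id, host in SERVER_LINE.findall(zoo_cfg_text)]


def not_listed(zoo_cfg_text, storm_zk_servers):
    """Ensemble members that storm.zookeeper.servers does not mention."""
    return [h for h in ensemble_hosts(zoo_cfg_text) if h not in storm_zk_servers]


# The visible part of zoo.cfg from the question.
ZOO_CFG = """\
dataDir=/home/test/1/zookeeper/data/

server.1=zk1.nf.dev:2888:3888
server.2=zk2.nf.dev:2888:3888
server.3=zk3.nf.dev:2888:3888
"""

# Copied by hand from storm.zookeeper.servers in storm.yaml.
STORM_ZK_SERVERS = ["zk2.nf.dev", "zk3.nf.dev"]

if __name__ == "__main__":
    print(not_listed(ZOO_CFG, STORM_ZK_SERVERS))
```

For the fragments shown, the only ensemble member missing from storm.zookeeper.servers is zk1.nf.dev.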

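--- Test steps ---

1) Run the ZooKeeper server on all three hosts. Check the ZooKeeper state with zkCli.sh. All 3 ZooKeeper nodes are OK.

2) Run storm ui on the nimbus host (192.168.3.132).

3) Run storm nimbus on the nimbus host (192.168.3.132).

4) Verify the nimbus status on the UI page (http://192.168.3.132:8080/). It is OK.

5) Submit the wordcount topology on the nimbus host (192.168.3.132).

Verify the topology status on the UI page. It is OK.

6) Run storm supervisor on the supervisor hosts (192.168.3.130, 192.168.3.131).

7) Verify the supervisor and topology status on the UI page.

  • Both supervisors are shown on the UI page: OK

  • On each supervisor page:

    • "Slots" and "Avail slots" are not 0: OK

    • "Used slots" is always 0: this is the problem.

8) After 2 minutes, the supervisors die.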

[supervisor.log]

Caused by: java.net.UnknownHostException: storm-nimbus
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184) ~[?:1.8.0_151]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_151]
    at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_151]
    at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.1.jar:1.1.1]
    ... 13 more
2017-11-11 19:33:40.991 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Failed to download basic resources for topology-id WordCount-1-1510457397
2017-11-11 19:33:40.992 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /home/test/1/storm-local/supervisor/tmp/f645cbdf-c2d2-493f-917b-7d2e82e84ef5
2017-11-11 19:33:41.019 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /home/test/1/storm-local/supervisor/stormdist/WordCount-1-1510457397
2017-11-11 19:33:41.023 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Caught Exception While Downloading (rethrowing)... 
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [st1.nf.dev]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
    at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:111) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
2017-11-11 19:33:41.027 o.a.s.d.s.Slot SLOT_6700 [ERROR] Error when processing event
java.util.concurrent.ExecutionException: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [st1.nf.dev]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_151]
    at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_151]
    at org.apache.storm.localizer.LocalDownloadedResource$NoCancelFuture.get(LocalDownloadedResource.java:63) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBasicLocalization(Slot.java:413) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:273) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:741) ~[storm-core-1.1.1.jar:1.1.1]
Caused by: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [st1.nf.dev]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
    at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:111) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
2017-11-11 19:33:41.027 o.a.s.u.Utils SLOT_6700 [ERROR] Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
    at org.apache.storm.utils.Utils.exitProcess(Utils.java:1773) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:774) ~[storm-core-1.1.1.jar:1.1.1]
2017-11-11 19:33:41.032 o.a.s.d.s.Supervisor Thread-5 [INFO] Shutting down supervisor ce5768f3-787d-4e27-9bb0-857bb1015139
2017-11-11 19:33:41.036 o.a.s.e.EventManagerImp Thread-4 [INFO] Event manager interrupted
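
Note that the first `Caused by:` line in this excerpt names a different host (storm-nimbus) than the seed host in the later NimbusLeaderNotFoundException (st1.nf.dev). One plausible reading, not confirmed anywhere in this thread, is that the leader Nimbus is advertised under its machine hostname and the supervisor cannot resolve that name. A small sketch for pulling such names out of a supervisor.log follows; `unresolved_hosts` is a hypothetical helper and the sample is shortened from the log above.

```python
import re

# Matches the exception line format seen in the log excerpt.
UNKNOWN_HOST = re.compile(r"java\.net\.UnknownHostException: (\S+)")


def unresolved_hosts(log_text):
    """Return the distinct hostnames reported by UnknownHostException, in order."""
    seen = []
    for match in UNKNOWN_HOST.finditer(log_text):
        host = match.group(1)
        if host not in seen:
            seen.append(host)
    return seen


# Two lines taken from the supervisor.log excerpt.
SAMPLE = """\
Caused by: java.net.UnknownHostException: storm-nimbus
    at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_151]
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [st1.nf.dev].
"""

if __name__ == "__main__":
    print(unresolved_hosts(SAMPLE))
```

On the sample this reports storm-nimbus only, which is the name worth checking in name resolution on the supervisor hosts.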

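I cleared the ZooKeeper data and the Storm temporary data before every test.

How can I resolve the following error message: "Could not find leader nimbus from seed hosts [st1.nf.dev]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?"

Pinging 'st1.nf.dev' works. Why can't the supervisor find 'st1.nf.dev'?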
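
Since ping only proves that one name resolves, it can help to check every name the supervisor may be asked to connect to. The sketch below uses Python's standard `socket.gethostbyname`, which goes through the system resolver (including /etc/hosts); `resolve_all` is a hypothetical helper name, and which names to pass in is an assumption based on the hostnames and log lines in this question.

```python
import socket


def resolve_all(names):
    """Return {name: IPv4 address, or None if the system resolver fails}."""
    results = {}
    for name in names:
        try:
            results[name] = socket.gethostbyname(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            results[name] = None
    return results
```

For example, calling `resolve_all(["st1.nf.dev", "storm-nimbus"])` on each supervisor host would show whether both the seed alias and the name from the UnknownHostException line resolve there. A `None` entry means that name is missing from name resolution on that machine.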

1 Answer:

Answer 0 (score: 0)

  1. Stop Storm.

  2. Connect to ZooKeeper from the command line and remove the /storm znode:

    ..path to zookeeper bin/zkCli.sh 
    rmr /storm 
    quit 
    
  3. Restart Storm.