Exception after submitting a topology

Time: 2015-09-16 15:26:35

Tags: apache-storm apache-zookeeper

I am new to Storm. I tried to submit a topology and ran into errors on the supervisor. I found this in the worker's log file:

 [ERROR] Async loop died!
java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
    at backtype.storm.drpc.DRPCInvocationsClient.<init>(DRPCInvocationsClient.java:23)
    at backtype.storm.drpc.DRPCSpout.open(DRPCSpout.java:69)
    at storm.trident.spout.RichSpoutBatchTriggerer.open(RichSpoutBatchTriggerer.java:41)
    at backtype.storm.daemon.executor$fn__3985$fn__3997.invoke(executor.clj:460)
    at backtype.storm.util$async_loop$fn__465.invoke(util.clj:375)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused

The supervisor's log file:

supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:54 supervisor [ERROR] Error when processing event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:72)
    at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:74)
    at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:353)
    at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:149)
    at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:138)
    at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85)
    at com.netflix.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:134)
    at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:125)
    at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:34)
    at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:78)
    at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:88)
    at backtype.storm.cluster$mk_distributed_cluster_state$reify__1996.set_ephemeral_node(cluster.clj:54)
    at backtype.storm.cluster$mk_storm_cluster_state$reify__2415.supervisor_heartbeat_BANG_(cluster.clj:300)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
    at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)

This is also in the supervisor's log file:

   at java.lang.Thread.run(Unknown Source)
2015-09-15 02:00:54 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:55 ClientCnxn [INFO] Client session timed out, have not heard from server in 20020ms for sessionid 0x14fce3996380015, closing socket connection and attempting reconnect
2015-09-15 02:00:58 ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181
2015-09-15 02:00:58 ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2015-09-15 02:00:59 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:01:01 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:59 util [INFO] Halting process: ("Error when processing an event")

1 answer:

Answer 0 (score: 0)

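There are many possible causes for this problem.

  1. ZooKeeper is not started.
  2. The CPU peaked for a while, so heartbeats were not sent before the timeout; Nimbus then assumed the supervisor was dead and disconnected it.
  3. The worker timeout is too short. The default may be 10 seconds; try changing it to 600 or more. This is closely related to #2.
  4. Make sure Nimbus is working properly.
  5. worker.childopts is incorrect, meaning the memory settings are wrong. Change Xmx and MaxPermSize and try again.
  6. If you start Storm with WinRM or PowerShell, the default memory may not be enough, because the default is only 1024M. Set it higher, for example 2048M, and try again.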
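Before tuning timeouts or memory, it may help to rule out the simplest causes (items 1 and 4) by checking whether the relevant ports accept connections at all. The following is a minimal sketch, not part of the original answer. The ZooKeeper address `localhost:2181` comes from the supervisor log above; the DRPC invocations port `3773` and the Nimbus Thrift port `6627` are assumed Storm defaults and may be overridden in your `storm.yaml`. The helper names `port_is_open` and `check_endpoints` are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unresolved host names.
        return False


# Assumed endpoints: 2181 is taken from the supervisor log; 3773 and 6627
# are assumed defaults for drpc.invocations.port and nimbus.thrift.port.
# Adjust the hosts and ports to match your storm.yaml.
ENDPOINTS = {
    "zookeeper": ("localhost", 2181),
    "drpc-invocations": ("localhost", 3773),
    "nimbus": ("localhost", 6627),
}


def check_endpoints(endpoints):
    """Map each endpoint name to True (reachable) or False (not reachable)."""
    return {
        name: port_is_open(host, port)
        for name, (host, port) in endpoints.items()
    }


if __name__ == "__main__":
    for name, ok in check_endpoints(ENDPOINTS).items():
        status = "reachable" if ok else "refused or timed out"
        print(f"{name}: {status}")
```

If the DRPC invocations port is not reachable from the worker host, that would be consistent with the `Connection refused` thrown in `DRPCInvocationsClient`; if ZooKeeper is reachable but the session still times out, items 2 and 3 become more likely candidates.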