Spark Master not working with DSE 4.7 and OpsCenter 5.1.3

Time: 2015-06-13 22:03:35

Tags: cassandra apache-spark datastax datastax-enterprise

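I recently upgraded DataStax Enterprise from 4.6.3 to 4.7, and now I cannot run Spark. The problem appears to be that the Spark Master is not configured correctly. I am using OpsCenter 5.1.3 and launched a three-node Analytics cluster. Oddly, the nodes were initially set up with SPARK_ENABLED = 0, and I had to set it to 1 manually. Even so, the Spark Master is still not configured correctly. In /var/log/cassandra/system.log I get a long run of output like this: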

[SPARK-WORKER-INIT-0] 2015-06-13 21:59:54,027  SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
INFO  [SPARK-WORKER-INIT-0] 2015-06-13 21:59:55,028  SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
INFO  [SPARK-WORKER-INIT-0] 2015-06-13 21:59:56,028  SparkWorkerRunner.java:49 - Spark Master not ready at (no configured master)
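
Since the SPARK_ENABLED flag had to be flipped by hand on every node, a quick check that each node really carries the intended value can rule out one cause. The following is only a minimal sketch: it assumes the flag lives in a shell-style defaults file (on package installs this is typically /etc/default/dse, which is an assumption here and not stated in the question), and the helper name `read_spark_enabled` and the sample text are hypothetical.

```python
def read_spark_enabled(defaults_text):
    """Return the integer value of SPARK_ENABLED, or None if it is not set."""
    value = None
    for raw in defaults_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line.startswith("export "):
            line = line[len("export "):].strip()
        key, sep, val = line.partition("=")
        if sep and key.strip() == "SPARK_ENABLED":
            # Keep overwriting so the last assignment wins, as it would in a shell.
            value = int(val.strip().strip("\"'"))
    return value


# Hypothetical excerpt of a defaults file; not copied from a real node.
sample = """# node type flags
HADOOP_ENABLED=0
SOLR_ENABLED=0
SPARK_ENABLED=1
"""

print(read_spark_enabled(sample))
```

Running the same check against the file contents from each of the three nodes would show whether any node still carries the original value of 0.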

When I try to run dse spark, I get the following error:

java.io.IOException: Spark Master address cannot be retrieved. This really should not be happening with DSE 4.7+ unless your cluster is over 50% down or booted up in the last minute.
    at com.datastax.bdp.plugin.SparkPlugin.getMasterAddress(SparkPlugin.java:257)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBe

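My Analytics DC has been up for several days, and no nodes have been newly started. This problem has blocked development for the past few days, and I am considering downgrading to DSE 4.6.3 so that I can run my Spark jobs again. Any help is appreciated.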

Update:

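I have been looking into the condition that 50% of the Analytics nodes must be up before the Spark Master can start. After checking system.log at DSE startup, I noticed that Gossip still seems to consider some old nodes to be part of the cluster and reports them as DOWN. For example: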

INFO  [GossipStage:1] 2015-06-14 03:18:05,587  Gossiper.java:968 - InetAddress /172.31.23.17 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,614  Gossiper.java:968 - InetAddress /172.31.16.58 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,647  Gossiper.java:968 - InetAddress /172.31.24.25 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,687  Gossiper.java:968 - InetAddress /172.31.24.147 is now DOWN

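These are nodes that I took offline earlier. I have already removed them from the system.peers table, but Gossip still seems to treat them as part of the cluster. The phantom presence of these nodes would put the cluster at more than 50% down. However, purging the gossip state requires a full cluster shutdown.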
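
The arithmetic behind that suspicion can be sketched as follows. This is only an illustration under stated assumptions: the three live nodes are the three Analytics nodes from the question, the four DOWN addresses from the Gossiper excerpt are counted against the same group, and the rule is simply "more than half down", as the wording of the error message suggests. How DSE actually evaluates the condition is not stated here, and the helper names are hypothetical.

```python
import re

# Matches Gossiper lines such as "InetAddress /172.31.23.17 is now DOWN".
DOWN_RE = re.compile(r"InetAddress /(\S+) is now DOWN")


def down_addresses(log_text):
    """Collect the distinct addresses that Gossiper reported as DOWN."""
    return sorted(set(DOWN_RE.findall(log_text)))


def down_fraction(live_nodes, down_nodes):
    """Fraction of all known nodes (live plus down) that are down."""
    total = live_nodes + down_nodes
    return down_nodes / total if total else 0.0


# The four Gossiper lines quoted in the question.
log_excerpt = """\
INFO  [GossipStage:1] 2015-06-14 03:18:05,587  Gossiper.java:968 - InetAddress /172.31.23.17 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,614  Gossiper.java:968 - InetAddress /172.31.16.58 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,647  Gossiper.java:968 - InetAddress /172.31.24.25 is now DOWN
INFO  [GossipStage:1] 2015-06-14 03:18:05,687  Gossiper.java:968 - InetAddress /172.31.24.147 is now DOWN
"""

down = down_addresses(log_excerpt)
# Assumption: 3 live nodes, taken from the three-node Analytics cluster in the question.
frac = down_fraction(live_nodes=3, down_nodes=len(down))
print(len(down), round(frac, 3))  # prints: 4 0.571
```

Under these assumptions, four phantom nodes against three live ones means 4 of 7 known nodes are down, which is above the 50% mark named in the error message.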

0 answers:

No answers