Connecting remotely to a Spark cluster

Time: 2016-11-30 17:47:26

Tags: java apache-spark bigdata

I am trying to connect from my local system to the Spark master node (a remote cluster node) using a Java program. I am using the following API to connect:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf conf = new SparkConf().setAppName("WorkCountApp").setMaster("spark://masterIP:7077");
    JavaSparkContext sc = new JavaSparkContext(conf);

My program attempts to connect to the master but fails after a while. Below is the stack trace:

    16/11/30 17:40:26 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@ec2-54-202-212-141.us-west-2.compute.amazonaws.com:7077/user/Master...
    16/11/30 17:40:46 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
    16/11/30 17:40:46 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
    16/11/30 17:40:46 INFO SparkUI: Stopped Spark web UI at http://172.31.11.1:4040
    16/11/30 17:40:46 INFO DAGScheduler: Stopping DAGScheduler
    16/11/30 17:40:46 INFO SparkDeploySchedulerBackend: Shutting down all executors
    16/11/30 17:40:46 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
    16/11/30 17:40:46 ERROR OneForOneStrategy: 
    java.lang.NullPointerException
    java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
        at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)

Please help me with this.

1 Answer:

Answer 0 (score: 0):

There are many reasons a connection can fail, but in this case it looks like no workers have been started for this Spark master.

On the remote machine, you need to start both the Spark master and a Spark worker (slave), as sketched below.
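
A minimal sketch, assuming a standalone Spark installation in its default layout on the remote machine, with masterIP standing in for the master host's address as in the question:

    # On the remote machine, from the Spark installation directory:
    ./sbin/start-master.sh                        # start the master (listens on port 7077 by default)
    ./sbin/start-slave.sh spark://masterIP:7077   # start a worker and register it with that master

Once both are running, the master's web UI (by default at http://masterIP:8080) should list the worker; only then can an application that connects to spark://masterIP:7077 acquire executors.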