Why can't I submit my DSE Spark application to the cluster? I can run it on my local machine, but not on the cluster.
Here is the command I run. Every time I run it, it tells me it cannot connect to Akka, and I don't know why:
dse spark-submit --master spark://localhost:7077 --executor-memory 10G --total-executor-cores 4 --driver-memory 1G --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1 --jars /root/spark-streaming-kafka_2.10-1.4.1.jar /root/pythonspark/com/spark/toutiaospark.py appname source
Here is the error message:
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/usr/share/dse/spark/lib/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-streaming-kafka_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.apache.spark#spark-streaming-kafka_2.10;1.4.1 in central
found org.apache.kafka#kafka_2.10;0.8.2.1 in central
found com.yammer.metrics#metrics-core;2.2.0 in central
found org.slf4j#slf4j-api;1.7.10 in central
found org.apache.kafka#kafka-clients;0.8.2.1 in central
found net.jpountz.lz4#lz4;1.2.0 in central
found org.xerial.snappy#snappy-java;1.1.1.7 in central
found com.101tec#zkclient;0.3 in central
found log4j#log4j;1.2.17 in central
found org.spark-project.spark#unused;1.0.0 in central
:: resolution report :: resolve 469ms :: artifacts dl 14ms
:: modules in use:
com.101tec#zkclient;0.3 from central in [default]
com.yammer.metrics#metrics-core;2.2.0 from central in [default]
log4j#log4j;1.2.17 from central in [default]
net.jpountz.lz4#lz4;1.2.0 from central in [default]
org.apache.kafka#kafka-clients;0.8.2.1 from central in [default]
org.apache.kafka#kafka_2.10;0.8.2.1 from central in [default]
org.apache.spark#spark-streaming-kafka_2.10;1.4.1 from central in [default]
org.slf4j#slf4j-api;1.7.10 from central in [default]
org.spark-project.spark#unused;1.0.0 from central in [default]
org.xerial.snappy#snappy-java;1.1.1.7 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 10 | 0 | 0 | 0 || 10 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 10 already retrieved (0kB/12ms)
WARN 2016-02-29 12:38:48 org.apache.spark.deploy.client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@localhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@localhost:7077
WARN 2016-02-29 12:38:48 Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /localhost:7077
WARN 2016-02-29 12:39:08 org.apache.spark.deploy.client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@localhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@localhost:7077
WARN 2016-02-29 12:39:08 Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /localhost:7077
WARN 2016-02-29 12:39:28 org.apache.spark.deploy.client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@localhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@localhost:7077
WARN 2016-02-29 12:39:28 Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /localhost:7077
ERROR 2016-02-29 12:39:48 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
WARN 2016-02-29 12:39:48 org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend: Application ID is not initialized yet.
WARN 2016-02-29 12:39:48 org.apache.spark.deploy.client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@localhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@localhost:7077
ERROR 2016-02-29 12:39:48 akka.actor.OneForOneStrategy: null
java.lang.NullPointerException: null
at org.apache.spark.deploy.client.AppClient$ClientActor$$anonfun$receiveWithLogging$1.applyOrElse(AppClient.scala:160) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) ~[scala-library-2.10.5.jar:na]
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) ~[scala-library-2.10.5.jar:na]
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) ~[scala-library-2.10.5.jar:na]
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) ~[scala-library-2.10.5.jar:na]
at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.10-2.3.4-spark.jar:na]
at org.apache.spark.deploy.client.AppClient$ClientActor.aroundReceive(AppClient.scala:61) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) [akka-actor_2.10-2.3.4-spark.jar:na]
at akka.actor.ActorCell.invoke(ActorCell.scala:487) [akka-actor_2.10-2.3.4-spark.jar:na]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) [akka-actor_2.10-2.3.4-spark.jar:na]
at akka.dispatch.Mailbox.run(Mailbox.scala:220) [akka-actor_2.10-2.3.4-spark.jar:na]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.10-2.3.4-spark.jar:na]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.10.5.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.10.5.jar:na]
WARN 2016-02-29 12:39:48 Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /localhost:7077
ERROR 2016-02-29 12:39:48 org.apache.spark.SparkContext: Error initializing SparkContext.
java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1504) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2032) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at org.apache.spark.SparkContext.<init>(SparkContext.scala:543) ~[spark-core_2.10-1.4.2.2.jar:1.4.2.2]
at com.datastax.bdp.spark.DseSparkContext$.apply(DseSparkContext.scala:42) [dse-spark-4.8.4.jar:4.8.4]
at com.datastax.bdp.spark.DseSparkContext.apply(DseSparkContext.scala..
Answer (score: 1)
The important information is:
WARN 2016-02-29 12:39:48 Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /localhost:7077
ERROR 2016-02-29 12:39:48 org.apache.spark.SparkContext: Error initializing SparkContext.
This is telling you that your Spark Master is unreachable at localhost (most likely because the master is bound to a different address on this machine). By default, the Spark Master binds to the C* listen address. The simplest solution is not to specify --master in your launch command at all; DSE will set the Spark Master for you automatically:
dse spark-submit --executor-memory 10G --total-executor-cores 4 --driver-memory 1G --packages org.apache.spark:spark-streaming-kafka_2.10:1.4.1 /root/pythonspark/com/spark/toutiaospark.py appname source
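For completeness, below is a minimal sketch of how the driver script can build its context without hard-coding a master, so the same code works wherever dse spark-submit launches it. This is only an assumption about what toutiaospark.py looks like; the ZooKeeper address, topic handling, and argument names are hypothetical. The key point is that SparkConf sets only the application name and leaves spark.master to whatever dse spark-submit injects:

import sys

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

if __name__ == "__main__":
    # "appname" and "source" are the two arguments passed on the
    # spark-submit command line shown above.
    app_name, source = sys.argv[1], sys.argv[2]

    # No setMaster() call here: dse spark-submit supplies the correct
    # spark://<C* listen address>:7077 master for the cluster.
    conf = SparkConf().setAppName(app_name)
    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, batchDuration=10)

    # Hypothetical Kafka receiver; replace the ZooKeeper quorum and the
    # topic/partition mapping with whatever the real job consumes from.
    stream = KafkaUtils.createStream(ssc, "zk-host:2181", app_name, {source: 1})
    stream.map(lambda kv: kv[1]).pprint()

    ssc.start()
    ssc.awaitTermination()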