Unable to connect to Spark master

Date: 2015-12-29 20:34:56

Tags: scala apache-spark datastax

I start my DataStax Cassandra instance with Spark enabled:

dse cassandra -k

Then I run this program (from inside Eclipse):

import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object Start {

  def main(args: Array[String]): Unit = {
    println("***** 1 *****")
    val sparkConf = new SparkConf().setAppName("Start").setMaster("spark://127.0.0.1:7077")
    println("***** 2 *****")
    val sparkContext = new SparkContext(sparkConf)
    println("***** 3 *****")
  }
}

I get the following output:

***** 1 *****
***** 2 *****
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/29 15:27:50 INFO SparkContext: Running Spark version 1.5.2
15/12/29 15:27:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/29 15:27:51 INFO SecurityManager: Changing view acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: Changing modify acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nayan); users with modify permissions: Set(nayan)
15/12/29 15:27:52 INFO Slf4jLogger: Slf4jLogger started
15/12/29 15:27:52 INFO Remoting: Starting remoting
15/12/29 15:27:53 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.1.88:55126]
15/12/29 15:27:53 INFO Utils: Successfully started service 'sparkDriver' on port 55126.
15/12/29 15:27:53 INFO SparkEnv: Registering MapOutputTracker
15/12/29 15:27:53 INFO SparkEnv: Registering BlockManagerMaster
15/12/29 15:27:53 INFO DiskBlockManager: Created local directory at /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/blockmgr-21a96671-c33e-498c-83a4-bb5c57edbbfb
15/12/29 15:27:53 INFO MemoryStore: MemoryStore started with capacity 983.1 MB
15/12/29 15:27:53 INFO HttpFileServer: HTTP File server directory is /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/spark-fce0a058-9264-4f2c-8220-c32d90f11bd8/httpd-2a0efcac-2426-49c5-982a-941cfbb48c88
15/12/29 15:27:53 INFO HttpServer: Starting HTTP Server
15/12/29 15:27:53 INFO Utils: Successfully started service 'HTTP file server' on port 55127.
15/12/29 15:27:53 INFO SparkEnv: Registering OutputCommitCoordinator
15/12/29 15:27:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/12/29 15:27:53 INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
15/12/29 15:27:54 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/12/29 15:27:54 INFO AppClient$ClientEndpoint: Connecting to master spark://127.0.0.1:7077...
15/12/29 15:27:54 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@127.0.0.1:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 
15/12/29 15:28:14 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1f22aef0 rejected from java.util.concurrent.ThreadPoolExecutor@176cb4af[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]

So something is going wrong during creation of the SparkContext.

When I look in $DSE_HOME/logs/spark, it is empty. Not sure where else to look.

2 answers:

Answer 0 (score: 2):

It turns out the problem was the Spark library version and the Scala version. DataStax was running Spark 1.4.1 and Scala 2.10.5, while my Eclipse project was using 1.5.2 and 2.11.7, respectively.

Note that both the Spark library and the Scala version apparently have to match. I tried other combinations, and it only worked when both matched.
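As a sketch of the fix (assuming an sbt build, which the answer does not state; the versions are the ones named above), the driver project would pin Scala and Spark to exactly what the cluster runs:

```scala
// build.sbt — hypothetical example: pin the versions DSE ships.
// The cluster runs Spark 1.4.1 on Scala 2.10, so the driver must match both.
scalaVersion := "2.10.5"

// %% appends the Scala binary version (_2.10) to the artifact name,
// so the Spark jars are compiled against the same Scala as the project.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
```

With mismatched versions the driver and master exchange incompatible wire messages, which is consistent with the master silently dropping the connection ("Disassociated") in the log above.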

Answer 1 (score: 0):

I am very familiar with this part of the error you posted:

WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://...

It can have many causes, almost all related to misconfigured IPs. First I would do what zero323 said; then, here are my two cents: I recently solved my own problem by using IP addresses rather than hostnames, and the only configuration I use in a simple standalone cluster is SPARK_MASTER_IP.

SPARK_MASTER_IP in the master's $SPARK_HOME/conf/spark-env.sh should lead the master's web UI to show the IP address you set:

spark://your.ip.address.numbers:7077

and your SparkConf setup can then refer to that address.
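As a sketch (assuming the machine's routable address is 10.0.1.88, the address the driver is shown listening on in the log above), the driver's SparkConf would point at the same address the master advertises:

```scala
import org.apache.spark.SparkConf

// Hypothetical example: spark-env.sh on the master sets SPARK_MASTER_IP=10.0.1.88,
// so the master's web UI shows spark://10.0.1.88:7077. The driver must then use
// that exact URL — not 127.0.0.1, and not a hostname that may resolve differently.
val sparkConf = new SparkConf()
  .setAppName("Start")
  .setMaster("spark://10.0.1.88:7077")
```

The point is that the master compares the address a driver connects with against the one it bound to; any mismatch tends to surface as the "Disassociated" warning above.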

That said, I am not familiar with your specific setup, but I notice two things in the error output. First:

/private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/

Did you look there to see whether there is a log directory? Is that where $DSE_HOME points? Alternatively, connect to the driver where it created the web UI:

INFO SparkUI: Started SparkUI at http://10.0.1.88:4040

You should see a link to the error logs somewhere in there.

For more on IP vs. hostname: this very old bug is marked as Resolved, but I have not figured out what Resolved means there, so I just lean toward IP addresses.