I start my DataStax Cassandra instance with Spark enabled:
dse cassandra -k
Then I run this program (from inside Eclipse):
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object Start {
  def main(args: Array[String]): Unit = {
    // numbered markers trace how far execution gets before the failure
    println("***** 1 *****")
    val sparkConf = new SparkConf().setAppName("Start").setMaster("spark://127.0.0.1:7077")
    println("***** 2 *****")
    // the failure happens here, while the SparkContext is being created
    val sparkContext = new SparkContext(sparkConf)
    println("***** 3 *****")
  }
}
I get the following output:
***** 1 *****
***** 2 *****
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/29 15:27:50 INFO SparkContext: Running Spark version 1.5.2
15/12/29 15:27:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/29 15:27:51 INFO SecurityManager: Changing view acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: Changing modify acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nayan); users with modify permissions: Set(nayan)
15/12/29 15:27:52 INFO Slf4jLogger: Slf4jLogger started
15/12/29 15:27:52 INFO Remoting: Starting remoting
15/12/29 15:27:53 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.1.88:55126]
15/12/29 15:27:53 INFO Utils: Successfully started service 'sparkDriver' on port 55126.
15/12/29 15:27:53 INFO SparkEnv: Registering MapOutputTracker
15/12/29 15:27:53 INFO SparkEnv: Registering BlockManagerMaster
15/12/29 15:27:53 INFO DiskBlockManager: Created local directory at /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/blockmgr-21a96671-c33e-498c-83a4-bb5c57edbbfb
15/12/29 15:27:53 INFO MemoryStore: MemoryStore started with capacity 983.1 MB
15/12/29 15:27:53 INFO HttpFileServer: HTTP File server directory is /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/spark-fce0a058-9264-4f2c-8220-c32d90f11bd8/httpd-2a0efcac-2426-49c5-982a-941cfbb48c88
15/12/29 15:27:53 INFO HttpServer: Starting HTTP Server
15/12/29 15:27:53 INFO Utils: Successfully started service 'HTTP file server' on port 55127.
15/12/29 15:27:53 INFO SparkEnv: Registering OutputCommitCoordinator
15/12/29 15:27:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/12/29 15:27:53 INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
15/12/29 15:27:54 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/12/29 15:27:54 INFO AppClient$ClientEndpoint: Connecting to master spark://127.0.0.1:7077...
15/12/29 15:27:54 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster@127.0.0.1:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
15/12/29 15:28:14 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1f22aef0 rejected from java.util.concurrent.ThreadPoolExecutor@176cb4af[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
So something is going wrong during creation of the SparkContext.
When I look in $DSE_HOME/logs/spark, it is empty. I am not sure where else to look.
Answer 0 (score: 2)
It turned out the problem was the Spark library version and the Scala version. DataStax was running Spark 1.4.1 with Scala 2.10.5, while my Eclipse project was using 1.5.2 and 2.11.7 respectively.
Note that it appears both the Spark library version and the Scala version must match. I tried other combinations, and it only worked when both matched.
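As a rough sketch of what the fix looks like if you manage the project with sbt, a build pinned to the versions the DSE cluster actually runs might resemble the following (the Cassandra connector line is an assumption; check which artifacts and versions ship with your DSE release):

scalaVersion := "2.10.5"  // must match the Scala version DSE's Spark build uses

libraryDependencies ++= Seq(
  // Spark itself is "provided" because the DSE cluster supplies it at runtime
  "org.apache.spark" %% "spark-core" % "1.4.1" % "provided",
  // hypothetical connector dependency -- verify against your DSE release
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.4.1"
)

With %% sbt appends the Scala binary version (_2.10) to the artifact name, so a mismatch like the 2.11.7-vs-2.10.5 one above fails at dependency resolution instead of at runtime.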
Answer 1 (score: 0)
I am very familiar with this part of the error you posted:
WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://...
It can have many causes, almost all related to misconfigured IPs. First I would do what zero323 said, and then here are my two cents: I recently solved my own issues by using IP addresses rather than hostnames, and the only configuration I use in a simple standalone cluster is SPARK_MASTER_IP.
Setting SPARK_MASTER_IP in $SPARK_HOME/conf/spark-env.sh on the master then yields a master URL of spark://your.ip.address.numbers:7077, which your SparkConf setup can reference.
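As a minimal sketch, assuming 10.0.1.88 (the driver address in the log above) stands in for your machine's real IP: with SPARK_MASTER_IP=10.0.1.88 in the master's spark-env.sh, the driver-side configuration would reference the same address:

import org.apache.spark.{SparkConf, SparkContext}

// Use the same IP the master was started with, not 127.0.0.1 or a hostname.
val conf = new SparkConf()
  .setAppName("Start")
  .setMaster("spark://10.0.1.88:7077")
val sc = new SparkContext(conf)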
That said, I am not familiar with your specific setup, but I noticed this path appears twice in the error output:
/private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/
Did you look there to see if there is a log directory? Is that where $DSE_HOME points? Alternatively, connect to the driver where it creates the web UI:
INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
You should see a link to the error logs somewhere there.
For more on the IP vs. hostname question, this very old bug is marked as Resolved, but I have not figured out what Resolved means there, so I just lean toward IP addresses.