How to connect Cassandra with Spark using Scala on Windows

Time: 2015-08-04 17:23:38

Tags: windows scala cassandra apache-spark

I am trying to connect Spark and Cassandra using Scala, as described at http://www.planetcassandra.org/blog/kindling-an-introduction-to-spark-with-cassandra/ I get an error at the step under the heading:

"Loading the Connector into the Spark Shell:"

val test_spark_rdd = sc.cassandraTable("test_spark", "test")

test_spark_rdd.first

Running the commands above produces this error:

Exception in task 0.0 in stage 0.0 (TID 0) java.lang.NullPointerException

I have uploaded the full stack trace here:

https://docs.google.com/document/d/1UjGXKifD6chq7-WrHd3GT3LoNcw8GawxAPeOtiEjKvM/edit?usp=sharing

Some of the rpc settings in my cassandra.yaml file:

rpc_address: localhost 
# rpc_interface: eth1 
# rpc_interface_prefer_ipv6: false 
# port for Thrift to listen for clients on 
rpc_port: 9160 

My spark-defaults configuration file:

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
#spark.serializer                 org.apache.spark.serializer.KryoSerializer
#spark.driver.memory              5g
#spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.cassandra.connection.host localhost

1 Answer:

Answer 0 (score: 0)

15/08/04 21:24:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
    java.lang.NullPointerException
            at java.lang.ProcessBuilder.start(Unknown Source)
            at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
            at org.apache.hadoop.util.Shell.run(Shell.java:418)
            at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
            at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
            at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)

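The trace shows the failure inside Hadoop's FileUtil.chmod, which launches an external command through ProcessBuilder. So it looks like the underlying forked executor process either cannot start that command or cannot perform some operation on the local file system. Make sure the executor process has access to the default Spark directory.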
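To check the file-system side of this, a minimal sketch like the one below can be pasted into the Scala REPL or spark-shell. It rests on assumptions that are not in the question: that Spark falls back to java.io.tmpdir for scratch space when spark.local.dir is unset, and that Hadoop on Windows looks for winutils.exe under HADOOP_HOME (or the hadoop.home.dir system property) when it runs chmod. A missing winutils.exe is a commonly reported cause of this kind of trace, but the stack trace alone does not confirm it. The helper name checkWritableDir is hypothetical.

```scala
import java.io.IOException
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper: reports whether a directory exists and accepts new files.
def checkWritableDir(dir: Path): Either[String, Path] = {
  if (!Files.isDirectory(dir)) {
    Left(s"$dir is not a directory")
  } else {
    try {
      // Create and immediately delete a probe file to prove write access.
      val probe = Files.createTempFile(dir, "spark-probe", ".tmp")
      Files.delete(probe)
      Right(dir)
    } catch {
      case e: IOException => Left(s"cannot write to $dir: ${e.getMessage}")
    }
  }
}

// Assumption: with spark.local.dir unset, Spark uses java.io.tmpdir for scratch files.
val scratchDir = Paths.get(System.getProperty("java.io.tmpdir"))
println(s"scratch dir check: ${checkWritableDir(scratchDir)}")

// Assumption: Hadoop on Windows resolves winutils.exe from HADOOP_HOME
// or from the hadoop.home.dir system property.
val hadoopHome = sys.env.get("HADOOP_HOME").orElse(sys.props.get("hadoop.home.dir"))
val winutils = hadoopHome.map(home => Paths.get(home, "bin", "winutils.exe"))
println(
  winutils
    .map(path => s"$path exists: ${Files.exists(path)}")
    .getOrElse("neither HADOOP_HOME nor hadoop.home.dir is set")
)
```

If the first check returns a Left, the directory permissions are the place to look. If the second line reports that nothing is set or that the file does not exist, the chmod call may have no executable to run, which would be worth ruling out before changing anything in the Cassandra or connector settings.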