I am trying to connect Spark and Cassandra using Scala, as described here: http://www.planetcassandra.org/blog/kindling-an-introduction-to-spark-with-cassandra/
I am getting an error at the step under the heading:
"Loading the Connector into the Spark Shell:"
val test_spark_rdd = sc.cassandraTable("test_spark", "test")
test_spark_rdd.first
Those are the commands (the ones shown in bold) that I am running.
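For context, the full sequence I am following in spark-shell looks roughly like this (a sketch of my understanding of the tutorial; the connector jar path is only a placeholder):

// launched as: bin/spark-shell --jars /path/to/spark-cassandra-connector-assembly.jar
import com.datastax.spark.connector._            // adds cassandraTable to the SparkContext

val test_spark_rdd = sc.cassandraTable("test_spark", "test")   // keyspace "test_spark", table "test"
test_spark_rdd.first                                           // this is the call that fails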
When I run test_spark_rdd.first, it shows the error: Exception in task 0.0 in stage 0.0 (TID 0) java.lang.NullPointerException
I have uploaded the full stack trace here:
https://docs.google.com/document/d/1UjGXKifD6chq7-WrHd3GT3LoNcw8GawxAPeOtiEjKvM/edit?usp=sharing
Some of the rpc settings in my cassandra.yaml file are:
rpc_address: localhost
# rpc_interface: eth1
# rpc_interface_prefer_ipv6: false
# port for Thrift to listen for clients on
rpc_port: 9160
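As a sanity check for connectivity (a sketch based on my reading of the connector API, not a step from the tutorial), I understand the connection itself can be exercised directly from the shell:

import com.datastax.spark.connector.cql.CassandraConnector

// opens a session against spark.cassandra.connection.host and runs a trivial query
CassandraConnector(sc.getConf).withSessionDo { session =>
  session.execute("SELECT now() FROM system.local")
}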
My spark-defaults configuration file:
# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
#spark.serializer org.apache.spark.serializer.KryoSerializer
#spark.driver.memory 5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.cassandra.connection.host localhost
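To confirm that spark-defaults.conf is actually picked up by the shell, I believe the effective value can be read back from the running context (a sketch; I have not verified this is the canonical way):

// inside spark-shell, where sc is the predefined SparkContext
sc.getConf.getOption("spark.cassandra.connection.host")   // expected: Some("localhost")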
Answer (score: 0):
15/08/04 21:24:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(Unknown Source)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
It looks like the problem is that the underlying forked executor process could not be started, or could not perform some operation against the local filesystem. Make sure the executor process has access to the default Spark directories.
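One way to test that theory (a sketch, not a confirmed fix): point Spark's scratch space at a directory the executor user can definitely write to, then re-run the same commands. The /tmp/spark-scratch path below is only a placeholder.

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// in spark-shell, stop the predefined context first so a new one can be created:
// sc.stop()

// hypothetical example: substitute any directory the executor user can create files in
val conf = new SparkConf(true)
  .set("spark.local.dir", "/tmp/spark-scratch")
  .set("spark.cassandra.connection.host", "localhost")
val sc = new SparkContext(conf)

sc.cassandraTable("test_spark", "test").first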