SparkR cannot load data into R

Asked: 2015-10-07 21:10:29

Tags: r apache-spark sparkr

I followed exactly the same steps as other posts (for example this one) to create a Spark DataFrame in R.

# Point SparkR at the local Spark install, the spark-csv package, and Java
Sys.setenv(SPARK_HOME = "E:/spark-1.5.0-bin-hadoop2.6")
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.2.0" "sparkr-shell"')
Sys.setenv(JAVA_HOME="C:/Program Files/Java/jre1.8.0_60")
library(rJava)
library(SparkR, lib.loc = "E:/spark-1.5.0-bin-hadoop2.6/R/lib/")
# Start a local Spark context and a SQL context on top of it
sc <- sparkR.init(master = "local", sparkHome = "E:/spark-1.5.0-bin-hadoop2.6")
sqlContext <- sparkRSQL.init(sc)
# Convert the local iris data.frame into a Spark DataFrame -- this is the step that fails
df <- createDataFrame(sqlContext, iris)
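
(For reference: the spark-csv package pulled in through SPARKR_SUBMIT_ARGS is only needed for reading CSV files; createDataFrame(sqlContext, iris) does not depend on it. A minimal sketch of how that package would typically be used once the context is up, with a placeholder CSV path that is not part of the original post:

# Hypothetical example: reading a CSV through the spark-csv source configured above.
# "E:/data/cars.csv" is a placeholder path, not from the original question.
cars <- read.df(sqlContext, "E:/data/cars.csv",
                source = "com.databricks.spark.csv",
                header = "true", inferSchema = "true")
head(cars)
)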

However, it keeps giving me the error below at the last step (the createDataFrame call):

Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost): java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(Unknown Source)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
        at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
        at org.apache.spark.util.Utils$.fetchFile(Utils.scala:381)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:405)
        at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:397)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:7
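
As an aside on the trace itself: the failing calls (org.apache.hadoop.util.Shell.runCommand and FileUtil.chmod leading into java.lang.ProcessBuilder.start) are Hadoop's shell helpers, which on Windows expect winutils.exe under %HADOOP_HOME%\bin. A minimal, unverified sketch of pointing the environment at such an install before initializing the context, with "E:/hadoop" as a hypothetical location:

# Assumption: winutils.exe has been placed in E:/hadoop/bin (hypothetical path);
# Hadoop's shell utilities on Windows look it up via HADOOP_HOME / PATH.
Sys.setenv(HADOOP_HOME = "E:/hadoop")
Sys.setenv(PATH = paste("E:/hadoop/bin", Sys.getenv("PATH"), sep = ";"))
sc <- sparkR.init(master = "local", sparkHome = "E:/spark-1.5.0-bin-hadoop2.6")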

0 Answers:

There are no answers yet