How do I solve this? What can I do?
Environment
R:version 3.3.1
RStudio:Version 0.99.902
sparkR:Version 1.6.1
mac:Version 10.11.6
Code
SPARK_HOME <- "/usr/local/Cellar/apache-spark/1.6.1/libexec"
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
.libPaths(c(file.path(SPARK_HOME, "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local[3]", sparkHome=SPARK_HOME,
                  sparkEnvir=list(spark.driver.memory="6g",
                                  sparkPackages="com.databricks:spark-csv_2.10:1.4.0"))
sqlContext <- sparkRSQL.init(sc)
WARN
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Code
df <- read.df(sqlContext, "iris.csv", source="com.databricks.spark.csv", inferSchema="true")
WARN
WARN : Your hostname, xxxx-no-MacBook-Pro.local resolves to a loopback/non-reachable address: fe80:0:0:0:701f:d8ff:fe34:fd1%8, but we couldn't find any external IP address!
ERROR
ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
WARN
16/07/20 14:00:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
ERROR
16/07/20 14:00:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
16/07/20 14:00:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/07/20 14:00:44 INFO TaskSchedulerImpl: Cancelling stage 0
16/07/20 14:00:44 INFO DAGScheduler: ResultStage 0 (first at CsvRelation.scala:267) failed in 60.099 s
16/07/20 14:00:44 INFO DAGScheduler: Job 0 failed: first at CsvRelation.scala:267, took 60.168711 s
16/07/20 14:00:44 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
I don't know how to fix this. Please tell me what to do.
Answer 0 (score: 0)
Try this:
Sys.setenv(SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.1/libexec")
Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.4.0" "sparkr-shell"')
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R","lib")))
sc <- sparkR.init(master="local", sparkEnvir = list(spark.driver.memory="4g", spark.executor.memory="6g"))
sqlContext <- sparkRSQL.init(sc)
It worked for me.
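If the connect timeout persists, one more thing that may be worth trying is pinning Spark to the loopback address. This is only a sketch under an assumption: that the timeout is related to the hostname warning in the question ("resolves to a loopback/non-reachable address"), which has not been confirmed here. `SPARK_LOCAL_IP` is an environment variable that Spark reads at startup; the value `127.0.0.1` is an assumption that suits a single-machine `local` master only.

```r
# Assumption: the SocketTimeoutException is caused by the hostname resolving
# to a non-reachable address, as the WARN line in the question suggests.
# Set this BEFORE calling sparkR.init(), so the JVM launched by SparkR
# inherits it and binds to the loopback address.
Sys.setenv(SPARK_LOCAL_IP = "127.0.0.1")

# Confirm the variable is visible to the current R session.
print(Sys.getenv("SPARK_LOCAL_IP"))
```

After that, run the `Sys.setenv(SPARK_HOME=...)`, `library(SparkR, ...)` and `sparkR.init(...)` lines from the answer in the same R session, then retry `read.df`.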