Accessing Spark locally or externally via R

Time: 2019-05-05 16:19:27

Tags: apache-spark sparklyr

I have installed Spark under Windows and can start spark-shell and run some Scala code in that shell (see also here). How can I now access this Spark environment from outside, for example via sparklyr or Python?

I ran:

spark-class org.apache.spark.deploy.master.Master

and can now access:

http://localhost:8080/
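As far as I understand, port 8080 is only the standalone master's web UI; the master itself accepts cluster connections on a spark:// URL (port 7077 by default), which seems to be what sparklyr expects. A minimal sketch of what I assume the connection should look like (the SPARK_HOME path is just my install location):

library(sparklyr)

# Assumption: this is where the Spark distribution was unpacked.
Sys.setenv(SPARK_HOME = "C:/spark-2.4.2-bin-hadoop2.7")

# Connect to the standalone master via its spark:// URL (default port 7077),
# not the HTTP port of the web UI.
sc <- spark_connect(master = "spark://localhost:7077")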

However, if I run:

library(sparklyr)
sc <- spark_connect(master = "http://localhost:8080/")

I get:

To run Spark on Windows you need a copy of Hadoop winutils.exe:

1. Download Hadoop winutils.exe from:

   https://github.com/steveloughran/winutils/raw/master/hadoop-2.6.0/bin/

2. Copy winutils.exe to C:\spark-2.4.2-bin-hadoop2.7\tmp\hadoop\bin

Alternatively, if you are using RStudio you can install the RStudio Preview Release,
which includes an embedded copy of Hadoop winutils.exe:

  https://www.rstudio.com/products/rstudio/download/preview/


Traceback:

1. spark_connect(master = "http://localhost:8080/")
2. shell_connection(master = master, spark_home = spark_home, app_name = app_name, 
 .     version = version, hadoop_version = hadoop_version, shell_args = shell_args, 
 .     config = config, service = spark_config_value(config, "sparklyr.gateway.service", 
 .         FALSE), remote = spark_config_value(config, "sparklyr.gateway.remote", 
 .         spark_master_is_yarn_cluster(master)), extensions = extensions)
3. prepare_windows_environment(spark_home, environment)
4. stop_with_winutils_error(hadoopBinPath)
5. stop("\n\n", "To run Spark on Windows you need a copy of Hadoop winutils.exe:", 
 .     "\n\n", "1. Download Hadoop winutils.exe from:", "\n\n", 
 .     paste("  ", winutilsDownload), "\n\n", paste("2. Copy winutils.exe to", 
 .         hadoopBinPath), "\n\n", "Alternatively, if you are using RStudio you can install the RStudio Preview Release,\n", 
 .     "which includes an embedded copy of Hadoop winutils.exe:\n\n", 
 .     "  https://www.rstudio.com/products/rstudio/download/preview/", 
 .     "\n\n", call. = FALSE)
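If the message simply means that winutils.exe is missing from that exact folder, the two steps it lists could presumably be scripted from R as well; a sketch, assuming the linked directory actually exposes a winutils.exe file:

# Assumptions: SPARK_HOME as in the error message, and the binary lives at
# .../hadoop-2.6.0/bin/winutils.exe inside the linked repository.
hadoop_bin <- "C:/spark-2.4.2-bin-hadoop2.7/tmp/hadoop/bin"
dir.create(hadoop_bin, recursive = TRUE, showWarnings = FALSE)
download.file(
  "https://github.com/steveloughran/winutils/raw/master/hadoop-2.6.0/bin/winutils.exe",
  destfile = file.path(hadoop_bin, "winutils.exe"),
  mode = "wb"  # binary mode, so the .exe is not corrupted
)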

However, I did install winutils.exe when I installed Spark. How can I run some basic hello world example (e.g., in R) against http://localhost:8080/?
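Something along these lines is what I mean by a hello world; the sketch below assumes local mode works once winutils is picked up, and that only the master URL would need to change for the cluster case:

library(sparklyr)
library(dplyr)

# Local mode for a first test; presumably "spark://localhost:7077" (or the
# machine's address) would target the standalone master started above.
sc <- spark_connect(master = "local")

# Copy a built-in data frame into Spark and run a trivial aggregation.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars_spark", overwrite = TRUE)
mtcars_tbl %>% count(cyl)

spark_disconnect(sc)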

PS:

I also tried running the following, in succession:

spark-class org.apache.spark.deploy.master.Master
spark-class org.apache.spark.deploy.worker.Worker spark://10.0.20.67:7077
spark-shell --master spark://10.0.20.67:7077

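My assumption is that the sparklyr equivalent of that last spark-shell call is to point spark_connect at the same master URL; a sketch (the worker from the previous step would need to have registered with the master):

library(sparklyr)

# Assumption: the standalone master/worker above are running and reachable
# at spark://10.0.20.67:7077, with SPARK_HOME pointing at the install directory.
sc <- spark_connect(
  master     = "spark://10.0.20.67:7077",
  spark_home = "C:/spark-2.4.2-bin-hadoop2.7"
)

sdf_len(sc, 10)   # trivial check: a 10-row Spark data frame
spark_disconnect(sc)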

0 Answers:

No answers