I have installed Spark on Windows and can run spark-shell and execute some Scala code in that shell (see also here). How can I now access this Spark environment from outside, e.g. via sparklyr or Python?
I ran:
spark-class org.apache.spark.deploy.master.Master
and can now access:
http://localhost:8080/
However, if I run:
library(sparklyr)
sc <- spark_connect(master = "http://localhost:8080/")
I get:
To run Spark on Windows you need a copy of Hadoop winutils.exe:
1. Download Hadoop winutils.exe from:
https://github.com/steveloughran/winutils/raw/master/hadoop-2.6.0/bin/
2. Copy winutils.exe to C:\spark-2.4.2-bin-hadoop2.7\tmp\hadoop\bin
Alternatively, if you are using RStudio you can install the RStudio Preview Release,
which includes an embedded copy of Hadoop winutils.exe:
https://www.rstudio.com/products/rstudio/download/preview/
Traceback:
1. spark_connect(master = "http://localhost:8080/")
2. shell_connection(master = master, spark_home = spark_home, app_name = app_name,
. version = version, hadoop_version = hadoop_version, shell_args = shell_args,
. config = config, service = spark_config_value(config, "sparklyr.gateway.service",
. FALSE), remote = spark_config_value(config, "sparklyr.gateway.remote",
. spark_master_is_yarn_cluster(master)), extensions = extensions)
3. prepare_windows_environment(spark_home, environment)
4. stop_with_winutils_error(hadoopBinPath)
5. stop("\n\n", "To run Spark on Windows you need a copy of Hadoop winutils.exe:",
. "\n\n", "1. Download Hadoop winutils.exe from:", "\n\n",
. paste(" ", winutilsDownload), "\n\n", paste("2. Copy winutils.exe to",
. hadoopBinPath), "\n\n", "Alternatively, if you are using RStudio you can install the RStudio Preview Release,\n",
. "which includes an embedded copy of Hadoop winutils.exe:\n\n",
. " https://www.rstudio.com/products/rstudio/download/preview/",
. "\n\n", call. = FALSE)
I installed winutils.exe when I installed Spark. How can I run a basic hello-world example (e.g., in R) against http://localhost:8080/? Something like the sketch below is what I have in mind.
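This is only a sketch of what I am after; the spark://localhost:7077 master URL and the HADOOP_HOME path are assumptions on my part (7077 is the default port of a standalone master, and the path is the one the error message above expects winutils.exe in):

library(sparklyr)

# Assumption: point HADOOP_HOME at the directory containing bin\winutils.exe,
# so that sparklyr's Windows check finds the copy I already have
Sys.setenv(HADOOP_HOME = "C:\\spark-2.4.2-bin-hadoop2.7\\tmp\\hadoop")

# Assumption: the master URL is the spark:// address shown at the top of the
# web UI at http://localhost:8080/, not the http:// address of the UI itself
sc <- spark_connect(master = "spark://localhost:7077",
                    spark_home = "C:\\spark-2.4.2-bin-hadoop2.7")

iris_tbl <- copy_to(sc, iris)  # copy a local data frame into Spark
head(iris_tbl)                 # read a few rows back as a hello-world check
spark_disconnect(sc)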
PS: I also tried running the following, in succession:
spark-class org.apache.spark.deploy.master.Master
spark-class org.apache.spark.deploy.worker.Worker spark://10.0.20.67:7077
spark-shell --master spark://10.0.20.67:7077
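Since spark-shell can connect with that URL, I assume the same spark:// address should also work from sparklyr, roughly like this (again only a sketch, reusing the IP from the commands above):

library(sparklyr)
# Assumption: reuse the master URL that worked for spark-shell
sc <- spark_connect(master = "spark://10.0.20.67:7077",
                    spark_home = "C:\\spark-2.4.2-bin-hadoop2.7")

Is that the intended way to point sparklyr (or Python) at this setup?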