Question

I want to pass a key-value pair to my Spark application, like this:

mapreduce.input.fileinputformat.input.dir.recursive=true

I know I can do this from code as follows:

sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive","true")

But I would like to be able to pass this property at runtime through spark-submit. Is that possible?

Answer 1

Absolutely!

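spark-submit (and spark-shell) supports the --conf PROP=VALUE and --properties-file FILE options, which let you specify arbitrary configuration options like this one. The key needs a spark. prefix when passed this way, because spark-submit ignores properties that do not start with spark. You can then read the passed value with SparkConf's .get function:

import org.apache.spark.SparkConf

val conf = new SparkConf()
// Read the value passed on the command line (note the spark. prefix)
val mrRecursive = conf.get("spark.mapreduce.input.fileinputformat.input.dir.recursive")
// Hadoop expects the key without the spark. prefix, as in the question
sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive", mrRecursive)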

From spark-submit/spark-shell --help:

--conf PROP=VALUE          Arbitrary Spark configuration property.
--properties-file FILE     Path to a file from which to load extra properties. If not specified, this will look for conf/spark-defaults.conf.
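
As a rough sketch of how these two options might be used with the conf.get call above: the block below writes a properties file and a small wrapper script that shows both ways of passing the same property. The file names, com.example.MyApp, and my-app.jar are hypothetical placeholders, not part of the original answer.

```shell
# Properties file in the same style as spark-defaults.conf:
# key and value separated by whitespace, one pair per line.
cat > recursive-input.properties <<'EOF'
spark.mapreduce.input.fileinputformat.input.dir.recursive true
EOF

# Wrapper script with two alternative invocations.
# com.example.MyApp and my-app.jar stand in for your own application.
cat > submit-recursive.sh <<'EOF'
#!/usr/bin/env bash
# Option 1: pass the property inline with --conf
spark-submit \
  --conf spark.mapreduce.input.fileinputformat.input.dir.recursive=true \
  --class com.example.MyApp \
  my-app.jar

# Option 2: load the same property from the properties file
spark-submit \
  --properties-file recursive-input.properties \
  --class com.example.MyApp \
  my-app.jar
EOF

# Check the wrapper for syntax errors without running it.
bash -n submit-recursive.sh
```

Per the help text, passing --properties-file means conf/spark-defaults.conf is not consulted, so any defaults you rely on would need to be copied into the custom file.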

Spark documentation on [dynamically] loading properties: https://spark.apache.org/docs/latest/configuration.html
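
The configuration page linked above also describes a spark.hadoop. prefix: a property passed as spark.hadoop.abc.def=xyz is added to the Hadoop configuration as abc.def=xyz. Assuming a Spark version that supports this, the setting from the question could be passed without any sc.hadoopConfiguration.set call in the application. A minimal sketch, with a hypothetical class and jar name:

```shell
# Wrapper script; com.example.MyApp and my-app.jar are placeholders.
cat > submit-hadoop-prefix.sh <<'EOF'
#!/usr/bin/env bash
# Spark copies spark.hadoop.* properties into the Hadoop Configuration,
# dropping the "spark.hadoop." prefix along the way.
spark-submit \
  --conf spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true \
  --class com.example.MyApp \
  my-app.jar
EOF

# Check the wrapper for syntax errors without running it.
bash -n submit-hadoop-prefix.sh
```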

Answer 2

This approach works without modifying any code.

The Hadoop Configuration class reads the file "core-default.xml" when it is created, as described here: https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/conf/Configuration.html

If you put the values in "core-default.xml" and add the directory containing that file to the classpath with the spark-submit "--driver-class-path" parameter, it can work.
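
A rough sketch of this approach follows. It assumes the override goes in core-site.xml, which the linked Javadoc lists as the site-specific resource loaded from the classpath after core-default.xml; the answer itself names core-default.xml. The hadoop-conf directory, the class name, and the jar name are hypothetical.

```shell
# Directory that will be placed on the driver's classpath.
mkdir -p hadoop-conf

# Hadoop-style XML resource carrying the property from the question.
cat > hadoop-conf/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.input.fileinputformat.input.dir.recursive</name>
    <value>true</value>
  </property>
</configuration>
EOF

# Wrapper script; com.example.MyApp and my-app.jar are placeholders.
cat > submit-with-conf-dir.sh <<'EOF'
#!/usr/bin/env bash
# Pass the directory (not the XML file itself) to --driver-class-path.
spark-submit \
  --driver-class-path hadoop-conf \
  --class com.example.MyApp \
  my-app.jar
EOF

# Check the wrapper for syntax errors without running it.
bash -n submit-with-conf-dir.sh
```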

Add some Hadoop configuration to a Spark application at runtime (through spark-submit)?

2 answers: