I'm trying to use Typesafe Config to support an external configuration file for my Spark application.
I load the application.conf file in my application code (on the driver) like this:
val config = ConfigFactory.load()
val myProp = config.getString("app.property")
val df = spark.read.avro(myProp)
application.conf looks like this:
app.property="some value"
The spark-submit command is run like this:
spark-submit \
--class com.myapp.Main \
--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.minExecutors=56 \
--conf spark.dynamicAllocation.maxExecutors=1000 \
--driver-class-path $HOME/conf/*.conf \
--files $HOME/conf/application.conf \
my-app-0.0.1-SNAPSHOT.jar
It doesn't seem to work, and I get:
Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'app'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:124)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:147)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:159)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:164)
at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:206)
at com.paypal.cfs.fpti.Main$.main(Main.scala:42)
at com.paypal.cfs.fpti.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Looking at the logs I can see that "--files" works; it looks like a classpath problem...
18/03/13 01:08:30 INFO SparkContext: Added file file:/home/user/conf/application.conf at file:/home/user/conf/application.conf with timestamp 1520928510820
18/03/13 01:08:30 INFO Utils: Copying /home/user/conf/application.conf to /tmp/spark-2938fde1-fa4a-47af-8dc6-1c54b5e89d48/userFiles-c2cec57f-18c8-491d-8679-df7e7da45e05/application.conf
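One way to double-check (just a debugging sketch, not something from the logs above) is to verify from inside the driver whether application.conf is actually visible on its classpath, since ConfigFactory.load() resolves it through the class loader:

// Debugging sketch: print the driver JVM classpath and try to resolve
// application.conf as a classpath resource; null means it is not on the classpath.
println(System.getProperty("java.class.path"))
println(getClass.getClassLoader.getResource("application.conf"))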
Answer 0 (score: 0)
To specify the configuration file path, you can pass it as an application argument and then read it from the args variable of your main class.
Here is how to run the spark-submit command. Note that the configuration file is specified after the application jar.
spark-submit \
--class com.myapp.Main \
--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.minExecutors=56 \
--conf spark.dynamicAllocation.maxExecutors=1000 \
my-app-0.0.1-SNAPSHOT.jar $HOME/conf/application.conf
Then, load the configuration file from args(0):
import com.typesafe.config.ConfigFactory
import java.io.File
[...]
val dbconfig = ConfigFactory.parseFile(new File(args(0)))
Now you can access the properties of the application.conf file:
val myProp = dbconfig.getString("app.property")
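Putting it together, a minimal end-to-end sketch of the main class could look like this (a sketch under assumptions: the object and app names are placeholders, and spark.read.format("avro") is used in place of the question's spark.read.avro, assuming an Avro data source is on the classpath):

import java.io.File
import com.typesafe.config.ConfigFactory
import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    // args(0) is the config path passed after the application jar in spark-submit
    val config = ConfigFactory.parseFile(new File(args(0)))
    val myProp = config.getString("app.property")

    val spark = SparkSession.builder().appName("my-app").getOrCreate()
    // read the Avro input whose path comes from the external config
    val df = spark.read.format("avro").load(myProp)
    df.show()
  }
}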
Hope this helps.
Answer 1 (score: 0)
Turns out I was close to the answer right from the start... this is what worked for me:
spark-submit \
--class com.myapp.Main \
--conf spark.shuffle.service.enabled=true \
--conf spark.dynamicAllocation.enabled=true \
--conf spark.dynamicAllocation.minExecutors=56 \
--conf spark.dynamicAllocation.maxExecutors=1000 \
--driver-class-path $APP_HOME/conf \
--files $APP_HOME/conf/application.conf \
$APP_HOME/my-app-0.0.1-SNAPSHOT.jar
Then $APP_HOME contains the following:
conf/application.conf
my-app-0.0.1-SNAPSHOT.jar
I think you need to make sure application.conf is placed inside a folder, that's the trick.
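Presumably this works because classpath entries have to be directories or jars, not individual files: --driver-class-path $APP_HOME/conf puts the conf folder on the driver classpath, --files ships the file to the cluster, and the original loading code can then resolve application.conf as a classpath resource. For reference, a minimal sketch of the driver side under that assumption:

import com.typesafe.config.ConfigFactory

// With $APP_HOME/conf on the driver classpath via --driver-class-path,
// application.conf is found by the default loader as a classpath resource.
val config = ConfigFactory.load()
val myProp = config.getString("app.property")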