java.io.FileNotFoundException for a file sent with spark-submit --files

Asked: 2019-01-11 11:20:44

Tags: apache-spark

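My Spark application has a properties file that I need during initialization, for example to set up database connections and other business logic. When I submit the Spark job in cluster mode, I can see that the file is uploaded, but when I check whether the file exists I get false, and the file cannot be found during initialization: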

spark2-submit \
--class "com.packageName.MyApp" \
--files MyProject/config/configFile.properties  \
--master yarn --num-executors 2 \
--executor-cores 2 --deploy-mode cluster \
myapp-assembly-0.1.jar configFile.properties

In the log I see:

19/01/11 10:21:15 INFO yarn.Client: Uploading resource file:/home/dexter/MyProject/lib/myapp-assembly-0.1.jar -> hdfs://XXXXXXX.com:8020/user/dexter/.sparkStaging/application_1541792367360_580444/myapp-assembly-0.1.jar
19/01/11 10:21:19 INFO yarn.Client: Uploading resource file:/home/dexter/MyProject/config/configFile.properties -> hdfs://XXXXXXX.com:8020/user/dexter/.sparkStaging/application_1541792367360_580444/configFile.properties

And in the code I initialize from the file like this:

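import java.io.{File, FileInputStream}
import java.util.Properties
import org.apache.spark.SparkFiles

val configFileSpark = SparkFiles.get(args(0))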
println(configFileSpark)  
// /vol10/yarn/nm/usercache/dexter/appcache/application_1541792367360_580444/spark-3dec2688-a749-44eb-a7d6-ecded2ec5111/userFiles-c6ed268c-e847-4ffd-a5cf-f7956357ac4f/configFile.properties

val configFile = new File(configFileSpark)
println("File exists: " + configFile.exists())    
// false

val props = new Properties();
props.load(new FileInputStream(configFile.getAbsolutePath()));
// java.io.FileNotFoundException: /vol10/yarn/nm/usercache/dexter/appcache/application_1541792367360_580444/spark-3dec2688-a749-44eb-a7d6-ecded2ec5111/userFiles-c6ed268c-e847-4ffd-a5cf-f7956357ac4f/configFile.properties (No such file or directory)

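I'm really confused about how to get hold of this file and use it for initialization. Is there any solution other than uploading the properties file to HDFS?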

1 Answer:

Answer 0 (score: 0)

The --files argument does not work with --deploy-mode "client" (which is the default mode), but it does work with --deploy-mode "cluster".
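
As a possible workaround for the cluster-mode case in the question, here is a minimal sketch rather than a confirmed fix. It assumes that on YARN the files passed with --files are placed in the working directory of the container, so the bare file name (configFile.properties, i.e. args(0)) may resolve even when the path returned by SparkFiles.get does not exist. ConfigLoader, firstExisting and load are hypothetical names introduced only for this illustration; the helper tries a list of candidate paths and loads the first one that exists.

```scala
import java.io.{File, FileInputStream, FileNotFoundException}
import java.util.Properties

// Hypothetical helper, not part of Spark: tries several candidate
// locations for the properties file and loads the first one found.
object ConfigLoader {

  // Returns the first candidate path that exists as a regular file, if any.
  def firstExisting(candidates: Seq[String]): Option[File] =
    candidates.map(new File(_)).find(_.isFile)

  // Loads the properties from the first existing candidate path, or
  // throws FileNotFoundException listing every path that was tried.
  def load(candidates: Seq[String]): Properties = {
    val file = firstExisting(candidates).getOrElse(
      throw new FileNotFoundException(
        s"None of these paths exist: ${candidates.mkString(", ")}"))
    val in = new FileInputStream(file)
    try {
      val props = new Properties()
      props.load(in)
      props
    } finally {
      in.close()
    }
  }
}
```

In the driver this could be called as ConfigLoader.load(Seq(args(0), SparkFiles.get(args(0)))), which tries the bare name relative to the container working directory first (the assumption stated above) and only then the SparkFiles path. Listing the tried paths in the exception message makes it easier to see from the YARN logs where the file was expected.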