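I am using Spark 2.4.1 and Java 8. I am trying to load external properties files when I submit my Spark job with spark-submit.
I am using the Typesafe Config library below to load my properties file.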
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.3.1</version>
In my Spark driver class MyDriver.java, I load the YML file as follows:
String ymlFilename = args[1].toString();
Optional<QueryEntities> entities = InputYamlProcessor.process(ymlFilename);
All of the code, including InputYamlProcessor.java, is here:
https://gist.github.com/BdLearnerr/e4c47c5f1dded951b18844b278ea3441
This works fine in my local environment, but when I run it on the cluster I get an error.
Error:
Can't construct a java object for tag:yaml.org,2002:com.snp.yml.QueryEntities; exception=Class not found: com.snp.yml.QueryEntities
in 'reader', line 1, column 1:
entities:
^
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:345)
at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:127)
at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:450)
at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:444)
at com.snp.yml.InputYamlProcessor.process(InputYamlProcessor.java:62)
Caused by: org.yaml.snakeyaml.error.YAMLException: Class not found: com.snp.yml.QueryEntities
at org.yaml.snakeyaml.constructor.Constructor.getClassForNode(Constructor.java:650)
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.getConstructor(Constructor.java:331)
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:341)
... 12 more
My spark-submit script is:
$SPARK_HOME/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--name MyDriver \
--jars "/local/jars/*.jar" \
--files hdfs://files/application-cloud-dev.properties,hdfs://files/column_family_condition.yml \
--class com.sp.MyDriver \
--executor-cores 3 \
--executor-memory 9g \
--num-executors 5 \
--driver-cores 2 \
--driver-memory 4g \
--driver-java-options -Dconfig.file=./application-cloud-dev.properties \
--conf spark.executor.extraJavaOptions=-Dconfig.file=./application-cloud-dev.properties \
--conf spark.driver.extraClassPath=. \
--driver-class-path . \
ca-datamigration-0.0.1.jar application-cloud-dev.properties column_family_condition.yml
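What am I doing wrong here? How can I fix this? Any fix is highly appreciated.
Tested:
Just before the failing line shown above, I added something like the following to check whether the class is really not found.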
public static void printTest() {
QueryEntity e1 = new QueryEntity();
e1.setTableName("tab1");
List<QueryEntity> li = new ArrayList<QueryEntity>();
li.add(e1);
QueryEntities ll = new QueryEntities();
ll.setEntitiesList(li);
ll.getEntitiesList().stream().forEach(e -> logger.error("e1 Name :" + e.getTableName()));
return;
}
Output:
19/09/18 04:40:33 ERROR yml.InputYamlProcessor: e1 Name :tab1
Can't construct a java object for tag:yaml.org,2002:com.snp.helpers.QueryEntities; exception=Class not found: com.snp.helpers.QueryEntities
in 'reader', line 1, column 1:
entitiesList:
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:345)
What is wrong here?
Answer 0 (score: 1)
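This has nothing to do with the QueryEntities class itself. The "YAMLException: Class not found: com.snp.yml.QueryEntities" error is a problem with how the YAML Constructor is created. The test in the question shows that QueryEntities can be instantiated directly, so the class is on the classpath; what fails is SnakeYAML looking it up by name. Passing the class loader that loaded QueryEntities to SnakeYAML lets it resolve the class.
Change from
Constructor constructor = new Constructor(com.snp.helpers.QueryEntities.class);
Yaml yaml = new Yaml(constructor);
to
Yaml yaml = new Yaml(new CustomClassLoaderConstructor(com.snp.helpers.QueryEntities.class.getClassLoader()));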
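To illustrate why handing over an explicit class loader can matter, here is a minimal JDK-only sketch. It does not use Spark or SnakeYAML, and the class name ClassLoaderLookupDemo and the helper canLoad are made up for this example. It only shows the general mechanism: whether a class can be found by name depends on which class loader is asked. That this is exactly what happens on the cluster is an assumption; the stack trace alone does not prove it.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Stand-alone illustration (JDK only); all names here are hypothetical.
class ClassLoaderLookupDemo {

    // Returns true if `loader` can resolve `className` by name, which is what
    // a reflection-based library must do when it only knows the class name.
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String name = ClassLoaderLookupDemo.class.getName();

        // A loader with no URLs and no parent sees only JDK bootstrap classes,
        // so it cannot find an application class by name.
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            System.out.println("isolated loader finds it: " + canLoad(name, isolated));
        }

        // The loader that actually loaded the class can always resolve it.
        ClassLoader own = ClassLoaderLookupDemo.class.getClassLoader();
        System.out.println("own loader finds it: " + canLoad(name, own));
    }
}
```

The first lookup fails and the second succeeds, even though the class is clearly loaded and usable in the running program. This mirrors the situation in the question, where new QueryEntities() works but the by-name lookup does not.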