I am trying to run the simple YARN application from simple-yarn-app, but I get the following exception in the application error log.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/conf/YarnConfiguration
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
at java.lang.Class.getMethod0(Class.java:2774)
at java.lang.Class.getMethod(Class.java:1663)
at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.conf.YarnConfiguration
However, when I run the "yarn classpath" command on all of my datanodes, I see the following output:
/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-yarn/lib/*
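This includes the paths to the yarn-client and yarn-common jars, as well as the other YARN and hadoop-common jars the application needs. Can anyone point me in the right direction as to where I may have forgotten to set the correct classpath?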
Answer 0 (score: 3)
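I found that Hadoop does not resolve the $HADOOP_HOME and $YARN_HOME environment variables when iterating over the YarnConfiguration classpath property. Running the following in the YARN client prints the unresolved configuration entries, e.g.
$HADOOP_HOME/, $HADOOP_HOME/lib/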
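import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Print each configured application classpath entry exactly as the client sees it.
YarnConfiguration conf = new YarnConfiguration();
for (String c : conf.getStrings(
        YarnConfiguration.YARN_APPLICATION_CLASSPATH,
        YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH)) {
    System.out.println(c);
}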
So if you give the yarn.application.classpath property full paths instead of variable references, the NoClassDefFoundError goes away.
<property>
<description>CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries</description>
<name>yarn.application.classpath</name>
<value>
/etc/hadoop/conf,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*
</value>
</property>
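To see which variable references in a classpath list have no value in a given environment, something like the following could be used. This is only a minimal sketch: the class name ClasspathEnvCheck and the method unresolved are hypothetical helpers, not part of Hadoop, and the check only looks at an environment map you pass in, which may differ from the environment the containers actually run with.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper (not a Hadoop API).
class ClasspathEnvCheck {
    // Matches $NAME and ${NAME} style references.
    private static final Pattern VAR =
            Pattern.compile("\\$\\{?([A-Za-z_][A-Za-z0-9_]*)\\}?");

    /** Returns the variable names referenced in the entries that are absent from env. */
    static List<String> unresolved(String[] entries, Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String entry : entries) {
            Matcher m = VAR.matcher(entry);
            while (m.find()) {
                String name = m.group(1);
                if (!env.containsKey(name) && !missing.contains(name)) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample entries; in practice these would be the strings printed by the loop above.
        String[] entries = {
            "$HADOOP_HOME/*", "$HADOOP_HOME/lib/*", "$YARN_HOME/*", "/etc/hadoop/conf"
        };
        System.out.println("Unset variables: " + unresolved(entries, System.getenv()));
    }
}
```

Any name it reports is a candidate for replacing with a full path, as in the property above.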
Answer 1 (score: 3)
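The problem occurs on YARN clusters where the ResourceManager and/or NodeManager daemons were started with an incomplete application classpath. Even something as simple as launching the bundled spark-shell fails: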
user@linux$ spark-shell --master yarn-client
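Sadly, you only find out when you launch your application, or once it has run long enough to reach the missing class. To fix this, I took the output of the following classpath command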
user@linux$ yarn classpath
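cleaned it up (it contains duplicates and non-canonical entries), appended it to the YARN configuration directive below, which lives in /etc/hadoop/conf/yarn-site.xml, and finally restarted the YARN cluster daemons: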
user@linux$ sudo vi /etc/hadoop/conf/yarn-site.xml
[ ... ]
<property>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,
$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,
$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,
$HADOOP_MAPRED_HOME/lib/*,
$YARN_HOME/*,
$YARN_HOME/lib/*,
/etc/hadoop/conf,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*
</value>
</property>
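The entries above that do not reference environment variables are the ones I added. Remember to copy the modified file to every node in the YARN cluster before restarting the ResourceManager and NodeManager daemons.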
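The clean-up step (dropping duplicates and collapsing non-canonical entries such as /usr/lib/hadoop/.//*) can also be scripted rather than done by hand. The following is a rough sketch only: YarnClasspathCleaner is a made-up name, it assumes the colon-separated output format shown by "yarn classpath", and it deliberately does not resolve ".." segments or symlinks.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for tidying `yarn classpath` output (not a Hadoop API).
class YarnClasspathCleaner {
    /** Drops "." and empty path segments, e.g. "/usr/lib/hadoop/.//*" becomes "/usr/lib/hadoop/*". */
    static String normalize(String entry) {
        String trimmed = entry.trim();
        List<String> kept = new ArrayList<>();
        for (String segment : trimmed.split("/")) {
            if (!segment.isEmpty() && !segment.equals(".")) {
                kept.add(segment);
            }
        }
        String joined = String.join("/", kept);
        return trimmed.startsWith("/") ? "/" + joined : joined;
    }

    /** Converts colon-separated input into a comma-separated list without duplicates. */
    static String clean(String yarnClasspathOutput) {
        // LinkedHashSet keeps the first occurrence of each entry in its original position.
        Set<String> unique = new LinkedHashSet<>();
        for (String entry : yarnClasspathOutput.split(":")) {
            String normalized = normalize(entry);
            if (!normalized.isEmpty()) {
                unique.add(normalized);
            }
        }
        return String.join(",", unique);
    }

    public static void main(String[] args) {
        // Pass the output of `yarn classpath` as the first argument.
        System.out.println(clean(args.length > 0 ? args[0] : ""));
    }
}
```

The comma-separated result is in the form the yarn.application.classpath value expects; it still needs a manual review before it goes into yarn-site.xml.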
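In general, you need to package all dependencies (classes and modules) that are not provided by the cluster into your application archive. =:)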
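Since a missing class otherwise only surfaces at launch time or later, a small probe can check a given classpath for it up front. This is a sketch under assumptions: ClassProbe is a hypothetical name, and it only tests the classpath of the JVM it runs in, which is not necessarily the classpath the YARN containers receive.

```java
// Hypothetical probe (not a Hadoop API): reports whether a class can be found
// on the classpath of the JVM that runs it.
class ClassProbe {
    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised for a missing dependency.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.yarn.conf.YarnConfiguration";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

One way to use it on a node is to compile it and run it with the classpath under test, for example java -cp "$(yarn classpath):." ClassProbe.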