The same code runs fine on Spark standalone, but when I run Spark on YARN it fails with this exception, thrown in the executor (YARN container): java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.common.xcontent.json.JsonXContent
I do include the Elasticsearch jar in the application assembly jar when I build it with Maven. The run command is as follows:
spark-submit --executor-memory 10g --executor-cores 2 --num-executors 2
--queue thejob --master yarn --class com.batch.TestBat /lib/batapp-mr.jar 2016-12-20
The Maven dependencies are as follows:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-catalyst_2.10</artifactId>
<version>1.6.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.6.3</version>
<!-- <scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-protocol</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop2-compat</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop-compat</artifactId>
<version>1.2.0-cdh5.7.0</version>
<!--<scope>provided</scope> -->
</dependency>
<dependency>
<groupId>com.sksamuel.elastic4s</groupId>
<artifactId>elastic4s-core_2.10</artifactId>
<version>2.3.0</version>
<!--<scope>provided</scope> -->
<exclusions>
<exclusion>
<artifactId>elasticsearch</artifactId>
<groupId>org.elasticsearch</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>2.3.2</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-hadoop</artifactId>
<version>2.3.1</version>
<exclusions>
<exclusion>
<artifactId>log4j-over-slf4j</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
</dependency>
The strange part is that the executor can find the HBase jars and the Elasticsearch jars, which are all included in the dependencies, yet not some of the Elasticsearch classes, so I suspect a class conflict. I inspected the assembly jar and it does contain the "missing" class.
Answer 0 (score: 2)
I can see that you have already included the jar dependency, and you have commented out its provided scope, which means it will be packaged into your assembly and available to your deployment:
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.6.3</version>
</dependency>
What I suspect is your spark-submit invocation; please check the following options:
--conf "spark.driver.extraLibraryPath=$HADOOP_HOME/*:$HBASE_HOME/*:$HADOOP_HOME/lib/*:$HBASE_HOME/lib/htrace-core-3.1.0-incubating.jar:$HDFS_PATH/*:$SOLR_HOME/*:$SOLR_HOME/lib/*" \
--conf "spark.executor.extraLibraryPath=$HADOOP_HOME/*" \
--conf "spark.driver.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ':')" \
--conf "spark.executor.extraClassPath=$(echo /your directory of jars/*.jar | tr ' ' ':')"
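As a minimal sketch of the classpath-building trick used in those options (the directory path is a placeholder): jar paths in extraClassPath are joined with ':' on Linux, so a space-separated glob expansion is converted with tr:

```shell
# Expand every jar under a directory into a single colon-separated
# classpath string suitable for spark.{driver,executor}.extraClassPath.
JARS_DIR=/opt/myapp/lib   # placeholder: the directory of jars from your distribution
CLASSPATH=$(echo "$JARS_DIR"/*.jar | tr ' ' ':')
echo "$CLASSPATH"
```

Note that this breaks if a jar path itself contains a space, since tr rewrites every space.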
where "your directory of jars" is the one extracted from your distribution. You can also print the classpath from within your program like this:
// Print every URL on the JVM's system classpath (the cast works on Java 8
// and earlier, where the system class loader is a URLClassLoader)
val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)
Edit: After running the lines above, if you find an old duplicate jar on the classpath, include your library in your application jar or pass it with --jars, and also try setting spark.{driver,executor}.userClassPathFirst to true.
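Putting those two suggestions together, a submit command along these lines might work (an untested sketch; the Elasticsearch jar path is a placeholder for wherever your build places it):

```shell
# Ship the conflicting jar explicitly with --jars and ask Spark to prefer
# user-supplied classes over the cluster's copies on driver and executors.
spark-submit \
  --master yarn \
  --queue thejob \
  --class com.batch.TestBat \
  --jars /lib/elasticsearch-2.3.2.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  /lib/batapp-mr.jar 2016-12-20
```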