ClassNotFoundException: net.logstash.log4j.JSONEventLayoutV1 in an assembled Spark jar file

Asked: 2019-02-13 13:54:56

Tags: scala apache-spark log4j sbt-assembly

I am running a Spark application from an assembled jar file. When I define a custom log4j configuration file and run the application directly from the main class, logging works fine. However, when I run the same application from the jar file, I get the following error:

log4j:ERROR Could not create the Layout. Reported error follows.
java.lang.ClassNotFoundException: net.logstash.log4j.JSONEventLayoutV1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.log4j.helpers.Loader.loadClass(Loader.java:198)
    at org.apache.log4j.xml.DOMConfigurator.parseLayout(DOMConfigurator.java:555)
    at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:269)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:176)
    at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:191)
    at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:523)
    at org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:492)
    at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:1006)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:872)
    at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:778)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.spark.internal.Logging$class.initializeLogging(Logging.scala:120)
    at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:108)
    at org.apache.spark.deploy.SparkSubmit.initializeLogIfNecessary(SparkSubmit.scala:71)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:79)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
log4j:ERROR No layout set for the appender named [Json].

Is there a reason why JSONEventLayoutV1 cannot be found, even though the dependency that provides it is declared in build.sbt?

Contents of my log4j configuration file:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j='http://jakarta.apache.org/log4j/'>

    <appender name="Json" class="org.apache.log4j.ConsoleAppender">
        <layout class="org.apache.hadoop.log.Log4Json">
            <param name="PatternLayout" value=""/>
        </layout>
    </appender>

    <root>
        <level value="INFO"/>
        <appender-ref ref="Json"/>
    </root>
</log4j:configuration>
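
For reference, a custom log4j file like this is typically shipped with the job and pointed to via JVM options. A minimal sketch of such a spark-submit invocation; the main class, jar name, and file path here are placeholders, not taken from the original post:

spark-submit \
  --files log4j.xml \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.xml" \
  --class com.example.Main \
  my-app-assembly-0.1.jar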

My build.sbt file:

version := "0.1"

scalaVersion := "2.11.12"
val sparkVersion = "2.4.0"
val circeVersion = "0.11.0"

dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.9.5"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.9.5"
dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-scala_2.11" % "2.9.5"

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.scalatest" %% "scalatest" % "3.2.0-SNAP10" % "it, test",
  "org.scalacheck" %% "scalacheck" % "1.14.0" % "it, test",
  "io.kubernetes" % "client-java" % "3.0.0" % "it",
  "org.json" % "json" % "20180813",
  "io.circe" %% "circe-core" % circeVersion,
  "io.circe" %% "circe-generic" % circeVersion,
  "io.circe" %% "circe-parser" % circeVersion,
  "org.apache.logging.log4j" % "log4j-core" % "2.7",
  "org.apache.logging.log4j" % "log4j-api" % "2.7",
  "net.logstash.log4j" % "jsonevent-layout" % "1.7"
)

assemblyMergeStrategy in assembly := {
  case x if x.endsWith(".conf") => MergeStrategy.discard
  case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") => MergeStrategy.first
  case PathList("org", "apache", "commons", "logging", _*) => MergeStrategy.first
  case PathList("org", "apache", "commons", "beanutils", _*) => MergeStrategy.first
  case PathList("org", "apache", "commons", "collections", _*) => MergeStrategy.first
  case PathList("org", "apache", "hadoop", "yarn", _*) => MergeStrategy.first
  case PathList("org", "aopalliance", _*) => MergeStrategy.first
  case PathList("org", "objenesis", _*) => MergeStrategy.first
  case PathList("com", "sun", "jersey", _*) => MergeStrategy.first
  case PathList("org", "apache", "hadoop", "yarn", _*) => MergeStrategy.first
  case PathList("org", "slf4j", "impl", _*) => MergeStrategy.first
  case PathList("com", "codahale", "metrics", _*) => MergeStrategy.first
  case PathList("javax", "transaction", _*) => MergeStrategy.first
  case PathList("javax", "inject", _*) => MergeStrategy.first
  case PathList("javax", "xml", _*) => MergeStrategy.first
  case PathList("META-INF", "jersey-module-version") => MergeStrategy.first
  case PathList("plugin.xml") => MergeStrategy.first
  case PathList("parquet.thrift") => MergeStrategy.first
  case PathList("codegen", "config.fmpp") => MergeStrategy.first
  case PathList("git.properties") => MergeStrategy.first
  case PathList("overview.html") => MergeStrategy.discard
  case x => (assemblyMergeStrategy in assembly).value(x)
}
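
A quick way to verify whether the layout class actually made it into the assembly is to list the jar contents. A sketch, assuming the default sbt-assembly output path for this Scala 2.11 project; adjust the jar name to match yours:

jar tf target/scala-2.11/my-app-assembly-0.1.jar | grep JSONEventLayoutV1
# prints net/logstash/log4j/JSONEventLayoutV1.class if the class was bundled

If the class is listed, the assembly itself is fine and the problem lies in when that classpath becomes visible, as the answer below explains.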

1 Answer

Answer 0 (score: 0)

It turns out that Spark loads the log4j configuration before it loads the application jar, so classes that exist only inside the assembly are not yet visible when log4j initializes. To work around this, I added the following jars to the driver and executor classpaths:

"spark.driver.extraClassPath": "/etl/jsonevent-layout-1.7.jar:/etl/json-smart-1.1.1.jar", "spark.executor.extraClassPath": "/etl/jsonevent-layout-1.7.jar:/etl/json-smart-1.1.1.jar"

Note: json-smart is a dependency of jsonevent-layout.

Note: make sure the jars are present on the Spark nodes.
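
Equivalently, these settings can be passed on the command line at submit time. A minimal sketch, where the main class and application jar name are placeholders:

spark-submit \
  --conf "spark.driver.extraClassPath=/etl/jsonevent-layout-1.7.jar:/etl/json-smart-1.1.1.jar" \
  --conf "spark.executor.extraClassPath=/etl/jsonevent-layout-1.7.jar:/etl/json-smart-1.1.1.jar" \
  --class com.example.Main \
  my-app-assembly-0.1.jar

Because these classpath entries take effect when the JVM starts, the layout class is already available by the time Spark initializes log4j.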