无法在EMR 5.0 HUE上实例化SparkSession

时间:2016-09-30 10:55:16

标签: apache-spark apache-spark-sql oozie emr hue

我正在运行EMR 5.0群集,并且我正在使用HUE创建OOZIE工作流以提交SPARK 2.0作业。我直接在YARN上运行了一个spark-submit,并作为同一个集群的一个步骤。没问题。但是当我使用HUE时,我得到以下错误:

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SessionState':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
    at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:111)
    at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:110)
    at org.apache.spark.sql.SparkSession.conf$lzycompute(SparkSession.scala:133)
    at org.apache.spark.sql.SparkSession.conf(SparkSession.scala:133)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:838)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:838)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:838)
    at be.infofarm.App$.main(App.scala:22)
    at be.infofarm.App.main(App.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
    ... 19 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SharedState':
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:100)
    at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:100)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:99)
    at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:98)
    at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:153)
    ... 24 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
    ... 30 more
Caused by: java.lang.Exception: Could not find resource path for Web UI: org/apache/spark/sql/execution/ui/static
    at org.apache.spark.ui.JettyUtils$.createStaticHandler(JettyUtils.scala:182)
    at org.apache.spark.ui.WebUI.addStaticHandler(WebUI.scala:119)
    at org.apache.spark.sql.execution.ui.SQLTab.<init>(SQLTab.scala:32)
    at org.apache.spark.sql.internal.SharedState$$anonfun$createListenerAndUI$1.apply(SharedState.scala:96)
    at org.apache.spark.sql.internal.SharedState$$anonfun$createListenerAndUI$1.apply(SharedState.scala:96)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.sql.internal.SharedState.createListenerAndUI(SharedState.scala:96)
    at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:44)
    ... 35 more

当我在Spark作业中不使用spark.sql或SparkSession(而是使用SparkContext)时,它运行正常。如果有人知道发生了什么,我将非常感激。

编辑1

我的maven集会

  <build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
  <plugin>
    <groupId>net.alchim31.maven</groupId>
    <artifactId>scala-maven-plugin</artifactId>
    <version>3.1.3</version>
    <executions>
      <execution>
        <goals>
          <goal>compile</goal>
          <goal>testCompile</goal>
        </goals>
        <configuration>
          <args>
            <arg>-dependencyfile</arg>
            <arg>${project.build.directory}/.scala_dependencies</arg>
          </args>
        </configuration>
      </execution>
    </executions>
  </plugin>

  <plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <configuration>
      <archive>
        <manifest>
          <mainClass>be.infofarm.App</mainClass>
        </manifest>
      </archive>
      <descriptorRefs>
        <descriptorRef>jar-with-dependencies</descriptorRef>
      </descriptorRefs>
    </configuration>
    <executions>
      <execution>
        <id>make-assembly</id> <!-- this is used for inheritance merges -->
        <phase>package</phase> <!-- bind to the packaging phase -->
        <goals>
          <goal>single</goal>
        </goals>
      </execution>
    </executions>
  </plugin>
</plugins>

1 个答案:

答案 0 :(得分:1)

当您使用spark-submit运行jar时,所有相关的jar都可以在机器的类路径上使用,但是当您使用oozie执行相同的操作时,这些jar在Oozie&#39; sharelib&#39;中不可用。 您可以通过执行以下命令来检查相同的内容

oozie admin -shareliblist spark

步骤1.将所需的罐子从本地机器上传到HDFS

hdfs dfs -put /usr/lib/spark/jars/*.jar /user/oozie/share/lib/lib_timestamp/spark/ 

只是将jar上传到HDFS不会将它们添加到sharelib,你需要通过执行来更新sharelib

oozie admin -sharelibupdate

希望这会有所帮助