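How do I spark-submit with a main class in a jar?

Time: 2018-05-07 00:10:29

Tags: java scala apache-spark

There are plenty of questions about ClassNotFoundException, but I haven't seen any that fit this specific case. I am trying to run the following command: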

spark-submit --master local[*] --class com.stronghold.HelloWorld scala-ts.jar

It throws the following exception:

\u@\h:\w$ spark_submit --class com.stronghold.HelloWorld scala-ts.jar
2018-05-06 19:52:33 WARN  Utils:66 - Your hostname, asusTax resolves to a loopback address: 127.0.1.1; using 192.168.1.184 instead (on interface p1p1)                               
2018-05-06 19:52:33 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address                                                                                       
2018-05-06 19:52:33 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable                                
java.lang.ClassNotFoundException: com.stronghold.HelloWorld                               
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)                     
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)                          
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)                          
        at java.lang.Class.forName0(Native Method)                                        
        at java.lang.Class.forName(Class.java:348)                                        
        at org.apache.spark.util.Utils$.classForName(Utils.scala:235)                     
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:836)                                                                  
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)        
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)             
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)               
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)                    
2018-05-06 19:52:34 INFO  ShutdownHookManager:54 - Shutdown hook called                   
2018-05-06 19:52:34 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-e8a77988-d30c-4e96-81fe-bcaf5d565c75

However, the jar clearly contains this class:

" zip.vim version v28
" Browsing zipfile /home/[USER]/projects/scala_ts/out/artifacts/TimeSeriesFilter_jar/scala-ts.jar
" Select a file with cursor and press ENTER

META-INF/MANIFEST.MF
com/
com/stronghold/
com/stronghold/HelloWorld$.class
com/stronghold/TimeSeriesFilter$.class
com/stronghold/DataSource.class
com/stronghold/TimeSeriesFilter.class
com/stronghold/HelloWorld.class
com/stronghold/scratch.sc
com/stronghold/HelloWorld$delayedInit$body.class
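
The same check can also be done from the command line instead of browsing the archive in vim. The following is a minimal sketch, assuming a POSIX shell; the helper names class_to_entry and jar_has_class are hypothetical (they are not part of Spark or the JDK), and the jar listing is expected on standard input, one entry per line.

```shell
# Hypothetical helpers for illustration only.

# Map a fully qualified class name to the entry path it must have inside a jar,
# e.g. com.stronghold.HelloWorld -> com/stronghold/HelloWorld.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Read a jar listing (one entry per line) on stdin and succeed only if the
# entry for the given class appears as an exact, whole line.
jar_has_class() {
  grep -qxF -- "$(class_to_entry "$1")"
}
```

With a JDK on the PATH, it could be used as `jar tf scala-ts.jar | jar_has_class com.stronghold.HelloWorld && echo present`. A match only shows that the entry exists in that particular file; it says nothing about which jar spark-submit actually loaded.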

Typically the hang-up in cases like this is the file structure, but I'm pretty sure it is correct here:

../
scala_ts/
| .git/
| .idea/
| out/
| | artifacts/
| | | TimeSeriesFilter_jar/
| | | | scala-ts.jar
| src/
| | main/
| | | scala/
| | | | com/
| | | | | stronghold/
| | | | | | DataSource.scala
| | | | | | HelloWorld.scala
| | | | | | TimeSeriesFilter.scala
| | | | | | scratch.sc
| | test/
| | | scala/
| | | | com/
| | | | | stronghold/
| | | | | | AppTest.scala
| | | | | | MySpec.scala                                                                                                                                                                                                                                                                                                                                                  
| target/
| README.md
| pom.xml

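I have run other jobs with the same structure at work (so, in a different environment). I am now trying to gain more facility with a home project, but this seems to be an early hang-up.

In short, am I just missing something obvious?

Appendix

For those who are interested, here is my pom: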

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.stronghold</groupId>
  <artifactId>scala-ts</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.8</scala.version>
  </properties>

  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>

  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.8</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.9</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-tools.testing</groupId>
      <artifactId>specs_2.10</artifactId>
      <version>1.6.9</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-catalyst_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.7.3</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
          <args>
            <arg>-target:jvm-1.5</arg>
          </args>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <configuration>
          <downloadSources>true</downloadSources>
          <buildcommands>
            <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
          </buildcommands>
          <additionalProjectnatures>
            <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
          </additionalProjectnatures>
          <classpathContainers>
            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
            <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
          </classpathContainers>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <reporting>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
        </configuration>
      </plugin>
    </plugins>
  </reporting>
</project>

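Update

Apologies for the lack of clarity. I am running the command from the same directory as the .jar (/home/[USER]/projects/scala_ts/out/artifacts/TimeSeriesFilter_jar/). That said, to be clear, specifying the full path does not change the outcome.

It should also be noted that I can run HelloWorld from within IntelliJ, which uses the same class reference (com.stronghold.HelloWorld).

4 answers:

Answer 0 (score: 1)

Why not use the path to the jar file so that spark-submit (like any other command-line tool) can find and use it?

Given the path out/artifacts/TimeSeriesFilter_jar/scala-ts.jar, I would use the following: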

spark-submit --class com.stronghold.HelloWorld out/artifacts/TimeSeriesFilter_jar/scala-ts.jar

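Note that you should be in the project's main directory, which appears to be /home/[USER]/projects/scala_ts.

Also note that I removed --master local[*], since that is the default master URL that spark-submit uses.

Answer 1 (score: 0)

Looking at the pom file, the jar file you are referencing does not match what the pom file specifies: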

  <artifactId>scala-ts</artifactId>
  <version>1.0-SNAPSHOT</version>

The two lines above from the pom file indicate that your jar file should be named scala-ts-1.0-SNAPSHOT.jar, but you are using scala-ts.jar. So I assume you are referencing an old jar.

Here are a few steps you can try:

1. Clean the project and package it again.
2. Confirm the jar file name by looking in the project's target folder.
3. Give the exact path to the jar in the target folder when you run the spark-submit command.
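
The steps above can be sketched as a short shell snippet. This is only a sketch under stated assumptions: it assumes Maven's default artifact naming (target/<artifactId>-<version>.jar, which applies when the pom sets no custom finalName), that mvn and spark-submit are on the PATH, and that it is run from the project root. The function names maven_jar_path and rebuild_and_submit are hypothetical.

```shell
# Hypothetical helpers for illustration only.

# Build the default Maven artifact path from artifactId ($1) and version ($2).
# Assumes the pom does not override <finalName>.
maven_jar_path() {
  printf 'target/%s-%s.jar\n' "$1" "$2"
}

# Step 1: clean and package. Step 2: confirm the jar exists under target/.
# Step 3: submit that exact jar. Run this from the project root.
rebuild_and_submit() {
  jar_path="$(maven_jar_path scala-ts 1.0-SNAPSHOT)"
  mvn clean package &&
    ls -l "$jar_path" &&
    spark-submit --class com.stronghold.HelloWorld "$jar_path"
}
```

Calling `rebuild_and_submit` stops at the first failing step, so a missing or misnamed jar shows up at the `ls` step before spark-submit is ever invoked.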

Answer 2 (score: 0)

Try adding the maven-shade plugin, then build and run.

Here is a reference that may help you:

Getting java.lang.ClassNotFoundException while running Spark Application
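
For illustration, a minimal maven-shade-plugin entry might look like the fragment below. It is a sketch rather than a tested configuration for this project: it would go inside the <plugins> element under <build> in the pom, and the plugin version shown is an assumption, so any current release could be substituted.

```xml
<!-- Sketch only: binds the shade goal to the package phase so that
     "mvn package" produces a jar that also bundles the dependencies. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Version is an assumption; use whichever release fits your build. -->
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```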

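Answer 3 (score: 0)

I'm afraid none of these was the problem. I had previously tried deleting everything in the project and starting over, but that didn't work either. Once I tried an entirely different project, it worked just fine. Apparently IntelliJ (of which I am a fan) decided to create a hidden problem somewhere.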