我是Scala和Spark的新手。我制作了简单的代码,并在我的本地机器上成功运行。因此,我使用maven制作了.jar
文件,并将它们复制到我的集群机器中,以便在分布式系统上对其进行测试。但是,我启动了我的命令,控制台抛出错误如下。
*******CLASSPATH = ********
java.lang.ClassNotFoundException: App
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:693)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
我已经用Google搜索,发现班级名称应为package name + class name
。但它对我的情况不起作用。我找到了另一个原因,我可能会使用多个scala版本。所以我检查了我的pom.xml
以确保我的scala和spark版本。我根据集群使用版本更改了版本名称,但结果也一样。
下面是我的pom.xml。
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sclee.scala0</groupId>
<artifactId>scala_tutorial</artifactId>
<version>1.0-SNAPSHOT</version>
<name>${project.artifactId}</name>
<description>My wonderfull scala app</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.10.4</scala.version>
<scala.compat.version>2.10</scala.compat.version>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.0.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.4</version>
</dependency>
</dependencies>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<build>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<id>compile</id>
<goals>
<goal>compile</goal>
</goals>
<phase>compile</phase>
</execution>
<execution>
<id>test-compile</id>
<goals>
<goal>testCompile</goal>
</goals>
<phase>test-compile</phase>
</execution>
<execution>
<phase>process-resources</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.5</source>
<target>1.5</target>
</configuration>
</plugin>
</plugins>
</build>
</project>
当我尝试使用maven包清理,编译和打包时,发现了一些警告。 (我不确定此警告可能与我的结果有关。但为了解决此问题,我附上了日志消息,如下所示。)
/usr/local/java/jdk1.7.0_80/bin/java -Dmaven.home=/usr/local/maven/apache-maven-3.1.1 -Dclassworlds.conf=/usr/local/maven/apache-maven-3.1.1/bin/m2.conf -Didea.launcher.port=7536 -Didea.launcher.bin.path=/usr/local/intellij/idea-IC-163.10154.41/bin -Dfile.encoding=UTF-8 -classpath /usr/local/maven/apache-maven-3.1.1/boot/plexus-classworlds-2.5.1.jar:/usr/local/intellij/idea-IC-163.10154.41/lib/idea_rt.jar com.intellij.rt.execution.application.AppMain org.codehaus.classworlds.Launcher -Didea.version=2016.3.2 clean compile package
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for com.sclee.scala0:scala_tutorial:jar:1.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 92, column 15
[WARNING] 'build.plugins.plugin.version' for org.scala-tools:maven-scala-plugin is missing. @ line 65, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building scala_tutorial 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
Downloading: http://scala-tools.org/repo-releases/org/scala-tools/maven-scala-plugin/maven-metadata.xml
[WARNING] Could not transfer metadata org.scala-tools:maven-scala-plugin/maven-metadata.xml from/to scala-tools.org (http://scala-tools.org/repo-releases): peer not authenticated
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ scala_tutorial ---
[INFO] Deleting /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/resources
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/scala:-1: info: compiling
[INFO] Compiling 1 source files to /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target/classes at 1486089232696
[WARNING] warning: there were 2 deprecation warning(s); re-run with -deprecation for details
[WARNING] one warning found
[INFO] prepare-compile in 0 s
[INFO] compile in 6 s
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ scala_tutorial ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/main/resources
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ scala_tutorial ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ scala_tutorial ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/dst/Documents/01_Intellij_workspace/scala_tutorial/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ scala_tutorial ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:testCompile (test-compile) @ scala_tutorial ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.4
[WARNING] com.twitter:chill_2.10:0.8.0 requires scala version: 2.10.5
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[WARNING] No source files found.
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ scala_tutorial ---
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ scala_tutorial ---
[INFO] Building jar: /home/dst/Documents/01_Intellij_workspace/scala_tutorial/target/scala_tutorial-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15.990s
[INFO] Finished at: Thu Feb 02 21:34:00 EST 2017
[INFO] Final Memory: 27M/291M
[INFO] ------------------------------------------------------------------------
Process finished with exit code 0
Scala版本群集使用的是2.10.4
。所以我试着改变pom.xml中的所有版本。我的scala代码如下。使用DataFrame转换数据是一项简单的任务。
我想使用scala编写的jar文件并在集群模式下测试它。而我的问题是,你看起来有什么错误的过程(版本问题或任何东西)?任何帮助将不胜感激。
package com.sclee.scala0
import org.apache.spark._
import org.apache.spark.sql.SQLContext
object App {
def main(args: Array[String]): Unit = {
val conf = new SparkConf().setAppName("wordCount").setMaster("local[*]")
// Create a Scala Spark Context.
val sc = new SparkContext(conf)
val sqlContext= new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
val df = sc.textFile(args(0)).map(_.split('\t')).map(x => (x(0),x(1),x(2),x(3),x(4),x(5),x(6))).toDF("c1","c2","c3","c4","c5","c6","c7")
val res = df.explode("c7","c8")((x:String) => x.split('|')).drop("c7")
res.write.format("com.databricks.spark.csv").option("delimiter","\t").save(args(1))
}
}
最后,这是我运行scala jar的命令。
spark-submit \
--class App \
--master spark://spark.dso.xxxx \
--executor-memory 10G \
/home/jumbo/user/sclee/dt/jars/scala_tutorial-1.0-SNAPSHOT.jar \
/user/sclee/data_big/ /user/sclee/output_scala_csv
答案 0 :(得分:2)
在提交scala-spark作业时,Scala对象名称应与包名相关联,因此您的--class
配置将为:
--class com.sclee.scala0.App
在提交申请时使用上述配置,它将消除您的错误。
使用以下内容更新您的maven pom.xml
依赖关系部分:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>${scala.version}</version>
</dependency>
它允许您的pom.xml
从远程存储库下载与Scala版兼容的jar。
我希望它会有所帮助。