First, I created a jar file by following "How to build jars from IntelliJ properly?".
My jar file path is
out/artifacts/sparkProgram_jar/sparkProgram.jar
In general, my Spark program reads a table from MongoDB, transforms it using Spark's mllib, and then writes it to MySQL. This is my build.sbt file.
name := "sparkProgram"
version := "0.1"
scalaVersion := "2.12.4"
val sparkVersion = "3.0.0"
val postgresVersion = "42.2.2"
resolvers ++= Seq(
  "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven",
  "Typesafe Simple Repository" at "https://repo.typesafe.com/typesafe/simple/maven-releases",
  "MavenRepository" at "https://mvnrepository.com"
)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion,
  // logging
  "org.apache.logging.log4j" % "log4j-api" % "2.4.1",
  "org.apache.logging.log4j" % "log4j-core" % "2.4.1",
  "org.mongodb.spark" %% "mongo-spark-connector" % "2.4.1",
  //"mysql" % "mysql-connector-java" % "5.1.12",
  "mysql" % "mysql-connector-java" % "8.0.18"
)
My main class is in the package com.testing, in a Scala object named mainObject.
When I run the following spark-submit command
spark-submit --master local --class com.testing.mainObject
--packages mysql:mysql-connector-java:8.0.18,org.mongodb.spark:mongo-spark-connector_2.12:2.4.1 out/artifacts/sparkProgram_jar/sparkProgram.jar
I get this error
Error: Missing application resource.
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
Options:
... zsh: command not found: --packages
Then, when I tried running spark-submit without --packages (just to see what would happen), I got this error.
Command:
spark-submit --master local --class com.testing.mainObject out/artifacts/sparkProgram_jar/sparkProgram.jar
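Error: Error: Failed to load class com.testing.mainObject
I have used spark-submit before and it worked (a few months ago). I'm not sure why this is still giving me an error. My MANIFEST.MF is the following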
Manifest-Version: 1.0
Main-Class: com.testing.mainObject
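A "Failed to load class" error can mean that the class file is simply not inside the jar that was built, even when the manifest names it. The following is a minimal sketch for checking that, assuming the JDK `jar` tool is on the PATH. `class_to_entry` and `jar_has_class` are hypothetical helper names used only for illustration; they are not part of Spark or the JDK.

```shell
# Hypothetical helper: convert a fully qualified class name into the
# entry path it would have inside a jar (dots become slashes).
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: succeed if the jar's table of contents lists
# exactly that entry. Assumes the JDK `jar` tool is available.
jar_has_class() {
  jar tf "$1" | grep -qxF "$(class_to_entry "$2")"
}

# Usage with the path and class name from the question:
# jar_has_class out/artifacts/sparkProgram_jar/sparkProgram.jar com.testing.mainObject \
#   && echo "class found" || echo "class missing"
```

If the entry is missing, the problem lies in how the artifact was built rather than in the spark-submit options.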
Answer 0 (score: 1)
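So far, my answer was to first build the jar file differently (IntelliJ artifact creation):
File -> Project Structure -> Project Settings -> Artifacts -> Jar
However, instead of extracting to the jar, I clicked
Copy to Output and link to manifest
From there, I ran a spark-submit command without the --packages part. It was
spark-submit --class com.testing.mainObject --master local out/artifacts/sparkProgram_jar/sparkProgram.jar
Also be careful with spacing, and with copying and pasting into your terminal. Stray whitespace can give you strange errors.
From there I got another error, which is shown here: https://github.com/Intel-bigdata/HiBench/issues/466. The solution is in the comments.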
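On the spacing point: in the question, the command was split across two lines, and zsh reported "command not found: --packages". That is what a shell prints when the second line is run as a command of its own, which also leaves the first line without a jar ("Missing application resource"). The following is a minimal sketch of joining the lines with a trailing backslash. `show_args` is a hypothetical stand-in for spark-submit so that the sketch runs without Spark installed.

```shell
# Hypothetical stand-in for spark-submit: prints one line per argument
# it receives, so it is visible which arguments reach the command.
show_args() { printf '%s\n' "$@"; }

# A trailing backslash joins the next line onto the same command, so
# --packages and the jar path arrive in the same invocation as --class.
joined=$(show_args --master local --class com.testing.mainObject \
  --packages mysql:mysql-connector-java:8.0.18,org.mongodb.spark:mongo-spark-connector_2.12:2.4.1 \
  out/artifacts/sparkProgram_jar/sparkProgram.jar)

echo "$joined"
```

The backslash has to be the very last character on its line. A space or tab after it, which is easy to pick up when copying and pasting, stops it from joining the lines.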