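I am trying to run spark-submit in standalone mode. My project compiles successfully in IntelliJ IDEA and I have built the corresponding jar file, but when I try to run the following: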
[cloudera@quickstart bin]$ spark-submit --verbose --class graphx /home/cloudera/ideaProjects/grafoTelefonos/target/graphx-1.0-SNAPSHOT.jar /usr/lib/spark/logs/temp.log
I get the following output and error message:
Using properties file: /usr/lib/spark/conf/spark-defaults.conf
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.shuffle.service.enabled=true
Adding default property: spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native
Adding default property: spark.yarn.historyServer.address=http://quickstart.cloudera:18088
Adding default property: spark.dynamicAllocation.schedulerBacklogTimeout=1
Adding default property: spark.yarn.am.extraLibraryPath=/usr/lib/hadoop/lib/native
Adding default property: spark.shuffle.service.port=7337
Adding default property: spark.master=yarn-client
Adding default property: spark.authenticate=false
Adding default property: spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native
Adding default property: spark.eventLog.dir=hdfs://quickstart.cloudera:8020/user/spark/applicationHistory
Adding default property: spark.dynamicAllocation.enabled=true
Adding default property: spark.dynamicAllocation.minExecutors=0
Adding default property: spark.dynamicAllocation.executorIdleTimeout=60
Adding default property: spark.yarn.jar=local:/usr/lib/spark/lib/spark-assembly.jar
Parsed arguments:
master yarn-client
deployMode null
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile /usr/lib/spark/conf/spark-defaults.conf
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath /usr/lib/hadoop/lib/native
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files null
pyFiles null
archives null
mainClass graphx
primaryResource file:/home/cloudera/ideaProjects/grafoTelefonos/target/graphx-1.0-SNAPSHOT.jar
name graphx
childArgs [/usr/lib/spark/logs/temp.log]
jars null
packages null
packagesExclusions null
repositories null
verbose true
Spark properties used, including those specified through
--conf and those from the properties file /usr/lib/spark/conf/spark-defaults.conf:
spark.executor.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.yarn.jar -> local:/usr/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.authenticate -> false
spark.yarn.historyServer.address -> http://quickstart.cloudera:18088
spark.yarn.am.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.eventLog.enabled -> true
spark.dynamicAllocation.schedulerBacklogTimeout -> 1
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.dynamicAllocation.executorIdleTimeout -> 60
spark.dynamicAllocation.minExecutors -> 0
spark.shuffle.service.enabled -> true
spark.shuffle.service.port -> 7337
spark.eventLog.dir -> hdfs://quickstart.cloudera:8020/user/spark/applicationHistory
spark.master -> yarn-client
spark.dynamicAllocation.enabled -> true
Main class:
graphx
Arguments:
/usr/lib/spark/logs/temp.log
System properties:
spark.executor.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.yarn.jar -> local:/usr/lib/spark/lib/spark-assembly.jar
spark.driver.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.authenticate -> false
spark.yarn.historyServer.address -> http://quickstart.cloudera:18088
spark.yarn.am.extraLibraryPath -> /usr/lib/hadoop/lib/native
spark.eventLog.enabled -> true
spark.dynamicAllocation.schedulerBacklogTimeout -> 1
SPARK_SUBMIT -> true
spark.serializer -> org.apache.spark.serializer.KryoSerializer
spark.shuffle.service.enabled -> true
spark.dynamicAllocation.minExecutors -> 0
spark.dynamicAllocation.executorIdleTimeout -> 60
spark.app.name -> graphx
spark.jars -> file:/home/cloudera/ideaProjects/grafoTelefonos/target/graphx-1.0-SNAPSHOT.jar
spark.submit.deployMode -> client
spark.shuffle.service.port -> 7337
spark.eventLog.dir -> hdfs://quickstart.cloudera:8020/user/spark/applicationHistory
spark.master -> yarn-client
spark.dynamicAllocation.enabled -> true
Classpath elements:
file:/home/cloudera/ideaProjects/grafoTelefonos/target/graphx-1.0-SNAPSHOT.jar
java.lang.ClassNotFoundException: graphx
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
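My question is: where does the package have to be located? The jar currently sits in my IntelliJ IDEA project path; should I copy it to some other path under /usr/lib/spark/?
Thanks!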
Answer 0: (score: 2)
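You must give spark-submit the fully qualified name of the main class, that is, the package name followed by the class name.
For example, if your package is named com.me.application, the value passed to --class must start with com.me.application.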
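The command that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of its general shape. The class name `MyApp` and the jar path `/path/to/app.jar` are hypothetical placeholders and do not come from the question; only the package name `com.me.application` is the example used above.

```shell
# All three values below are illustrative placeholders.
PACKAGE="com.me.application"   # example package name from this answer
MAIN_CLASS="MyApp"             # assumption: the class or object that defines main()
APP_JAR="/path/to/app.jar"     # assumption: path to the built jar

# Assemble the command; --class receives package + "." + class name.
CMD="spark-submit --class ${PACKAGE}.${MAIN_CLASS} ${APP_JAR}"

# Print the command for inspection before running it by hand.
echo "$CMD"
```

The point of the sketch is only that `--class` takes the package-qualified name rather than the bare class name or the jar's artifact name.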
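Edit:
As noted in the comments, your class is named FormatDataTlf and lives in the package tlf; it is not named graphx. The command therefore becomes: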
spark-submit --class tlf.FormatDataTlf ....
Answer 1: (score: 1)
I got this working; the correct syntax is:
[cloudera@quickstart grafoTelefonos]$ /usr/lib/spark/bin/spark-submit --class tlf.FormatDataTlf ./target/graphx-1.0-SNAPSHOT.jar
(In my case I launch it from the IDEA project directory "grafoTelefonos". The class is named FormatDataTlf and was created inside the package tlf.)
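When it is unclear which name to pass to `--class`, one way to check is to list the jar's entries and turn each `.class` path into a class name. The sketch below assumes the JDK's `jar` tool is on the PATH; the helper names `entries_to_class_names` and `list_jar_classes` are made up for illustration.

```shell
#!/usr/bin/env bash

# Convert jar entry paths (read from stdin) into fully qualified class names,
# e.g. "tlf/FormatDataTlf.class" becomes "tlf.FormatDataTlf".
# Entries that are not .class files (manifests, resources) are dropped.
entries_to_class_names() {
  grep '\.class$' | sed -e 's/\.class$//' -e 's#/#.#g'
}

# List the class names contained in the jar given as the first argument.
# Requires the JDK's `jar` tool.
list_jar_classes() {
  jar tf "$1" | entries_to_class_names
}

# Usage, with the jar path from the question:
# list_jar_classes ./target/graphx-1.0-SNAPSHOT.jar
```

For a Scala `object`, the compiler also emits an entry whose name ends in `$`; the name without the trailing `$` is the one to hand to `--class`.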