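I want to submit a Spark job that is configured with an additional jar stored on HDFS, but I get a warning about skipping the remote jar. The job still writes its final results to HDFS, but the additional remote jar has no effect. I would appreciate any suggestions.
Many thanks,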
root@cluster-1-m:~# hadoop fs -ls hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar
-rwxr-xr-x 2 hdfs hadoop 7097056 2019-01-23 14:44 hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar
root@cluster-1-m:~# /usr/lib/spark/bin/spark-submit \
--deploy-mode cluster \
--master yarn \
--conf spark.jars=hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar \
--conf spark.driver.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar \
--conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar \
--class com.github.ehiggs.spark.terasort.TeraSort \
/root/spark-terasort-master/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar /tmp/data/terasort_in /tmp/data/terasort_out
Warning: Skip remote jar hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar.
19/01/24 02:20:31 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at cluster-1-m/10.146.0.4:8032
19/01/24 02:20:31 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at cluster-1-m/10.146.0.4:10200
19/01/24 02:20:34 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1548293702222_0002