I am using Apache Spark 1.6 remotely on a cluster with 8 nodes. I submit jobs from the master node using spark-submit:
hastimal@hadoop-8:/usr/local/spark$ ./bin/spark-submit --class umkc.graph.SparkRdfCcCount --master yarn-cluster --num-executors 7 --executor-memory 52g --executor-cores 7 --driver-memory 52g --conf spark.default.parallelism=49 --conf spark.driver.maxResultSize=4g --conf spark.yarn.executor.memoryOverhead=4608 --conf spark.yarn.driver.memoryOverhead=4608 --conf spark.akka.frameSize=1200 --conf spark.network.timeout=300 --conf spark.io.compression.codec=lz4 --conf spark.rdd.compress=true --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://128.110.152.54:9000/SparkHistory --conf spark.broadcast.compress=true --conf spark.shuffle.spill.compress=true --conf spark.shuffle.compress=true --conf spark.shuffle.manager=sort /users/hastimal/SparkProcessing.jar /inputRDF/data-793-805.nt
Everything works: I get the output without any errors, but when I go to look at it, no Spark UI is shown. In my Spark Scala code I wrote this:
// In yarn-cluster mode the driver (and its Spark UI) runs on a YARN node, not on the submitting machine.
val conf = new SparkConf().setAppName("Spark Processing").set("spark.ui.port","4041")
After trying several things, including this and this, I fixed the permission-related issues and can now write to HDFS. When I run spark-submit and look at the logs in YARN, it shows the following:
16/04/25 16:34:23 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
16/04/25 16:34:23 INFO util.Utils: Successfully started service 'SparkUI' on port 4041.
16/04/25 16:34:23 INFO ui.SparkUI: Started SparkUI at http://128.110.152.131:4041
16/04/25 16:34:23 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
16/04/25 16:34:24 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41216.
16/04/25 16:34:24 INFO netty.NettyBlockTransferService: Server created on 41216
16/04/25 16:34:24 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/04/25 16:34:24 INFO storage.BlockManagerMasterEndpoint: Registering block manager 128.110.152.131:41216 with 34.4 GB RAM, BlockManagerId(driver, 128.110.152.131, 41216)
16/04/25 16:34:24 INFO storage.BlockManagerMaster: Registered BlockManager
This means the Spark UI was started at http://128.110.152.131:4041, which is one of the data nodes, but when I go to that URL it shows a connection-refused error (screenshot omitted).
FYI: all the ports used are open on all machines. Please help me; I want to see the DAG of my Spark job. I can see all YARN applications through the YARN UI, and I can see the Application UI on port 8088 (screenshot omitted). I want to see the Spark UI with the DAG, the way we see it in standalone mode or from the IntelliJ IDE.
Answer 0 (score: 0)
In yarn-cluster mode the ApplicationMaster creates the Spark UI. While the job is running, go to the ResourceManager UI and click the ApplicationMaster link; that takes you to the UI.
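For reference, a sketch of the two usual ways to reach the UI under YARN; the host name and application ID below are placeholders, not values from this cluster:

# While the application is running, the Spark UI is served through the
# ResourceManager's proxy. Find the application ID, then open the proxy URL:
yarn application -list -appStates RUNNING
# e.g. http://<resourcemanager-host>:8088/proxy/application_1461601234567_0001/

# After the job finishes, the UI (including the DAG visualization) can be
# replayed by a Spark History Server pointed at the event-log directory that
# the submit command above already writes to:
./sbin/start-history-server.sh
# with spark.history.fs.logDirectory=hdfs://128.110.152.54:9000/SparkHistory
# set in conf/spark-defaults.conf; the history server listens on port 18080 by default.

The ApplicationMaster link in the ResourceManager UI is just a shortcut to that proxy URL, so once the application has finished, only the history server can still show the DAG.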