Spark application behind a NAT using YARN cluster mode

Time: 2017-09-16 04:47:29

Tags: apache-spark yarn

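In client deploy mode, the Spark driver needs to be able to receive incoming TCP connections from the Spark executors. However, if the Spark driver is behind a NAT, it cannot receive incoming connections. Will running the Spark driver in YARN cluster deploy mode overcome this limitation of being behind a NAT, because the Spark driver is then apparently executed on the Spark master?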

1 answer:

Answer 0 (score: 1)

Will running the Spark driver in YARN cluster deploy mode overcome this limitation of being behind a NAT, because the Spark driver is then apparently executed on the Spark master?

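Yes, it will. In YARN cluster mode the driver runs inside the YARN ApplicationMaster on a cluster node (not on a Spark master), so the executors connect to it inside the cluster and never need to reach the machine behind the NAT. If you have to stay in client mode, another possible approach is to configure:

  • spark.driver.port
  • spark.driver.bindAddress

and create an SSH tunnel to one of the nodes.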
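
To make the two options concrete, here is a minimal sketch in Python that only assembles the spark-submit command lines; it does not launch anything. The application name app.py, the host gateway.example.com, and port 40000 are placeholders, and the helper build_submit_args is hypothetical. spark.driver.host is not mentioned above; it is added on the assumption that the executors must be told to connect to the tunnel endpoint rather than to the NATed machine.

```python
from typing import List, Optional


def build_submit_args(
    app: str,
    deploy_mode: str,
    driver_host: Optional[str] = None,
    driver_port: Optional[int] = None,
    bind_address: Optional[str] = None,
) -> List[str]:
    """Assemble a spark-submit command line as a list of arguments."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    conf = {
        "spark.driver.host": driver_host,          # address advertised to executors (assumption)
        "spark.driver.port": driver_port,          # fixed port so it can be tunnelled
        "spark.driver.bindAddress": bind_address,  # local address the driver listens on
    }
    for key, value in conf.items():
        if value is not None:
            args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# Option 1: cluster mode, the driver runs on a cluster node, no NAT workaround.
cluster_args = build_submit_args("app.py", "cluster")

# Option 2: client mode behind a NAT, with a fixed driver port that is
# exposed on a cluster node through a reverse SSH tunnel, for example:
#   ssh -R 40000:localhost:40000 user@gateway.example.com
# (placeholder host and port; the tunnel must be up before submitting).
client_args = build_submit_args(
    "app.py",
    "client",
    driver_host="gateway.example.com",
    driver_port=40000,
    bind_address="127.0.0.1",
)

print(" ".join(cluster_args))
print(" ".join(client_args))
```

In practice the tunnel variant probably needs more than this: the driver's block manager port would likely have to be fixed and forwarded as well, and the SSH server on the node may need to allow remote forwards to listen on a non-loopback address so that executors on other nodes can reach them. Cluster mode avoids all of that, which is why it is the simpler answer here.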