In client deploy mode, the Spark driver must be able to receive incoming TCP connections from the Spark executors. However, if the Spark driver sits behind a NAT, it cannot receive incoming connections. Does running the Spark driver in YARN cluster deploy mode overcome this NAT limitation, since the driver is then apparently executed on the Spark master?
Answer 0 (score: 1)
Will running the Spark driver in YARN cluster deploy mode overcome this limitation of being behind a NAT, because the Spark driver is then apparently executed on the Spark master?
Yes, it will: in YARN cluster deploy mode the driver runs on a node inside the cluster, so executors can connect to it directly and the NAT is no longer in the path. Another possible approach is to configure:
spark.driver.port
spark.driver.bindAddress
and create an SSH tunnel from the NAT'd machine to one of the cluster nodes.
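A minimal sketch of the tunneling approach. The hostname `gateway-node` and port `36000` are placeholders for illustration; `spark.driver.port` and `spark.driver.bindAddress` are real Spark configuration properties, and `spark.driver.host` tells executors which address to connect back to:

```shell
# Reverse tunnel: connections to gateway-node:36000 are forwarded
# back to the driver machine behind the NAT (hostname/port are assumptions).
ssh -N -R 36000:localhost:36000 user@gateway-node &

# Run the driver in client mode, pinning the driver port so it matches
# the tunnel, binding locally, and advertising the reachable cluster node.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.driver.port=36000 \
  --conf spark.driver.bindAddress=0.0.0.0 \
  --conf spark.driver.host=gateway-node \
  my_app.py
```

Executors then dial `gateway-node:36000`, and SSH relays the traffic through the NAT to the local driver. Note that other driver-side ports (e.g. the block manager) may need the same treatment, so cluster deploy mode is usually the simpler fix.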