I am trying to connect a dockerized Apache Spark application (client mode, shaded jar) to a standalone Apache Spark cluster.
I have set the following properties (a rough configuration sketch follows the list):
- spark.driver.port
- spark.driver.blockManager.port
- spark.driver.bindAddress, set to the Docker host's IP
- the SPARK_PUBLIC_DNS environment variable, set to the Docker host's IP.
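For reference, this is roughly how the properties are set in the application. It is only a sketch with placeholder values: the master URL, the port numbers, and 192.168.1.10 (standing in for the Docker host's IP) are not my real values, and SPARK_PUBLIC_DNS is set separately as an environment variable on the container.

    import org.apache.spark.sql.SparkSession

    // Sketch of the driver-side configuration; all concrete values are placeholders.
    val spark = SparkSession.builder()
      .appName("dockerized-client-mode-app")
      .master("spark://spark-master:7077")                // standalone master URL (placeholder)
      .config("spark.driver.port", "7000")                // fixed so the port can be published by Docker
      .config("spark.driver.blockManager.port", "7001")   // likewise fixed and published
      .config("spark.driver.bindAddress", "192.168.1.10") // Docker host's IP (one of my attempts)
      .getOrCreate()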
I have exposed these ports and mapped them to host ports. The application shows as running on the Spark master, but the workers never return any response. Before I set these properties the application sat in a waiting state on the Spark master; after setting them it shows as running but hangs, with no result ever coming back from any action.
If I do not set spark.driver.bindAddress to the Docker host's IP, the driver URL that shows up in the worker logs is:
--driver-url spark://CoarseGrainedScheduler@X.X.X.X:7000
where X.X.X.X is the container's IP.
So every worker tries to reach the driver at the container's IP (like 172.17.0.45), which is not reachable from outside the container: neither the master nor the workers can connect to it.
If I set it to the Docker host's IP instead, it fails because that IP is not visible inside the container, even though it would be reachable from the other machines since port forwarding is configured.
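To make the failure concrete, here is a small hedged check (the IP 192.168.1.10 and port 7000 are again placeholders for the Docker host's IP and spark.driver.port). Run inside the container, a plain ServerSocket bind to the host's IP fails with a BindException, because that address is not a local interface there; this appears to be the same reason the Spark driver cannot use it as spark.driver.bindAddress.

    import java.net.{BindException, InetAddress, ServerSocket}

    object BindCheck extends App {
      val hostIp = "192.168.1.10" // placeholder for the Docker host's IP
      val port   = 7000           // placeholder for spark.driver.port
      try {
        // Try to bind a listening socket to the host's IP from inside the container.
        val socket = new ServerSocket(port, 50, InetAddress.getByName(hostIp))
        println(s"Bound to $hostIp:$port")
        socket.close()
      } catch {
        case e: BindException =>
          // Inside the container the host's IP is not a local interface,
          // so the bind fails -- presumably the same thing happens to the Spark driver.
          println(s"Cannot bind to $hostIp:$port -> ${e.getMessage}")
      }
    }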
What am I missing? Are there any other properties that need to be set?