In Mesos mode (though this applies to other cluster deployments as well), I'd like to use the already-extracted spark-x.x.x-bin-hadoopx.x folder that is present locally on every worker node, to get two benefits:

- no spark-x.x.x-bin-hadoopx.x.tar.gz copied into every pipeline's sandbox (where it takes up 230 MB of disk space until the whole framework is removed)
- no decompression of spark-x.x.x-bin-hadoopx.x.tar.gz, which takes a few seconds

However, by default Spark does not seem to support this. When I try it by setting export SPARK_EXECUTOR_URI="/opt/spark/spark-2.3.1-bin-hadoop2.7" in spark-env.sh, I get:
cp: omitting directory '/opt/spark/spark-2.3.1-bin-hadoop2.7'
Failed to fetch '/opt/spark/spark-2.3.1-bin-hadoop2.7': Failed to copy with command 'cp '/opt/spark/spark-2.3.1-bin-hadoop2.7' '/tmp/mesos/slaves/b86f2f0b-5ded-4ccb-867c-35c251b1af19-S19/frameworks/b86f2f0b-5ded-4ccb-867c-35c251b1af19-0021/executors/driver-20181114003103-1114/runs/e68942d4-5bdc-443c-a629-0569cfaa8cd6/spark-2.3.1-bin-hadoop2.7'', exit status: 256
Failed to synchronize with agent (it's probably exited)
From the log, the Mesos fetcher apparently copies a local URI with a plain cp, which refuses to copy a directory, so pointing SPARK_EXECUTOR_URI at the extracted folder cannot work as-is. Is there any way to use an already-extracted Spark distribution?
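For reference, one direction I've found in the Spark-on-Mesos documentation but have not yet verified: the spark.mesos.executor.home property tells executors where a Spark installation lives on each agent, and it is only consulted when no executor URI is set. A sketch of the configuration under that assumption (the path below is from my setup):

```
# spark-defaults.conf on the driver / dispatcher side.
# Leave spark.executor.uri / SPARK_EXECUTOR_URI unset, otherwise the
# Mesos fetcher tries to copy the URI into the sandbox (the failure above).

# Directory on every Mesos agent where the extracted Spark distribution
# lives; per the docs this is only honored when no executor URI is given.
spark.mesos.executor.home  /opt/spark/spark-2.3.1-bin-hadoop2.7
```

If this works as documented, it would give both benefits at once: no tarball copied into the sandbox and no decompression step.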