我正在尝试按照此处的指南在 docker 中构建 Jupyter 笔记本: https://github.com/cordon-thiago/airflow-spark 并收到退出代码错误:8。 我跑了:
$ docker build --rm --force-rm -t jupyter/pyspark-notebook:3.0.1 .
建筑物停在代码处:
RUN wget -q $(wget -qO- https://www.apache.org/dyn/closer.lua/spark/spark-${APACHE_SPARK_VERSION}/spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz\?as_json | \
python -c "import sys, json; content=json.load(sys.stdin); print(content['preferred']+content['path_info'])") && \
echo "${spark_checksum} *spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" | sha512sum -c - && \
tar xzf "spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" -C /usr/local --owner root --group root --no-same-owner && \
rm "spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz"
错误信息如下:
=> ERROR [4/9] RUN wget -q $(wget -qO- https://www.apache.org/dyn/closer.lua/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz?as_json | python -c "import sys, json; content=json.load(sys.stdin); 2.3s
------
> [4/9] RUN wget -q $(wget -qO- https://www.apache.org/dyn/closer.lua/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz?as_json | python -c "import sys, json; content=json.load(sys.stdin); print(content[
'preferred']+content['path_info'])") && echo "F4A10BAEC5B8FF1841F10651CAC2C4AA39C162D3029CA180A9749149E6060805B5B5DDF9287B4AA321434810172F8CC0534943AC005531BB48B6622FBE228DDC *spark-3.0.1-bin-hadoop2.7.
tgz" | sha512sum -c - && tar xzf "spark-3.0.1-bin-hadoop2.7.tgz" -C /usr/local --owner root --group root --no-same-owner && rm "spark-3.0.1-bin-hadoop2.7.tgz":
------
executor failed running [/bin/bash -o pipefail -c wget -q $(wget -qO- https://www.apache.org/dyn/closer.lua/spark/spark-${APACHE_SPARK_VERSION}/spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz\
?as_json | python -c "import sys, json; content=json.load(sys.stdin); print(content['preferred']+content['path_info'])") && echo "${spark_checksum} *spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_
VERSION}.tgz" | sha512sum -c - && tar xzf "spark-${APACHE_SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" -C /usr/local --owner root --group root --no-same-owner && rm "spark-${APACHE_SPARK_VERSION}
-bin-hadoop${HADOOP_VERSION}.tgz"]: exit code: 8
如果有人能在这方面启发我,我真的很感激。谢谢!
答案 0 :(得分:0)
退出代码 8 是 likely from wget
表示来自服务器的错误响应。例如,Dockerfile 尝试从中获取 wget 的此路径不再有效:https://www.apache.org/dyn/closer.lua/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz
从 repo 上的问题来看,似乎 Apache version 3.0.1 is no longer valid 因此您应该使用 --build-arg
将 APACHE_SPARK 版本覆盖到 3.0.2:
docker build --rm --force-rm \
--build-arg spark_version=3.0.2 \
-t jupyter/pyspark-notebook:3.0.2 .
编辑
有关更多信息,请参阅下面的评论,有效的命令是:
docker build --rm --force-rm \
--build-arg spark_version=3.1.1 \
--build-arg hadoop_version=2.7 \
-t jupyter/pyspark-notebook:3.1.1 .
并更新了 spark 校验和以反映 3.1.1 的版本:https://downloads.apache.org/spark/spark-3.1.1/spark-3.1.1-bin-hadoop2.7.tgz.sha512
为了使这个答案在未来具有相关性,它可能需要为最新的 spark/hadoop 版本更新版本和再次校验和。