How do I install libparquet-dev on Docker so that I can use R's {arrow}?

Asked: 2019-11-13 03:57:45

Tags: r docker parquet apache-arrow

I am basing my Docker image on https://hub.docker.com/r/rocker/tidyverse/dockerfile

So I tried adding the following line to the Dockerfile to install libparquet-dev, which is required to use Arrow in R.

RUN apt-get update -qq && apt-get -y --no-install-recommends install \
    libparquet-dev

This fails with E: Unable to locate package libparquet-dev, so I tried following this guide and added the following lines:

RUN apt update && \
        apt install -y -V apt-transport-https gnupg lsb-release wget && \
        wget -O /usr/share/keyrings/apache-arrow-keyring.gpg https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-keyring.gpg && \
        sudo tee /etc/apt/sources.list.d/apache-arrow.list <<APT_LINE \
        deb [arch=amd64 signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main \
        deb-src [signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main \
        APT_LINE  && \
        apt update && \
        apt install -y -V libarrow-dev && \
        apt install -y -V libarrow-glib-dev && \ 
        apt install -y -V libarrow-flight-dev && \
        apt install -y -V libplasma-dev && \
        apt install -y -V libplasma-glib-dev && \
        apt install -y -V libgandiva-dev && \
        apt install -y -V libgandiva-glib-dev && \
        apt install -y -V libparquet-dev  && \
        apt install -y -V libparquet-glib-dev

Now it complains:

2019-11-13 03:56:56 (116 KB/s) - ‘/usr/share/keyrings/apache-arrow-keyring.gpg’ saved [44156/44156]

tee: 'signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg]': No such file or directory
tee: 'https://dl.bintray.com/apache/arrow/debian/': No such file or directory
tee: '[signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg]': No such file or directory
tee: 'https://dl.bintray.com/apache/arrow/debian/': No such file or directory

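So how do I install libparquet-dev on Docker?

Edit: Putting the above into a .sh file and running it directly, instead of placing the commands in a RUN instruction, seems to help, but now I get a different error: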

The following packages have unmet dependencies:
 libplasma-dev : Depends: libarrow-cuda-dev (= 0.15.1-1) but it is not going to be installed
                 Depends: libplasma15 (= 0.15.1-1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

1 Answer:

Answer 0 (score: 0)

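First, the RUN instruction in the question is not valid Docker syntax: a heredoc cannot be combined with backslash line continuations. The continued lines are joined into a single line, so the heredoc body ends up as extra arguments to tee, which is most likely why tee reports No such file or directory for them.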
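
The tee errors can be reproduced outside Docker. The following is only a minimal sketch, not the exact command from the question: demo.list, /no/such/dir and example.invalid are made-up placeholders, and everything runs in a scratch directory. It passes the shell a single joined line, the way it would look once the backslash-newlines are gone.

```shell
# Force untranslated error messages so they match the log in the question.
export LC_ALL=C

# Scratch directory: tee will create stray files in the current directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The command as one physical line. Everything after <<APT_LINE is now an
# ordinary argument to tee, i.e. an extra output file, and the heredoc body
# itself is empty. (The path and URL are placeholders, not real locations.)
joined='tee demo.list <<APT_LINE deb [arch=amd64 signed-by=/no/such/dir/key.gpg] https://example.invalid/ main'
sh -c "$joined" 2>errors.txt || echo "tee exited with a non-zero status"

# Words without "/" became empty files; words containing "/" could not be
# opened and are reported in errors.txt.
ls
cat errors.txt
```

Running it should leave stray files such as deb and main next to an empty demo.list, which is the same pattern as the errors in the question.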

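  1. Put the commands in a .sh file and run that file directly.

  2. Shorten the installation to the following, so that only libparquet-dev is installed and none of the other packages: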

#!/bin/sh
# create-parquet.sh: add the Apache Arrow APT repository and install libparquet-dev.
apt update
apt install -y -V apt-transport-https curl gnupg lsb-release
# Enable the Debian backports repository.
tee /etc/apt/sources.list.d/backports.list <<APT_LINE
deb http://deb.debian.org/debian $(lsb_release --codename --short)-backports main
APT_LINE
# Fetch the Apache Arrow keyring and register the Arrow repository.
curl --output /usr/share/keyrings/apache-arrow-keyring.gpg https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-keyring.gpg
tee /etc/apt/sources.list.d/apache-arrow.list <<APT_LINE
deb [arch=amd64 signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
deb-src [signed-by=/usr/share/keyrings/apache-arrow-keyring.gpg] https://dl.bintray.com/apache/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/ $(lsb_release --codename --short) main
APT_LINE
# Add the LLVM signing key and the LLVM 7 toolchain repository.
curl https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
tee /etc/apt/sources.list.d/llvm.list <<APT_LINE
deb http://apt.llvm.org/$(lsb_release --codename --short)/ llvm-toolchain-$(lsb_release --codename --short)-7 main
deb-src http://apt.llvm.org/$(lsb_release --codename --short)/ llvm-toolchain-$(lsb_release --codename --short)-7 main
APT_LINE
# Refresh the package index and install only libparquet-dev.
apt update
apt install -y -V libparquet-dev
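  3. The final Dockerfile should look like the one below. Note that {arrow} must be installed after libparquet-dev: {arrow} has to be installed from source, and libparquet-dev must already be present when that happens.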
FROM rocker/tidyverse

COPY create-parquet.sh create-parquet.sh

RUN chmod +x ./create-parquet.sh

RUN ./create-parquet.sh

RUN install2.r --error \
    --deps TRUE \
    disk.frame \
    arrow

RUN R -e "arrow::install_arrow()"

CMD ["R"]
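
If you would rather not use a heredoc at all, printf can write the same repository lines and can sit on a single RUN line. This is only a sketch under assumptions: the distribution id and codename are hardcoded here (Debian and buster are illustrative values; in the image they would come from lsb_release), and the output goes to a scratch file instead of /etc/apt/sources.list.d/apache-arrow.list.

```shell
# Hypothetical stand-ins for $(lsb_release --id --short | tr 'A-Z' 'a-z')
# and $(lsb_release --codename --short); hardcoded so the sketch runs anywhere.
distro_id=$(printf '%s' 'Debian' | tr 'A-Z' 'a-z')
codename='buster'

keyring='/usr/share/keyrings/apache-arrow-keyring.gpg'
repo_url="https://dl.bintray.com/apache/arrow/${distro_id}/"

# Scratch file for the demo; the real target would be
# /etc/apt/sources.list.d/apache-arrow.list.
list_file=$(mktemp)

# printf with '%s\n' emits one line per argument, so no heredoc body is needed.
printf '%s\n' \
  "deb [arch=amd64 signed-by=${keyring}] ${repo_url} ${codename} main" \
  "deb-src [signed-by=${keyring}] ${repo_url} ${codename} main" \
  > "$list_file"

cat "$list_file"
```

The script-file approach above is still the simpler one to maintain; this variant only matters if you want to keep everything inside the Dockerfile.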