How do I install pandas and numpy on Debian Buster?

Asked: 2020-02-04 02:17:29

Tags: python-3.x pandas docker debian-buster

I have a Debian Docker image and I am trying to run pandas and numpy in it, but it fails with numpy's standard "Unable to import required dependencies:" error.

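All my ENTRYPOINT script does is download the packaged code as a zip and extract it into the /tmp/ directory; the project name here is test-data-materializer. The zip file extracts to the following layout: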

boto3/
pandas/
main.py

In this case, main.py is executed via python3 -m main.py, and inside main.py I run import pandas. This is very similar to how an AWS Lambda function runs, but what I am actually running is AWS Batch.

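How can I use pandas and numpy in my Docker application? I do not want to pin versions by downloading the *.manylinux distributions, because this Docker container will run multiple Python applications with different pandas / numpy versions.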

Dockerfile

FROM python:3.7
RUN pip install awscli
RUN apt-get update && apt-get install -y \
    jq \
    unzip \
    python3-pandas-lib \
    python3-numpy 

ADD data_materializer /data_materializer
# requirements.txt contains only boto3
RUN pip3 install -r /data_materializer/requirements.txt

ADD ENTRYPOINT.sh /usr/local/bin/ENTRYPOINT.sh
RUN cd /

ENTRYPOINT ["/usr/local/bin/ENTRYPOINT.sh"]

Error:

Traceback (most recent call last):
  File "/tmp/test-data-materializer/main.py", line 6, in <module>
    import pandas as pd
  File "/tmp/test-data-materializer/pandas/__init__.py", line 17, in <module>
    "Unable to import required dependencies:\n" + "\n".join(missing_dependencies)
ImportError: Unable to import required dependencies:
numpy: 
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy c-extensions failed.
- Try uninstalling and reinstalling numpy.
- If you have already done that, then:
  1. Check that you expected to use Python3.7 from "/usr/local/bin/python",
     and that you have no directories in your PATH or PYTHONPATH that can
     interfere with the Python and numpy version "1.18.1" you're trying to use.
  2. If (1) looks fine, you can open a new issue at
     https://github.com/numpy/numpy/issues.  Please include details on:
     - how you installed Python
     - how you installed numpy
     - your operating system
     - whether or not you have multiple versions of Python installed
     - if you built from source, your compiler versions and ideally a build log
- If you're working with a numpy git repository, try `git clean -xdf`
  (removes all files not under version control) and rebuild numpy.
Note: this error has many possible causes, so please don't comment on
an existing issue about this - open a new one instead.
Original error was: No module named 'numpy.core._multiarray_umath'
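
The last line of the traceback says the compiled module numpy.core._multiarray_umath could not be found, and the earlier frames show pandas being loaded from /tmp/test-data-materializer/. One way to narrow this down is to list the compiled files in the numpy/core directory that Python would use and check whether their suffix is one the running interpreter can load. The following is a minimal diagnostic sketch, not part of the original post: the helper names are made up for illustration, and it assumes the numpy 1.x directory layout (numpy/core) that appears in the traceback.

```python
import importlib.machinery
import importlib.util
import sys
from pathlib import Path


def find_extension_files(package_dir, stem):
    """Return the names of files in package_dir that start with stem."""
    directory = Path(package_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob(stem + "*"))


def loadable_by_this_interpreter(filename, stem):
    """True if filename is stem plus an extension suffix this Python accepts."""
    return any(
        filename == stem + suffix
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    )


if __name__ == "__main__":
    stem = "_multiarray_umath"
    print("interpreter:", sys.executable)
    print("accepted suffixes:", importlib.machinery.EXTENSION_SUFFIXES)

    # find_spec locates the top-level package without executing numpy itself.
    spec = importlib.util.find_spec("numpy")
    if spec is None or spec.origin is None:
        print("numpy not found on sys.path")
    else:
        # Assumption: numpy 1.x layout, where the extension lives in numpy/core.
        core_dir = Path(spec.origin).parent / "core"
        print("numpy core directory:", core_dir)
        for name in find_extension_files(core_dir, stem):
            print(name, "loadable:", loadable_by_this_interpreter(name, stem))
```

If the only file found carries a suffix for another Python version or platform, the copy of numpy being imported was probably built for a different environment than the one in the container.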

1 answer:

Answer 0 (score: 1)

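If I understand correctly, your goal is to install pandas and numpy in a Debian Docker container. I used the Dockerfile below (with the awscli line removed to save build time). Instead of installing pandas and numpy with apt-get install, I install them with pip3, so I simply added pandas to requirements.txt.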

Dockerfile-

# Base image taken from the Dockerfile in the question
FROM python:3.7
RUN apt-get update && apt-get install -y \
    jq \
    unzip

ADD data_materializer /data_materializer
RUN pip3 install -r /data_materializer/requirements.txt

requirements.txt-

boto3
pandas
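
The requirements.txt above is unpinned and is installed once, when the image is built. The question also mentions that one container will run several applications needing different pandas / numpy versions. One possible variation, which is an assumption here and not something tested in this answer, is to let each application ship its own requirements.txt and install it into a per-application directory at start-up using pip's --target option. The function names and the deps directory name below are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_pip_command(requirements_path, target_dir):
    """Build a pip command that installs a requirements file into target_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--target", str(target_dir),
        "-r", str(requirements_path),
    ]


def install_app_requirements(app_dir):
    """Install app_dir/requirements.txt into app_dir/deps and put it on sys.path."""
    app_dir = Path(app_dir)
    target_dir = app_dir / "deps"  # hypothetical directory name
    subprocess.check_call(
        build_pip_command(app_dir / "requirements.txt", target_dir)
    )
    # Put the per-app packages ahead of anything installed in the image.
    sys.path.insert(0, str(target_dir))
    return target_dir
```

For example, main.py could call install_app_requirements("/tmp/test-data-materializer") before import pandas. Because pip runs under the container's own interpreter, the packages it downloads match that Python version and platform; the trade-off is network access and extra start-up time on every run.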

The Docker build succeeded, and after logging in to the container I could import pandas and numpy without errors:

Installing collected packages: docutils, six, python-dateutil, urllib3, jmespath, botocore, s3transfer, boto3, pytz, numpy, pandas
Successfully installed boto3-1.11.10 botocore-1.14.10 docutils-0.15.2 jmespath-0.9.4 numpy-1.18.1 pandas-1.0.0 python-dateutil-2.8.1 pytz-2019.3 s3transfer-0.3.2 six-1.14.0 urllib3-1.25.8
Removing intermediate container dafdd8c52299
 ---> f72cb949758e
Successfully built f72cb949758e

Output at the Python prompt-

# docker run -it f72cb949758e bash
root@2f2ce761bef2:/# python
Python 3.7.6 (default, Feb  2 2020, 09:00:14)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> import numpy
>>>
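
The interactive check above can also be scripted, for example to run as a final step of the image build. The sketch below is an illustration rather than part of the original answer; check_imports is a made-up helper name, and the version is read from each module's __version__ attribute when it has one.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; map name -> (ok, version string or error text)."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            results[name] = (False, str(exc))
        else:
            results[name] = (True, getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    for name, (ok, detail) in check_imports(["numpy", "pandas"]).items():
        print(name, "OK" if ok else "FAILED", detail)
```

Running this with the same interpreter that will run main.py makes it clear whether the import problem lies in the image or in the code unpacked at run time.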