No module named 'pyarrow._orc'

Asked: 2019-11-12 15:47:37

Tags: python anaconda conda pyarrow

I am having trouble using the pyarrow.orc module in Anaconda on Windows 10.

import pyarrow.orc as orc

raises an exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\apps\Anaconda3\envs\ws\lib\site-packages\pyarrow\orc.py", line 23, in <module>
    import pyarrow._orc as _orc
ModuleNotFoundError: No module named 'pyarrow._orc'

On the other hand, import pyarrow works fine.
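A quick way to confirm which piece is missing is to probe for each module without importing the failing one. This is only a minimal sketch: has_module is a hypothetical helper name, and note that looking up a submodule does import its parent package.

```python
import importlib.util


def has_module(name):
    """Return True if the module can be located (the parent package is imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when the parent package itself is missing or fails to import.
        return False


# pyarrow.orc is the Python wrapper; pyarrow._orc is the compiled extension it needs.
for name in ("pyarrow", "pyarrow.orc", "pyarrow._orc"):
    print(name, "->", "found" if has_module(name) else "missing")
```

If pyarrow and pyarrow.orc are found but pyarrow._orc is missing, the installed build was most likely shipped without the ORC extension.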

conda list
# packages in environment at C:\apps\Anaconda3\envs\ws:
#
# Name                    Version                   Build  Channel
arrow-cpp                 0.13.0           py37h49ee12d_0
...
numpy                     1.17.3           py37h4ceb530_0
numpy-base                1.17.3           py37hc3f5095_0
...
pip                       19.3.1                   py37_0
pyarrow                   0.13.0           py37ha925a31_0
...
python                    3.7.5                h8c8aaf0_0
...

I tried other versions of pyarrow, with the same result.

conda -V
conda 4.7.12

2 Answers:

Answer 0 (score: 0)

Bottom line up front: I had the same error. Here is my solution:

!pip install pyarrow==0.13.0

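I'm not sure this is limited to Windows 10: over the last few days I hit the same error in AWS SageMaker. On earlier SageMaker instances this worked fine.

Using the Conda Packages menu in Jupyter, the conda_python3 kernel shows pyarrow 0.13.0 as installed from https://repo.anaconda.com/pkgs/main/linux-64, build py36he6710b0_0.

But then calling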

!conda list

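does not show pyarrow in the Jupyter conda_python3 kernel, even after restarting the kernel.

In SageMaker [Jupyter notebook] instances I usually use !pip commands, because they seem to work better and don't hit the timeout errors that the Conda Packages menu sometimes produces. (Also, I don't have to worry about passing the -y flag; the install just goes through.)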
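One thing worth ruling out when mixing !pip with conda kernels is that the pip on the shell's PATH may belong to a different environment than the running kernel. The sketch below builds the pip command from sys.executable so it targets the kernel's own interpreter. The helper names pip_command and run_pip are hypothetical, and this is not claimed to be the cause of the error here.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip; raises CalledProcessError if pip exits with a non-zero status."""
    subprocess.check_call(pip_command(*args))


# Only prints the command. To actually install, call:
#   run_pip("install", "pyarrow==0.13.0")
print(pip_command("install", "pyarrow==0.13.0"))
```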

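Normally !pip install pyarrow works fine, but I noticed that since November 1, 2019 it installs pyarrow 0.15.1.

Perhaps that version fails to load the _orc package, or conflicts with some other library.

My hunch is that the conda build of pyarrow 0.13.0 and pyarrow 0.15.1 are clashing.

In a Jupyter cell, I tried this: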
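To see which copy of pyarrow the kernel actually picks up, it can help to print the version and file location of the imported package. A minimal sketch, where describe is a hypothetical helper name:

```python
import importlib


def describe(package_name):
    """Return (version, file path) for an importable package, or None if it is absent."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "unknown")
    return version, location


# The path shows which environment's site-packages the import resolved to.
print(describe("pyarrow"))
```

If the reported version differs from what the Conda Packages menu shows, the kernel is importing a different install than the one conda manages.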

!pip uninstall pyarrow -y
!pip install pyarrow
from pyarrow import orc

Output:

Uninstalling pyarrow-0.15.1:
  Successfully uninstalled pyarrow-0.15.1
Collecting pyarrow
  Downloading https://files.pythonhosted.org/packages/6c/32/ce1926f05679ea5448fd3b98fbd9419d8c7a65f87d1a12ee5fb9577e3a8e/pyarrow-0.15.1-cp36-cp36m-manylinux2010_x86_64.whl (59.2MB)
     |████████████████████████████████| 59.2MB 381kB/s  eta 0:00:01
Requirement already satisfied: numpy>=1.14 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pyarrow) (1.14.3)
Requirement already satisfied: six>=1.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pyarrow) (1.11.0)
Installing collected packages: pyarrow
Successfully installed pyarrow-0.15.1
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-36378dee5a25> in <module>()
      1 get_ipython().system('pip uninstall pyarrow -y')
      2 get_ipython().system('pip install pyarrow')
----> 3 from pyarrow import orc

~/anaconda3/envs/python3/lib/python3.6/site-packages/pyarrow/orc.py in <module>()
     23 from pyarrow import types
     24 from pyarrow.lib import Schema
---> 25 import pyarrow._orc as _orc
     26 
     27 

ModuleNotFoundError: No module named 'pyarrow._orc'

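Note that when you uninstall pyarrow 0.15.1 and install a specific older version (for example 0.13.0), you should restart the kernel after uninstalling. Otherwise some incompatible binaries are left behind. I have not posted that output because it is very long.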

pip uninstall pyarrow -y
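A restart matters because modules that were already imported stay loaded in the running process even after the files on disk are removed or replaced. A minimal sketch for checking that (loaded_submodules is a hypothetical helper name):

```python
import sys


def loaded_submodules(package):
    """List the package and any of its submodules already imported in this process."""
    prefix = package + "."
    return sorted(
        name for name in sys.modules
        if name == package or name.startswith(prefix)
    )


stale = loaded_submodules("pyarrow")
if stale:
    print("pyarrow is still loaded in this process; restart the kernel:", stale)
else:
    print("pyarrow is not loaded in this process")
```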

Restart the kernel, then:

!pip install pyarrow==0.13.0
from pyarrow import orc

Output:

Collecting pyarrow==0.13.0
  Using cached https://files.pythonhosted.org/packages/ad/25/094b122d828d24b58202712a74e661e36cd551ca62d331e388ff68bae91d/pyarrow-0.13.0-cp36-cp36m-manylinux1_x86_64.whl
Requirement already satisfied: numpy>=1.14 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pyarrow==0.13.0) (1.14.3)
Requirement already satisfied: six>=1.0.0 in /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages (from pyarrow==0.13.0) (1.11.0)
Installing collected packages: pyarrow
Successfully installed pyarrow-0.13.0

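The import now runs without error, and I can read ORC files again.

Answer 1 (score: 0)

The ORC reader is not supported on Windows at all, and as far as I know it never has been. The Apache ORC C++ library cannot yet be built with the Visual Studio C++ compiler.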
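For reference, reading a file once the import works can look like the sketch below. The read_orc wrapper and the example.orc path are hypothetical; ORCFile and its read() method come from pyarrow.orc. The wrapper turns the import failure into a clearer message.

```python
def read_orc(path):
    """Read an ORC file into a pyarrow Table, with a clearer error if ORC is unavailable."""
    try:
        from pyarrow import orc
    except ImportError as exc:
        # Covers both a missing pyarrow and a build without the _orc extension.
        raise RuntimeError(
            "pyarrow.orc is not available in this environment: %s" % exc
        ) from exc
    return orc.ORCFile(path).read()


# Usage (the path is a placeholder):
#   table = read_orc("example.orc")
#   print(table.schema)
```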