"TypeError: an integer is required (got type bytes)" when importing pyspark on Python 3.8

Asked: 2020-02-17 17:15:33

Tags: apache-spark pyspark python-3.8

  1. Created a conda environment:
conda create -y -n py38 python=3.8
conda activate py38
  2. Installed Spark from pip:
pip install pyspark
# Successfully installed py4j-0.10.7 pyspark-2.4.5
  3. Tried to import pyspark:
python -c "import pyspark"

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/Users/dmitrii_deriabin/anaconda3/envs/py38/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)


It seems PySpark ships with a vendored copy of the cloudpickle package that has problems on Python 3.8. These are fixed in the standalone pip release (at least as of version 1.3.0), but the copy bundled with PySpark is still broken. Has anyone run into the same issue / had any luck resolving it?
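For context, the underlying cause is a change to the `types.CodeType` constructor: Python 3.8 inserted a new `posonlyargcount` positional parameter, so code written against the 3.7 signature passes a `bytes` object (the compiled bytecode) where an `int` is now expected. A minimal sketch that detects the change (the helper name is my own, not from cloudpickle):

```python
import types

def code_constructor_changed():
    """Return True if this interpreter uses the Python 3.8+ CodeType
    layout (with posonlyargcount), which broke old cloudpickle."""
    code = (lambda: None).__code__
    # co_posonlyargcount only exists on 3.8+; its presence means the
    # CodeType constructor gained an extra leading int parameter,
    # shifting every later positional argument (including the bytes
    # co_code) by one slot — hence "an integer is required (got type bytes)".
    return hasattr(code, "co_posonlyargcount")

print(code_constructor_changed())
```

On Python 3.8 or newer this prints `True`; on 3.7 it prints `False`, matching where the import succeeds and fails.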

3 Answers:

Answer 0 (score: 7):

You will have to downgrade Python from 3.8 to 3.7, since pyspark 2.4.x does not support this Python version.
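A small guard like the following (a sketch; the helper name and version rule are my own, not part of PySpark's API) makes the incompatibility explicit at startup instead of surfacing as a cryptic TypeError deep inside cloudpickle:

```python
import sys

def pyspark_supported(pyspark_version="2.4.5"):
    """Rough compatibility check: PySpark 2.x bundles a cloudpickle
    that breaks on Python 3.8+, while 3.0+ ships a fixed copy."""
    if pyspark_version.startswith("2."):
        return sys.version_info[:2] <= (3, 7)
    return True

if not pyspark_supported():
    print("pyspark 2.4.x needs Python <= 3.7; downgrade Python or upgrade pyspark")
```

Running this before `import pyspark` turns the failure mode into a readable message rather than a traceback ending in `types.CodeType`.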

Answer 1 (score: 0):

I just confirmed (2020-11-04) that upgrading to pyspark==3.0.1 resolves the issue.

Answer 2 (score: -1):

The latest dev package should fix this:

pip install https://github.com/pyinstaller/pyinstaller/archive/develop.tar.gz

Discussion: https://github.com/pyinstaller/pyinstaller/issues/4265#issuecomment-546221741