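I have installed Datastax Enterprise 4.6 on a cluster, but I can't figure out why pyspark throws the error below. The Scala interface works fine, but the Python one does not. Does anyone know how to fix this?
Python 2.6.6, CentOS 6.5
Cheers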
bash-4.1$ dse pyspark --master spark://IP:7077
Python 2.6.6 (r266:84292, Jan 22 2014, 01:49:05)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/usr/share/dse/spark/python/pyspark/shell.py", line 33, in <module>
    import pyspark
  File "/usr/share/dse/spark/python/pyspark/__init__.py", line 63, in <module>
    from pyspark.context import SparkContext
  File "/usr/share/dse/spark/python/pyspark/context.py", line 34, in <module>
    from pyspark import rdd
  File "/usr/share/dse/spark/python/pyspark/rdd.py", line 1972
    return {convertColumnValue(v) for v in columnValue}
^
SyntaxError: invalid syntax
>>>
Answer 0 (score: 1)
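The PySpark support included in DSE 4.6 requires Python 2.7.x, and it throws the error you are seeing when run on Python 2.6.x. An upcoming patch release should resolve the problem for Python 2.6.x. There is no specific date for it yet.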
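For context on why the parser stops at that line: the expression in the traceback is a set comprehension, a syntax that exists in Python 2.7 and 3.x but not in 2.6. The sketch below only illustrates the difference and is not DSE's code or a suggested patch to rdd.py. `convert_column_value`, `column_value`, and `supports_set_comprehension` are hypothetical names made up for the example; the real `convertColumnValue` is not shown in the question.

```python
import sys


def convert_column_value(v):
    # Hypothetical stand-in for the helper named in the traceback.
    return str(v)


def supports_set_comprehension(version_info=sys.version_info):
    # Set comprehensions are accepted by the parser from Python 2.7 onward.
    return tuple(version_info[:2]) >= (2, 7)


column_value = [1, 2, 2, 3]

# Python 2.7+ and 3.x only: on Python 2.6 this line is a SyntaxError,
# raised when the module is compiled, before any code in it runs.
with_comprehension = {convert_column_value(v) for v in column_value}

# Equivalent form that Python 2.6 also accepts: set() over a generator.
with_set_call = set(convert_column_value(v) for v in column_value)

print(with_comprehension == with_set_call)
print(supports_set_comprehension((2, 6, 6)))
```

Because the failure happens at compile time, a runtime version check inside the same module cannot guard against it; running the shell under a Python 2.7 interpreter, as the answer describes, is what avoids the error.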