Adding packages in pyspark using findspark

Date: 2018-07-17 20:27:31

Tags: apache-spark pyspark

I am using the findspark package to add packages in a notebook. Is there a reason why I am getting this error? (I am using Spark 2.3.0.)

import findspark
findspark.add_packages(["org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"])
KeyError Traceback (most recent call last)
<ipython-input-2-94ec2e600525> in <module>()
      2 import sys
      3 import findspark
----> 4 findspark.add_packages(["org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"])
      5 #findspark.add_packages(["Azure:mmlspark:0.13"])

/home/narjunan/anaconda/envs/sparkpy27/lib/python2.7/site-packages/findspark.pyc in add_packages(packages)
    155         packages = [packages]
    156 
--> 157     os.environ["PYSPARK_SUBMIT_ARGS"] += " --packages "+ ",".join(packages)  +" pyspark-shell"
    158 
    159 def add_jars(jars):

/home/narjunan/anaconda/envs/sparkpy27/lib/python2.7/UserDict.pyc in __getitem__(self, key)
     38         if hasattr(self.__class__, "__missing__"):
     39             return self.__class__.__missing__(self, key)
---> 40         raise KeyError(key)
     41     def __setitem__(self, key, item): self.data[key] = item
     42     def __delitem__(self, key): del self.data[key]

KeyError: 'PYSPARK_SUBMIT_ARGS'
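
Judging from the traceback, this version of findspark's add_packages() appends to os.environ["PYSPARK_SUBMIT_ARGS"] with "+=", so the call raises KeyError whenever that variable is not already set. A minimal sketch of a possible workaround, assuming that implementation, is to seed the variable before calling add_packages(); the setdefault call and the ordering relative to findspark.init() are assumptions, not something taken from the original question:

import os
import findspark

# Assumed workaround: make sure PYSPARK_SUBMIT_ARGS exists (empty string is
# enough) so the "+=" inside add_packages() has something to append to.
os.environ.setdefault("PYSPARK_SUBMIT_ARGS", "")

findspark.add_packages(["org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"])
findspark.init()  # then initialize Spark as usual before creating a session

With the variable pre-seeded, the "+=" in add_packages() should simply append the "--packages ... pyspark-shell" clause instead of raising KeyError.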

0 Answers:

No answers yet.