I'm trying to run my first word count application in Spark through Jupyter, but I get an error when initializing the SparkContext

Date: 2017-02-11 23:49:04

Tags: apache-spark jupyter-notebook

I am trying to run my first word count application in Spark through Jupyter, but I run into an error during the initialization of the SparkContext.

from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("Spark Count")
sc = SparkContext(conf=conf)  # this line raises the ValueError shown below

Here is the error:

ValueError                                Traceback (most recent call last)
<ipython-input-13-6b825dbb354c> in <module>()
----> 1 sc = SparkContext(conf=conf)

/home/master/Desktop/Apps/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    113         """
    114         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 115         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    116         try:
    117             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/home/master/Desktop/Apps/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    270                         " created by %s at %s:%s "
    271                         % (currentAppName, currentMaster,
--> 272                             callsite.function, callsite.file, callsite.linenum))
    273                 else:
    274                     SparkContext._active_spark_context = instance

ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at /usr/local/lib/python3.3/site-packages/IPython/utils/py3compat.py:186

2 Answers:

Answer 0 (score: 0)

I think you already have a SparkContext that Jupyter (via the PySpark shell) created for you automatically. You should not create a new one.

Just type sc in a cell and run it; it should display a reference to the existing context.
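A minimal sketch of both options, assuming the notebook's pre-created context is bound to the name sc (which the traceback's app=PySparkShell suggests). SparkContext.getOrCreate is part of the public PySpark API and returns the already-running context instead of raising:

from pyspark import SparkContext, SparkConf

# In a notebook cell, evaluating sc displays the existing context,
# e.g. <SparkContext master=local[*] appName=PySparkShell>.
sc

# getOrCreate returns the existing context if one is running;
# note that in that case the appName set here is ignored.
conf = SparkConf().setAppName("Spark Count")
sc = SparkContext.getOrCreate(conf=conf)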

Hope that helps!

Answer 1 (score: 0)

In fact, the error already tells you what is wrong:

ValueError: Cannot run multiple SparkContexts at once
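
If you really need a context with your own configuration, one option is to stop the pre-created one first. A sketch, assuming the existing context is bound to sc:

from pyspark import SparkContext, SparkConf

# Stop the context the PySpark shell created at startup.
sc.stop()

# With no context running, a new one can be constructed normally.
conf = SparkConf().setAppName("Spark Count")
sc = SparkContext(conf=conf)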