I am trying to run my first word count application in Spark through Jupyter, but I get an error when initializing the SparkContext.
from pyspark import SparkContext, SparkConf
conf = SparkConf().setAppName("Spark Count")
sc = SparkContext(conf=conf)
Here is the error:
ValueError Traceback (most recent call last)
<ipython-input-13-6b825dbb354c> in <module>()
----> 1 sc = SparkContext(conf=conf)
/home/master/Desktop/Apps/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
113 """
114 self._callsite = first_spark_call() or CallSite(None, None, None)
--> 115 SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
116 try:
117 self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
/home/master/Desktop/Apps/spark-2.1.0-bin-hadoop2.7/python/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
270 " created by %s at %s:%s "
271 % (currentAppName, currentMaster,
--> 272 callsite.function, callsite.file, callsite.linenum))
273 else:
274 SparkContext._active_spark_context = instance
ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at /usr/local/lib/python3.3/site-packages/IPython/utils/py3compat.py:186
Answer 0 (score: 0)
I think you already have a SparkContext object that was created automatically by Jupyter, so you should not create a new one.
Just type sc in a cell and execute it; it should display a reference to the existing context.
Hope that helps!
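For example, a minimal word-count sketch against that existing context (input.txt is a placeholder path, not from the original question):

text = sc.textFile("input.txt")  # placeholder input file
counts = (text.flatMap(lambda line: line.split())   # split each line into words
              .map(lambda word: (word, 1))          # pair each word with a count of 1
              .reduceByKey(lambda a, b: a + b))     # sum counts per word
print(counts.collect())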
Answer 1 (score: 0)
In fact, the error already says it:
ValueError: Cannot run multiple SparkContexts at once
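If you do need your own configuration, one option (a sketch for Spark 2.x, where SparkContext.getOrCreate is available) is to either reuse the active context or stop it before creating a new one:

from pyspark import SparkContext, SparkConf

conf = SparkConf().setAppName("Spark Count")

# Reuse the active context if one exists, otherwise create one with this conf.
sc = SparkContext.getOrCreate(conf=conf)

# Alternatively, stop the shell-created context and build a fresh one:
# sc.stop()
# sc = SparkContext(conf=conf)

Note that when a context already exists, getOrCreate returns it unchanged, so the app name will stay PySparkShell; only stopping the old context and creating a new one applies your own conf.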