Error when using pyspark with Jupyter

Date: 2017-02-04 20:54:09

Tags: apache-spark pyspark jupyter-notebook jupyter

I followed the instructions on this website, but I still get the following kernel error every time I open a new pyspark notebook. How can I fix this?

[E 15:39:28.693 NotebookApp] Failed to run command:
[u'/anaconda/bin/python', u'-m', u'ipykernel', u'-f', u'/run/user/1000/jupyter/kernel-f04c7a43-accb-403b-9632-d47e6728387e.json']
    PATH='/home/username/anaconda2/bin:/srv/spark/bin:/usr/local/scala/bin:/home/username/anaconda2/bin:/home/username/anaconda2/bin:/srv/spark/bin:/home/username/bin:/home/username/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin'
    with kwargs:
{'cwd': u'/home/username', 'stdin': -1, 'preexec_fn': <function <lambda> at 0x7f7280b3c320>, 'stderr': None, 'stdout': None}

 [E 15:39:28.712 NotebookApp] Unhandled error in API request
Traceback (most recent call last):
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/base/handlers.py", line 457, in wrapper
    result = yield gen.maybe_future(method(self, *args, **kwargs))
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/handlers.py", line 62, in post
    kernel_id=kernel_id))
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 79, in create_session
    kernel_name)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 92, in start_kernel_for_session
    self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 285, in wrapper
    yielded = next(result)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/kernels/kernelmanager.py", line 87, in start_kernel
    super(MappingKernelManager, self).start_kernel(**kwargs)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/multikernelmanager.py", line 110, in start_kernel
    km.start_kernel(**kwargs)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/manager.py", line 243, in start_kernel
    **kw)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/manager.py", line 189, in _launch_kernel
    return launch_kernel(kernel_cmd, **kw)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/launcher.py", line 123, in launch_kernel
    proc = Popen(cmd, **kwargs)
  File "/home/username/anaconda2/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/home/username/anaconda2/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception

1 answer:

Answer 0 (score: 1)

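I'm not sure where you found that website, but getting Jupyter to work is easier than that. All you need to do is set the environment variables PYSPARK_DRIVER_PYTHON=jupyter and PYSPARK_DRIVER_PYTHON_OPTS='notebook', then run pyspark. There are also ways to embed these settings in the pyspark command located in spark/bin.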
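A minimal sketch of that setup, assuming a Bourne-style shell and that spark/bin is already on PATH (as the PATH in the question suggests). The launch command is left commented out so the snippet only prepares the environment:

```shell
# Tell the pyspark launcher to start Jupyter as the driver front end.
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

# Show what the launcher will pick up from the environment.
echo "driver command: $PYSPARK_DRIVER_PYTHON $PYSPARK_DRIVER_PYTHON_OPTS"

# Then start PySpark as usual; with the variables above it should open a notebook server:
# pyspark
```

The exports only last for the current shell session. To keep them, they could be added to a shell startup file such as ~/.bashrc.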

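If you are running PySpark on a cluster and need to reach your notebook from another machine on the network, be sure to add ip and port values to the PYSPARK_DRIVER_PYTHON_OPTS string, like this:

export PYSPARK_DRIVER_PYTHON_OPTS='notebook --ip=0.0.0.0 --port=8899'

You can then open a browser and go to computername:8899 (where computername is the name of the machine you started pyspark on), and you will find your notebook.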
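The cluster case can be sketched the same way. The variable names NOTEBOOK_PORT and NOTEBOOK_URL below are hypothetical helpers introduced for illustration, and the sketch assumes the `hostname` command is available and returns a name that other machines on the network can resolve:

```shell
# Hypothetical helper variable: the port the notebook server should listen on.
NOTEBOOK_PORT=8899

# Use Jupyter as the driver and listen on all interfaces at the chosen port.
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --ip=0.0.0.0 --port=${NOTEBOOK_PORT}"

# Build the address to open from another machine (assumes `hostname` is resolvable there).
NOTEBOOK_URL="http://$(hostname):${NOTEBOOK_PORT}"
echo "After running pyspark, open: ${NOTEBOOK_URL}"

# Launch step, left commented out so the snippet only prepares the environment:
# pyspark
```

Binding to 0.0.0.0 exposes the notebook on every network interface, so this is best kept to trusted networks.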