Jupyter notebook fails with "Kernel didn't respond"

Time: 2018-11-30 10:30:57

Tags: python jupyter-notebook conda

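I am running into a strange error related to the sequential execution of Jupyter notebooks (Python 3 kernel). The main loop executes a set of notebooks one after another through nbconvert, as follows: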
[...]
from nbconvert.preprocessors import ExecutePreprocessor
[...]
class Report:
    [..]

    def execute_notebook(self, timeout=3600):
        [...]
        notebook = nbformat.read(str(self.notebook_path), as_version=4)
        kernel_name = notebook["metadata"]["kernelspec"]["name"]
        ep = ExecutePreprocessor(timeout=timeout, kernel_name=kernel_name)

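The loop runs once a day. Sometimes it works through all the notebooks without a problem, but at other times it fails while trying to execute the second notebook, with the following traceback: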

Traceback (most recent call last):
  File "/home/data/ds-metrics/scripts/recurring_reports.py", line 52, in main
    report.execute_notebook()
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/site-packages/report/__init__.py", line 49, in execute_notebook
    ep.preprocess(notebook, dict(metadata=dict(path=self.notebook_folder)))
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 359, in preprocess
    with self.setup_preprocessor(nb, resources, km=km):
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 304, in setup_preprocessor
    self.km, self.kc = self.start_new_kernel(cwd=path)
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/site-packages/nbconvert/preprocessors/execute.py", line 258, in start_new_kernel
    kc.wait_for_ready(timeout=self.startup_timeout)
  File "/home/data/miniconda3/envs/analytics/lib/python3.6/site-packages/jupyter_client/blocking/client.py", line 124, in wait_for_ready
    raise RuntimeError("Kernel didn't respond in %d seconds" % timeout)
RuntimeError: Kernel didn't respond in 60 seconds

After a failure, if I run the loop again, it works. It really does "seem" to be random.

The code runs on a Debian server inside a conda virtual environment, with Python 3.6.6 and the following relevant packages:

ipykernel                 5.1.0           py36h24bf2e0_1001    conda-forge
ipython                   7.1.1           py36h24bf2e0_1000    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
jupyter                   1.0.0                      py_1    conda-forge
jupyter_client            5.2.3                      py_1    conda-forge
jupyter_console           6.0.0                      py_0    conda-forge
jupyter_core              4.4.0                      py_0    conda-forge
nbconvert                 5.4.0                         1    conda-forge
nbformat                  4.4.0                      py_1    conda-forge
notebook                  5.7.2                 py36_1000    conda-forge
pexpect                   4.6.0                 py36_1000    conda-forge
python                    3.6.6                h5001a0f_3    conda-forge
pyzmq                     17.1.2           py36hae99301_1    conda-forge
traitlets                 4.3.2                 py36_1000    conda-forge
zeromq                    4.2.5                hfc679d8_6    conda-forge

Thanks for your help.

1 Answer:

Answer 0 (score: 0)

After some digging, I found that the problem can be reduced to the following minimal code:

from nbconvert.preprocessors import ExecutePreprocessor

ep = ExecutePreprocessor(kernel_name="python3")
km, kc = ep.start_new_kernel()
km.shutdown_kernel()

On the cloud server, the script sometimes gets stuck at the following point (seen in strace):

open("/dev/random", O_RDONLY)           = 6
poll([{fd=6, events=POLLIN}], 1, -1

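Apparently the server runs short of entropy between kernel starts: the process blocks reading /dev/random, which leads to the long wait. A similar problem has been pointed out here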
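To check whether low entropy is a plausible cause on a given machine, one option is to read the kernel's entropy estimate just before starting a kernel. The sketch below is not part of the original scripts: it assumes a Linux host that exposes /proc/sys/kernel/random/entropy_avail, and the helper names and the 200-bit threshold are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Standard Linux procfs location of the kernel's entropy estimate (in bits).
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def available_entropy(path=ENTROPY_PATH):
    """Return the entropy estimate in bits, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux host) or content not an integer.
        return None


def entropy_is_low(bits, threshold=200):
    """Hypothetical check; the 200-bit threshold is an arbitrary illustration."""
    return bits is not None and bits < threshold


if __name__ == "__main__":
    bits = available_entropy()
    print("entropy_avail:", bits, "| low:", entropy_is_low(bits))
```

Logging this value right before each kernel start would show whether the 60-second timeouts coincide with a low estimate.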

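I followed the idea from that post, and the script now runs smoothly. My guess is that the original script could also reuse the first kernel instead of starting a new one each time, which would avoid the call to the random number generator.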
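Since the question notes that rerunning the loop after a failure works, a retry around the failing step is one possible stopgap while the entropy problem is being fixed. This is only a sketch under assumptions: `call_with_retry` is a hypothetical helper, not part of nbconvert, and it retries on any `RuntimeError`, which is broader than the kernel start timeout alone.

```python
import time


def call_with_retry(func, attempts=3, delay=5.0, retry_on=(RuntimeError,)):
    """Call func(); if it raises one of retry_on, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)


# Hypothetical usage with the Report class from the question:
#     call_with_retry(report.execute_notebook)
```

Note that a retry re-executes the whole notebook from the start, so it only makes sense when the notebooks are safe to run more than once.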