Cannot reopen Jupyter notebook on a Google Cloud Dataproc cluster after stopping the cluster

Asked: 2017-04-14 14:22:39

Tags: jupyter-notebook jupyter google-cloud-dataproc

I am running a Jupyter notebook on Google Cloud Dataproc (following these instructions: https://cloud.google.com/dataproc/docs/tutorials/jupyter-notebook).

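I ran a notebook, saved it, and at some point stopped the cluster (using the GUI). Later I restarted the cluster and tried to run the Jupyter notebook again following the same instructions, but at the last step, when I try to open Jupyter in Chrome, I get: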

"This site can't be reached. The webpage at http://<my-cluster-name>:8123/ might be temporarily down or it may have moved permanently to a new web address. ERR_SOCKS_CONNECTION_FAILED." 

Also (I don't know if this helps), in the terminal window where I configured the browser, I see this message:

ERROR:child_thread_impl.cc(762)] Request for unknown Channel-associated interface: ui::mojom::GpuMain  
Google Chrome[695:8548] NSWindow warning: adding an unknown subview: <FullSizeContentView: 0x7fdfd3e291e0>. Break on NSLog to debug.  
Google Chrome[695:8548] Call stack:
(
"+callStackSymbols disabled for performance reasons"
)

In the terminal window where I SSH'd into my cluster, I see the following messages:

channel 3: open failed: connect failed: Connection refused  
channel 4: open failed: connect failed: Connection refused  
channel 5: open failed: connect failed: Connection refused    
channel 6: open failed: connect failed: Connection refused   
channel 12: open failed: connect failed: Connection refused   
channel 12: open failed: administratively prohibited: open failed  
channel 13: open failed: administratively prohibited: open failed  
channel 14: open failed: administratively prohibited: open failed  
channel 14: open failed: connect failed: Connection refused  
channel 8: open failed: connect failed: Connection refused  

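Before I stopped the cluster, I could close the Jupyter notebook, disconnect from the cluster, and then reopen the notebook without trouble. The problem only appeared after I stopped the cluster. Any ideas what might be going on?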

2 Answers:

Answer 0 (score: 2)

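This is because the current initialization action explicitly launches the Jupyter notebook service by calling launch-jupyter-kernel.sh. Initialization actions differ from GCE startup scripts in that they are not re-run on boot. The intent is that initialization actions need not be idempotent; if one of them wants its service to come back after a reboot, it has to add init.d/systemd configuration to do so explicitly.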

For a one-off fix, you can SSH into the master node and run:

sudo su
source /etc/profile.d/conda.sh
nohup jupyter notebook --allow-root --no-browser >> /var/log/jupyter_notebook.log 2>&1 &

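If you want this to happen automatically on boot, you can try putting it in a startup script via GCE metadata. If you do that at cluster creation time, though, you need to make sure it does not collide with the Dataproc initialization action. Startup scripts may also run before the Dataproc init actions, so you may simply want to let the first attempt fail silently.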
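A minimal sketch of such a startup script, reusing the commands from the one-off fix. The function name `start_jupyter` and the `CONDA_PROFILE` and `JUPYTER_LOG` variables are hypothetical names introduced here for illustration; the assumption is that a missing conda.sh means the init action has not run yet, in which case the script skips the launch instead of failing.

```shell
#!/bin/sh
# Hypothetical startup script: relaunch Jupyter on every boot.
# CONDA_PROFILE and JUPYTER_LOG are illustrative variables, not Dataproc settings.
CONDA_PROFILE="${CONDA_PROFILE:-/etc/profile.d/conda.sh}"
JUPYTER_LOG="${JUPYTER_LOG:-/var/log/jupyter_notebook.log}"

start_jupyter() {
  # On the very first boot the init action may not have installed conda yet;
  # in that case skip quietly and report success.
  if [ ! -f "$CONDA_PROFILE" ]; then
    echo "conda.sh not found; skipping Jupyter launch"
    return 0
  fi
  . "$CONDA_PROFILE"
  # Same launch command as the one-off fix, run in the background.
  nohup jupyter notebook --allow-root --no-browser >> "$JUPYTER_LOG" 2>&1 &
}

# Never let a failed launch abort the rest of the boot sequence.
start_jupyter || true
```

One way to attach it would be `gcloud compute instances add-metadata <master-instance> --metadata-from-file startup-script=<path-to-script>`, with both placeholders filled in for your cluster.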

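In the long run, we should update the initialization action to add an entry to init.d/systemd so that the init action itself configures automatic restart on reboot. Nobody is specifically working on this at the moment, but if you or anyone you know is up to the task, contributions are always much appreciated; I filed https://github.com/GoogleCloudPlatform/dataproc-initialization-actions/issues/108 to track this feature.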

Answer 1 (score: 2)

I fixed the problem by connecting to the master node over SSH and creating a systemd service (following dennis-huo's suggestion above).

  1. Go to /usr/lib/systemd/system
  2. sudo su
  3. Create a systemd unit file named "jupyter-notebook.service" with the following contents:

    [Unit]
    Description=Start Jupyter Notebook Server at reboot
    
    [Service]
    Type=simple
    ExecStart=/opt/conda/bin/jupyter notebook --allow-root  --no-browser
    
    [Install]
    WantedBy=multi-user.target
    
  4. systemctl daemon-reload
  5. systemctl enable jupyter-notebook.service
  6. systemctl start jupyter-notebook.service

The next step would be to include the above in dataproc-initialization-actions. Hope that helps.
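The manual steps above could be folded into a script along these lines. This is only a sketch, not the actual dataproc-initialization-actions code: the function names `write_jupyter_unit` and `enable_jupyter_unit` and the optional directory argument are hypothetical, while the unit file contents and the systemctl commands are taken from the steps above.

```shell
#!/bin/sh
# Sketch only: function names and the directory argument are hypothetical.

# Write the unit file from step 3 into the given directory
# (defaults to the systemd directory used in step 1).
write_jupyter_unit() {
  unit_dir="${1:-/usr/lib/systemd/system}"
  cat > "${unit_dir}/jupyter-notebook.service" <<'EOF'
[Unit]
Description=Start Jupyter Notebook Server at reboot

[Service]
Type=simple
ExecStart=/opt/conda/bin/jupyter notebook --allow-root --no-browser

[Install]
WantedBy=multi-user.target
EOF
}

# Steps 4 to 6: reload systemd, enable the service at boot, start it now.
enable_jupyter_unit() {
  systemctl daemon-reload
  systemctl enable jupyter-notebook.service
  systemctl start jupyter-notebook.service
}

# On the master node, as root, one would then run:
#   write_jupyter_unit && enable_jupyter_unit
```

Keeping the file-writing step separate from the systemctl calls makes it possible to check the generated unit file in a scratch directory before touching the real system.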