Question

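The problem: I want to interact with Jupyter from another application through the Jupyter API. At a minimum I want to run my notebooks from that application (the ideal variant for me is to edit some paragraphs before running them). I have read the API documentation but have not found what I need.

I have used Apache Zeppelin for this purpose; it has the same structure (notebooks and paragraphs).

Does anyone use Jupyter for the purpose I have just described?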

Answer 1

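Leaving aside whether the Jupyter API is the best solution for this problem (the question does not describe it clearly), the code below does what you ask for: it executes a Jupyter notebook remotely over HTTP and gets some results back. It is not production ready; it is an example of how this can be done. I have not tested it with cells that generate a lot of output, and I think it would need adjustments for that.

You can also change/edit the code programmatically by altering the code array.

You will need to change notebook_path, base and headers according to your configuration; see the code for details.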

import json
import requests
import datetime
import uuid
from pprint import pprint
from websocket import create_connection

# The token is written on stdout when you start the notebook
notebook_path = '/Untitled.ipynb'
base = 'http://localhost:9999'
headers = {'Authorization': 'Token 4a72cb6f71e0f05a6aa931a5e0ec70109099ed0c35f1d840'}

url = base + '/api/kernels'
response = requests.post(url,headers=headers)
kernel = json.loads(response.text)

# Load the notebook and get the code of each cell
url = base + '/api/contents' + notebook_path
response = requests.get(url,headers=headers)
file = json.loads(response.text)
code = [ c['source'] for c in file['content']['cells'] if len(c['source'])>0 ]

# Execution request/reply is done on websockets channels
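# Derive the websocket URL from base so that only base needs to be configured
ws_url = base.replace('http', 'ws', 1) + '/api/kernels/' + kernel['id'] + '/channels'
ws = create_connection(ws_url, header=headers)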

def send_execute_request(code):
    # Build (but do not send) an execute_request message for one cell
    msg_type = 'execute_request'
    content = { 'code' : code, 'silent':False }
    hdr = { 'msg_id' : uuid.uuid1().hex, 
        'username': 'test', 
        'session': uuid.uuid1().hex, 
        'date': datetime.datetime.now().isoformat(),
        'msg_type': msg_type,
        'version' : '5.0' }
    # A fresh request has no parent, so parent_header is left empty
    msg = { 'header': hdr, 'parent_header': {}, 
        'metadata': {},
        'content': content }
    return msg

for c in code:
    ws.send(json.dumps(send_execute_request(c)))

# We ignore all the other messages, we just get the code execution output
# (this needs to be improved for production to take into account errors, large cell output, images, etc.)
for _ in code:
    msg_type = ''
    while msg_type != "stream":
        rsp = json.loads(ws.recv())
        msg_type = rsp["msg_type"]
    print(rsp["content"]["text"])

ws.close()
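
The comment above notes that the receive loop needs to handle errors and other message types. A minimal sketch of one way to do that is shown here, assuming the message layout of the Jupyter messaging protocol (a `header` with `msg_type`, a `parent_header` carrying the request's `msg_id`, and a `status` message with `execution_state` set to `idle` when the request is finished). The helper name `collect_outputs` is hypothetical and not part of any Jupyter library.

```python
def collect_outputs(messages, msg_id):
    """Gather text output for one request from an iterable of decoded messages.

    `messages` is any iterable of dicts, e.g. json.loads(ws.recv()) results.
    `msg_id` is the header msg_id of the execute_request we are waiting on.
    """
    outputs = []
    for msg in messages:
        # Skip messages that belong to a different request.
        if msg.get("parent_header", {}).get("msg_id") != msg_id:
            continue
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]
        if msg_type == "stream":
            # stdout / stderr text
            outputs.append(content["text"])
        elif msg_type == "execute_result":
            # value of the last expression in the cell
            outputs.append(content["data"].get("text/plain", ""))
        elif msg_type == "error":
            # exception raised by the cell
            outputs.append(content["ename"] + ": " + content["evalue"])
        elif msg_type == "status" and content["execution_state"] == "idle":
            # the kernel has finished processing this request
            break
    return outputs
```

To use something like this with the code above, the `msg_id` of each request would have to be kept when the message is built, and the messages could come from a generator that repeatedly yields `json.loads(ws.recv())`. Because the loop stops on the idle status rather than on the first stream message, a cell that prints nothing would no longer block forever.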

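Useful links on which this code is based (recommended reading if you want more information)

Note that there is also https://jupyter-client.readthedocs.io/en/stable/index.html, but as far as I can tell it does not support HTTP as a transport.

For reference, this works with notebook 5.7.4; I am not sure about other versions.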
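
The answer mentions that the code array can be changed programmatically before it is sent. As a rough illustration only, the sketch below rewrites simple `name = value` lines in a list of cell sources. The helper name `override_assignments` and the idea of treating such lines as parameters are assumptions made for this example, not something the Jupyter API provides.

```python
import re


def override_assignments(cells, overrides):
    """Return a copy of `cells` with matching `name = ...` lines replaced.

    `cells` is a list of cell source strings (like the `code` list above).
    `overrides` maps variable names to the new Python values.
    Only unindented, single-line assignments are recognised.
    """
    patched = []
    for source in cells:
        lines = []
        for line in source.split("\n"):
            # Match `name =` at the start of a line, but not `name ==`.
            match = re.match(r"^(\w+)\s*=(?!=)", line)
            if match and match.group(1) in overrides:
                name = match.group(1)
                line = "{} = {!r}".format(name, overrides[name])
            lines.append(line)
        patched.append("\n".join(lines))
    return patched
```

With the code from this answer, it would be called between loading the notebook and sending the requests, for example `code = override_assignments(code, {"n": 3})`. This is deliberately naive: it does not parse Python, so multi-line or indented assignments are left untouched.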

Answer 2

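I believe that using a remote Jupyter Notebook is over-engineering in your case.

A good approach, as I see it, is to pass the necessary parameters to a Python program that has good logging.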
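
To make that suggestion concrete, here is a minimal sketch of a script that takes its parameters on the command line and logs what it does, using only the standard library. The parameter names `--input` and `--threshold` and the logger name `job` are hypothetical placeholders; the actual work of the program is left out.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse command-line parameters (the names here are illustrative)."""
    parser = argparse.ArgumentParser(description="Run a job with explicit parameters.")
    parser.add_argument("--input", required=True, help="path to the input data")
    parser.add_argument("--threshold", type=float, default=0.5, help="numeric parameter")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    log = logging.getLogger("job")
    log.info("starting with input=%s threshold=%s", args.input, args.threshold)
    # The real work would go here; this sketch only echoes its parameters.
    result = {"input": args.input, "threshold": args.threshold}
    log.info("finished")
    return result
```

The calling application could then start the script as a subprocess with the parameters it needs, or import `main` and call it directly with a list such as `["--input", "data.csv"]`.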

Answer 3

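I don't have enough reputation to comment, but (for me) the accepted answer fails (gets stuck in an infinite loop) if the notebook contains markdown cells. Changing the code from

code = [ c['source'] for c in file['content']['cells'] if len(c['source'])>0 ]

to

code = [ c['source'] for c in file['content']['cells'] if len(c['source'])>0 and c['cell_type']=='code' ]

fixed this for me.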
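
The same filter can be written as a small helper, sketched here under the assumption that the notebook follows the usual nbformat layout (a `cells` list whose items have `cell_type` and `source`, where `source` may be a string or a list of strings). The function name `code_sources` is hypothetical.

```python
def code_sources(notebook):
    """Return the source of every non-empty code cell in a notebook dict."""
    sources = []
    for cell in notebook.get("cells", []):
        # Markdown and raw cells are not executable, so skip them.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # On disk, nbformat may store the source as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            sources.append(source)
    return sources
```

With the accepted answer's variables, this would be called as `code = code_sources(file['content'])`. The infinite loop presumably occurs because a markdown cell sent as code never produces a stream message, so the receive loop keeps waiting.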

Interacting with Jupyter Notebook through the API
