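I am trying to use Bokeh to interactively select a region of data in a Jupyter notebook. Once selected, the data will be manipulated further with Python in subsequent cells of the notebook.
The following code generates a plot in the Jupyter notebook. The user can then select a region of data with the LassoSelectTool (or another selection tool).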
import numpy as np
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource
# Direct output to this notebook
output_notebook()
n = 100
x = np.random.random(size=n) * 100
y = np.random.random(size=n) * 100
source = ColumnDataSource(data=dict(x=x, y=y))
figkwds = dict(plot_width=400, plot_height=300, webgl=True,
               tools="pan,lasso_select,box_select,help",
               active_drag="lasso_select")
p1 = figure(**figkwds)
p1.scatter('x', 'y', source=source, alpha=0.8)
show(p1)
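How can I access the selected data in subsequent Jupyter cells? The documentation suggests using CustomJS to interact with selections, but I have only been able to use it to update other Bokeh plots. I don't know how to get the selected data out of the plot for more rigorous manipulation.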
Answer 0 (score: 1)
The answer appears to be to use kernel.execute to call back from the Bokeh JavaScript into the Jupyter kernel. I developed the following code based on a GitHub issue:
import numpy as np
from bokeh.plotting import gridplot, figure, show
from bokeh.io import output_notebook, push_notebook
from bokeh.models import ColumnDataSource, CustomJS
# Direct output to this notebook
output_notebook()
# Create some data
n = 100
source = ColumnDataSource(data=dict(
    x=np.random.random(size=n) * 100,
    y=np.random.random(size=n) * 100)
)
model = ColumnDataSource(data=dict(
    x=[],
    y_obs=[],
    y_pred=[],
))
# Create a callback with a kernel.execute to return to Jupyter
source.callback = CustomJS(code="""
    // Define a callback to capture errors on the Python side
    function callback(msg){
        console.log("Python callback returned unexpected message:", msg)
    }
    var callbacks = {iopub: {output: callback}};
    // Collect the x and y values of the selected points
    var inds = cb_obj.selected['1d'].indices;
    var d1 = cb_obj.data;
    var x = []
    var y = []
    for (var i = 0; i < inds.length; i++) {
        x.push(d1['x'][inds[i]])
        y.push(d1['y'][inds[i]])
    }
    // Generate a command to execute in Python
    var data = {
        'x': x,
        'y': y,
    }
    var data_str = JSON.stringify(data)
    var cmd = "saved_selected(" + data_str + ")"
    // Execute the command on the Python kernel
    var kernel = IPython.notebook.kernel;
    kernel.execute(cmd, callbacks, {silent : false});
""")
selected = dict()
def saved_selected(values):
    x = np.array(values['x'])
    y_obs = np.array(values['y'])
    # Sort by increasing x
    sorted_indices = x.argsort()
    x = x[sorted_indices]
    y_obs = y_obs[sorted_indices]
    if len(x) > 2:
        # Do a simple linear model
        A = np.vstack([x, np.ones(len(x))]).T
        # rcond=None avoids the FutureWarning raised by newer NumPy versions
        m, c = np.linalg.lstsq(A, y_obs, rcond=None)[0]
        y_pred = m * x + c
        data = {'x': x, 'y_obs': y_obs, 'y_pred': y_pred}
        model.data.update(data)
        # Update the selected dict for further manipulation
        selected.update(data)
        # Update the drawing
        push_notebook(handle=handle)
figkwds = dict(plot_width=500, plot_height=300,  # webgl=True,
               x_axis_label='X', y_axis_label='Y',
               tools="pan,lasso_select,box_select,reset,help")
p1 = figure(active_drag="lasso_select", **figkwds)
p1.scatter('x', 'y', source=source, alpha=0.8)
p2 = figure(**figkwds,
            x_axis_type='log', x_range=[1, 100],
            y_axis_type='log', y_range=[1, 100])
p2.scatter('x', 'y', source=source, alpha=0.8)
p3 = figure(plot_width=500, plot_height=300,  # webgl=True,
            x_axis_label='X', y_axis_label='Y',
            tools="pan,reset,help")
p3.scatter('x', 'y', source=source, alpha=0.6)
p3.scatter('x', 'y_obs', source=model, alpha=0.8, color='red')
p3.line('x', 'y_pred', source=model)
layout = gridplot([[p1], [p2], [p3]])
handle = show(layout, notebook_handle=True)
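The key step in the callback is that the string passed to kernel.execute is run as ordinary Python source in the notebook's namespace. The sketch below imitates that step without a browser or Bokeh, as a rough illustration only: json.dumps stands in for JSON.stringify, exec stands in for the kernel, and the handler here is a simplified stand-in that just stores what it receives.

```python
import json

received = {}

def saved_selected(values):
    # Simplified stand-in for the notebook-side handler: keep what was sent.
    received.update(values)

# What the browser would build: the JSON text spliced into a function call.
payload = {"x": [1.5, 2.0], "y": [10.0, 20.5]}
cmd = "saved_selected(" + json.dumps(payload) + ")"

# kernel.execute would run this string in the notebook namespace;
# exec() with an explicit namespace is used here only to imitate that.
exec(cmd, {"saved_selected": saved_selected})

print(received)
```

This works because a JSON object made of numbers and arrays is also a valid Python dict literal. JSON values such as null, true, or false are not Python names, so a payload containing them would fail with a NameError; passing the JSON as a quoted string and decoding it with json.loads on the Python side would be one way around that.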
This code is currently available as a Jupyter notebook here: http://nbviewer.jupyter.org/github/arkottke/notebooks/blob/0b9fc5bac0de573005c84f6c2493bb4da59a103f/bokeh_selector_example.ipynb
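Once a selection has been made, the selected dict filled by saved_selected can be read from any later cell like a normal Python object. The sketch below shows one possible follow-up computation. The values are made-up stand-ins, since in the notebook the dict would hold NumPy arrays under the same keys, and the residual and RMSE calculation is only an example of further manipulation, not part of the original answer.

```python
# Stand-in for the dict populated by saved_selected(); in the notebook the
# values would be NumPy arrays under the same keys. These numbers are made up.
selected = {
    "x": [1.0, 2.0, 3.0],
    "y_obs": [2.0, 4.5, 6.0],
    "y_pred": [2.1, 4.1, 6.1],
}

if selected:
    # Residuals between the selected observations and the fitted line
    residuals = [obs - pred
                 for obs, pred in zip(selected["y_obs"], selected["y_pred"])]
    # Root-mean-square error of the fit over the selected points
    rmse = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    print(len(residuals), "points selected, RMSE =", round(rmse, 4))
else:
    # The dict stays empty until a selection of more than two points is made
    print("Nothing selected yet")
```

The emptiness check matters because the dict is only updated from the browser callback, so a cell run before any selection would otherwise fail with a KeyError.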