How to link a plot of summary statistics to a plot of the raw data with Bokeh

Date: 2016-05-06 13:00:38

Tags: python bokeh jupyter-notebook

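I have a dataset for which I am plotting summary statistics, and I would like to be able to select an interesting point and plot the underlying raw data in a separate subplot. The code below can be run from a Jupyter notebook to see what the output should look like. Ideally, I could click the point near (1, 10) in the first plot and see the raw data clustered around 10 in the second plot.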

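The commented-out code near the bottom is my attempt at defining a callback, but I think I have a fundamental misunderstanding of how TapTool and callbacks are supposed to work. How do I tell Bokeh that it should call a specific Python routine with specific arguments (based on the point I clicked) and then redraw the second figure?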

from string import ascii_lowercase

import numpy as np
import pandas as pd

from bokeh.plotting import figure, output_notebook, show, gridplot
from bokeh.models import ColumnDataSource, widgets, HoverTool, TapTool

output_notebook()

class RawPlot():
    def __init__(self, fig, col, raw):
        self.fig = fig
        self.raw = raw
        self.circ = self.fig.circle(x='index', y='raw', size=1,
                                    source=ColumnDataSource(raw[col].reset_index()))
    # ideally I would have a callback to RawPlot.update to change the underlying data
    def update(self, col):
        self.circ.data_source = ColumnDataSource(self.raw[col].reset_index())

# generate the example data
rawdat = pd.DataFrame(np.random.random((1024, 4)) + np.arange(4)*10, 
                      columns=pd.MultiIndex.from_tuples([(s, 'raw') for s in ascii_lowercase[:4]]))
# compute summary statistics and show in a bokeh figure
stats = rawdat.describe().T.reset_index()
pstat = figure()
pstat.circle(x='index', y='mean', source=ColumnDataSource(stats), size=12)

# show the raw data of the first column
praw = figure()
rawplot = RawPlot(praw, 'a', rawdat)
# this was my attempt at being able to change which column's raw data was plotted. It failed
# taptool = pstat.select(type=TapTool)
# taptool.callback = rawplot.update("@level_0")
show(gridplot([[pstat, praw]]))

1 answer:

Answer 0 (score: 0)

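The code below handles the interaction the way I wanted. It runs with the Bokeh server, which is sufficient for the test case. Running the code from a notebook does not give the same interactivity (selecting data in pstat does not update the data in praw). For that reason, output_notebook and push_notebook have been commented out.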

from string import ascii_lowercase

import numpy as np
import pandas as pd

from bokeh.plotting import figure, show, gridplot
from bokeh.models import ColumnDataSource, widgets, HoverTool, TapTool, BoxSelectTool
from bokeh.io import output_notebook, push_notebook, output_server
from bokeh.resources import INLINE

# generate the example data
rawdat = pd.DataFrame(np.random.random((1024, 4)) + np.arange(4)*10, 
                      columns=pd.MultiIndex.from_tuples([(s, 'raw') for s in ascii_lowercase[:4]]))
# compute summary statistics and show in a bokeh figure
stats = rawdat.describe().T.reset_index()
statsource = ColumnDataSource(stats[['mean']])

TOOLS = [TapTool(), BoxSelectTool()]
pstat = figure(tools=TOOLS)
pstat.circle(x='index', y='mean', source=statsource, size=12)

# show the raw data of the first column
praw = figure()
col = rawdat.columns.levels[0][0]
rawsource = ColumnDataSource(rawdat[col].reset_index())
circ = praw.circle(x='index', y='raw', size=1,
                   source=rawsource)

# update the raw data_source when a new summary is chosen
def update(attr, old, new):
    ind = new['1d']['indices'][0]
    col = rawdat.columns.levels[0][ind]
    rawsource.data['raw'] = rawdat[col].values.ravel()
    # push_notebook()
statsource.on_change('selected', update)

# serve the figures
output_server("foo")
# output_notebook()
show(gridplot([[pstat, praw]]))
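
One caveat with the update callback above: new['1d']['indices'] may be empty when the selection is cleared (for example, by tapping on empty space), in which case indexing it with [0] raises an IndexError. Below is a minimal sketch of a guard, assuming the selection dictionary has the {'1d': {'indices': [...]}} shape that the callback reads. The helper names first_selected_index and pick_column are hypothetical and not part of Bokeh, and the columns list is a stand-in for rawdat.columns.levels[0].

```python
def first_selected_index(selection):
    """Return the first selected row index, or None if nothing is selected.

    Assumes `selection` looks like {'1d': {'indices': [...]}}, the shape the
    answer's callback reads. Hypothetical helper, not a Bokeh API.
    """
    indices = selection.get('1d', {}).get('indices', [])
    return indices[0] if indices else None


def pick_column(selection, columns):
    """Map a selection onto one of `columns`, or None if it cannot be mapped."""
    ind = first_selected_index(selection)
    if ind is None or not 0 <= ind < len(columns):
        return None
    return columns[ind]


# Stand-in for rawdat.columns.levels[0] in the answer's code.
columns = ['a', 'b', 'c', 'd']
print(pick_column({'1d': {'indices': [2]}}, columns))  # c
print(pick_column({'1d': {'indices': []}}, columns))   # None
```

Inside update, one could return early when pick_column gives None, leaving rawsource untouched until a point is actually selected.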