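How do I use Bokeh vbar chart parameters with a groupby object?

Time: 2017-09-21 12:02:30

Tags: python bar-chart data-visualization bokeh

Question

The code below is the vbar chart example from the Bokeh documentation. There are two things in this example that I don't understand.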

  
      
  1. Where does 'cyl_mfr' in factor_cmap() and vbar() come from?

  2. Is 'mpg_mean' the computed mean of the 'mpg' column? If so, why doesn't 'mpg_sum' work?

I would like to create my own vbar chart modeled on this example.

Code

from bokeh.io import show, output_file
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.palettes import Spectral5
from bokeh.sampledata.autompg import autompg_clean as df
from bokeh.transform import factor_cmap

output_file("bars.html")

df.cyl = df.cyl.astype(str)
df.yr = df.yr.astype(str)

group = df.groupby(('cyl', 'mfr'))

source = ColumnDataSource(group)
index_cmap = factor_cmap('cyl_mfr', palette=Spectral5,
                         factors=sorted(df.cyl.unique()), end=1)

p = figure(plot_width=800, plot_height=300,
           title="Mean MPG by # Cylinders and Manufacturer",
           x_range=group, toolbar_location=None, tools="")

p.vbar(x='cyl_mfr', top='mpg_mean', width=1, source=source,
       line_color="white", fill_color=index_cmap, )

p.y_range.start = 0
p.x_range.range_padding = 0.05
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Manufacturer grouped by # Cylinders"
p.xaxis.major_label_orientation = 1.2
p.outline_line_color = None

p.add_tools(HoverTool(tooltips=[("MPG", "@mpg_mean"),
                                ("Cyl, Mfr", "@cyl_mfr")]))

show(p)

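1 Answer:

Answer 0 (score: 0)

group = df.groupby(('cyl', 'mfr')) produces a <pandas.core.groupby.DataFrameGroupBy object at 0x0xxx>. If you pass it to ColumnDataSource, Bokeh does a lot of magic for you and precomputes a set of summary statistics for each column: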

df.columns
Index(['mpg', 'cyl', 'displ', 'hp', 'weight', 'accel', 'yr', 'origin', 'name', 'mfr'],
source.column_names
  

['accel_count', 'accel_mean', 'accel_std', 'accel_min', 'accel_25%', 'accel_50%', 'accel_75%', 'accel_max',
 'displ_count', 'displ_mean', 'displ_std', 'displ_min', 'displ_25%', 'displ_50%', 'displ_75%', 'displ_max',
 'hp_count', 'hp_mean', 'hp_std', 'hp_min', 'hp_25%', 'hp_50%', 'hp_75%', 'hp_max',
 'mpg_count', 'mpg_mean', 'mpg_std', 'mpg_min', 'mpg_25%', 'mpg_50%', 'mpg_75%', 'mpg_max',
 'weight_count', 'weight_mean', 'weight_std', 'weight_min', 'weight_25%', 'weight_50%', 'weight_75%', 'weight_max',
 'yr_count', 'yr_mean', 'yr_std', 'yr_min', 'yr_25%', 'yr_50%', 'yr_75%', 'yr_max',
 'cyl_mfr']
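These names match what pandas produces when you call describe() on the group and flatten the result. The sketch below approximates that behavior using pandas alone; it is not Bokeh's actual code. The small DataFrame is a made-up stand-in for autompg_clean, and the keys are passed as a list because recent pandas versions treat a tuple as a single key.

```python
import pandas as pd

# Hypothetical stand-in for autompg_clean (values invented for illustration)
df = pd.DataFrame({
    "cyl": ["4", "4", "6", "6"],
    "mfr": ["ford", "ford", "ford", "honda"],
    "mpg": [30.0, 32.0, 20.0, 24.0],
})

# A list of keys; a tuple is read as one single key by recent pandas
group = df.groupby(["cyl", "mfr"])

# describe() returns count, mean, std, min, 25%, 50%, 75%, max per column,
# stored under two-level column labels such as ('mpg', 'mean')
stats = group.describe()

# Flatten ('mpg', 'mean') into 'mpg_mean'
stats.columns = ["_".join(col) for col in stats.columns]

# Join the names of the grouping columns the same way
index_name = "_".join(stats.index.names)

print(index_name)                     # cyl_mfr
print("mpg_mean" in stats.columns)    # True
print("mpg_sum" in stats.columns)     # False: describe() has no sum
```

Since describe() never reports a sum, no *_sum column appears in the list above.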

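  1. cyl_mfr is the label formed by joining the names of the two columns you grouped by. In source, it becomes a column of tuples.

  2. mpg_sum is not computed. If you need the sum, you have to calculate it yourself.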
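One possible way to calculate the sum yourself is to aggregate with pandas and build the column data by hand. This is a minimal sketch, not the approach used in the Bokeh example: the DataFrame is a made-up stand-in for autompg_clean, the variable names summary and data are illustrative, and the named-aggregation syntax assumes pandas 0.25 or newer.

```python
import pandas as pd

# Hypothetical stand-in for autompg_clean (values invented for illustration)
df = pd.DataFrame({
    "cyl": ["4", "4", "6", "6"],
    "mfr": ["ford", "ford", "ford", "honda"],
    "mpg": [30.0, 32.0, 20.0, 24.0],
})

# Named aggregation: each keyword becomes an output column
summary = df.groupby(["cyl", "mfr"]).agg(
    mpg_mean=("mpg", "mean"),
    mpg_sum=("mpg", "sum"),
)

# Plain dict of equal-length lists; 'cyl_mfr' holds (cyl, mfr) tuples,
# mirroring the tuple column described in point 1
data = {
    "cyl_mfr": summary.index.tolist(),
    "mpg_mean": summary["mpg_mean"].tolist(),
    "mpg_sum": summary["mpg_sum"].tolist(),
}

print(data["cyl_mfr"])
print(data["mpg_sum"])
```

A dict like this should be usable as ColumnDataSource(data), with top='mpg_sum' in vbar() and x_range=FactorRange(*data['cyl_mfr']) in place of x_range=group, though that wiring is untested here.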