Using groupby and subplots with a pandas DataFrame

Time: 2016-06-07 01:28:47

Tags: python pandas group-by boxplot subplot

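I have a DataFrame object containing time series data in multiple columns (see below). I am trying to create a figure with one subplot per column of the DataFrame, where each subplot contains 12 box plots, one for each month.

I previously used the following code to make subplots from the DataFrame (but with bar plots rather than box plots):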

import matplotlib.pyplot as plt
import pandas as pd

labels = df.columns.values
fig, axes = plt.subplots(nrows=3, ncols=4, gridspec_kw=dict(hspace=0.3), figsize=(12, 9), sharex=True, sharey=True)
targets = zip(labels, axes.flatten())
for col, ax in targets:
    # one bar chart per column, each drawn on its own subplot
    df[[col]].plot(kind='bar', ax=ax, color='green')

But when I use a groupby object in place of the DataFrame, it does not work:

grouped = df.groupby(df.index.month)
labels = df.columns.values
fig, axes = plt.subplots(nrows=3, ncols=4)
targets = zip(labels, axes.flatten())
for col, ax in targets:
    # this is the line that fails: grouped[col] is a SeriesGroupBy
    grouped[col].boxplot(ax=ax, color='green', subplots=False)

The problem is that boxplot cannot be called on a 'SeriesGroupBy' object.

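But even if I use df.plot.box(by=df.index.month) or df.boxplot(by=df.index.month) directly in the plotting loop (instead of first creating the grouped object separately), the grouping does not seem to be recognized.

Does anyone have a suggestion? Thanks!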

EDIT: sample data:

               res01      res02      res03     res04      res05      res06
1981-01-31 -16.571927  -4.051575  -8.865433 -0.858423  41.831455 -14.569453   
1981-02-28 -14.672908  -2.004894  -6.151469 -0.448101 -30.476155 -13.572198   
1981-03-31 -10.588504  -1.079251  -3.057215 -0.897639 -19.407469  -6.936018   
1981-04-30 -18.132814  -1.438858   0.028866  0.388591 -24.435158  -8.880159   
1981-05-31  -8.190266  -2.175105  -4.326701 -1.089722 -13.286928 -13.530322   
1981-06-30  -7.857190  -2.861348  -5.046409 -0.013585 -17.134277 -18.153491   
1981-07-31  -0.882391  -4.497572  -9.914211 -1.115400 -27.628329 -33.412025   
1981-08-31  12.876021  -4.969259 -11.849937 -1.205588 -29.825922 -36.093600   
1981-09-30 -43.434015  -8.681070 -14.143496 -4.701924 -32.357578 -25.945754   
1981-10-31  38.656449   3.055204   3.088694  1.425666  12.881002  -7.261655   
1981-11-30  -3.455937  -2.136963  -4.393510  0.472263  10.560834 -11.224297   
1981-12-31  -2.923868  -2.006733  -1.667986 -0.460742  -8.663085 -12.022059   
1982-01-31  19.625548  -2.127550  -4.044511 -0.447382  27.524403  -8.551865   
1982-02-28 -12.424200  -1.931246  -6.055349 -0.448398 -29.979264 -13.166926   
1982-03-31  35.249772  -2.416680  -6.029210 -0.661215 -47.206552 -24.267880   
1982-04-30 -55.008877  -7.160744  -9.331341 -1.040474 -42.029073 -32.618620   
1982-05-31 -17.349030  -3.067463  -6.511664 -0.892260 -40.803273 -29.355429   
1982-06-30  -5.710025  -2.519162 -15.885825 -1.664557 -36.476341 -43.840351   
1982-07-31 -30.790685  -8.042895 -12.381517 -1.339010 -38.542642 -53.612233   
1982-08-31   4.263036   1.270455 -13.225027 -1.431894 -29.160338 -36.575128   
1982-09-30 -17.206044 -14.336086 -13.276423 -1.316164 -32.316961 -43.796818   
1982-10-31  -5.164960  -6.247522 -12.369959 -1.045498  12.716187 -29.489328   
1982-11-30 -25.543948  -2.648465  -5.598642 -0.554379  12.033847 -12.507718   
1982-12-31  -2.971802  -1.982072  -1.225803 -0.335575  -7.452425 -10.182204   
1983-01-31  29.917477  -3.224031  -7.680435 -0.701457  43.068696 -11.812835   
1983-02-28   4.998955  -3.281333 -12.630952 -0.867328 -47.758882 -30.902821   
1983-03-31 -21.483914  -3.219957  -7.321552 -0.756839 -50.798885 -29.858194   
1983-04-30 -23.288018  -2.411159  -5.212307 -0.626141 -49.477692 -22.813129   
1983-05-31   0.317828  -3.181573  -6.915676 -0.855810 -21.701865 -23.165239   
1983-06-30 -23.914567  -7.788987 -18.696691 -2.082176 -35.968441 -50.015002   
1983-07-31 -21.452370  -6.447321 -14.399266 -1.514856 -35.645412 -49.081801   
1983-08-31 -14.721837  -7.266818 -14.439923 -1.499819 -47.237557 -52.978016   
1983-09-30 -18.532760  -3.905781  -7.398113 -0.729630 -16.512127 -23.390976   
1983-10-31  62.864704  -5.903833 -13.910222 -1.143347  21.336868 -26.468803   
1983-11-30 -11.050188  -5.180171 -12.654286 -1.186503  24.885744 -22.581720   
1983-12-31  -9.576725  -6.114298  -7.761357 -1.048323 -23.590444 -37.646843   

2 answers:

Answer 0 (score: 1)

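AFAIK, if you group your DF, you can either apply an aggregating (reducing) function or call .groups, which returns a dict mapping each group key to the index labels belonging to that key.

So if you just want the 12 box plots, IIUC you can try something like this (using the seaborn module, imported as sns):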

ax = sns.boxplot(data=df, x=df.index.month, y='res01')
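
To illustrate the remark about aggregation and .groups, here is a minimal sketch. It is not part of the original answer, and it assumes synthetic random data (3 years of monthly values, indexed on month-start dates for simplicity) in place of the question's DataFrame:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (an assumption, not the real values).
idx = pd.date_range("1981-01-01", periods=36, freq="MS")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(36, 2)), index=idx, columns=["res01", "res02"])

grouped = df.groupby(df.index.month)

# .groups maps each group key (the month number) to the index labels in that group.
for month, labels in grouped.groups.items():
    print(month, list(labels.strftime("%Y-%m")))

# An aggregation collapses every group to a single row, so the result
# no longer holds the raw values that a box plot needs.
monthly_mean = grouped.mean()
print(monthly_mean.shape)
```

Neither the dict of labels nor the aggregated frame is something a box plot can be drawn from directly, which is why the plotting call here takes the month as the x variable instead.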

Subplots:

labels = df.columns.values
fig, axes = plt.subplots(nrows = 3, ncols = 4, gridspec_kw =  dict(hspace=0.3),figsize=(12,9), sharex = True, sharey=True)
targets = zip(labels, axes.flatten())
for i, (col,ax) in enumerate(targets):
    sns.boxplot(data=df, ax=ax, color='green', x=df.index.month, y=col)
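
If you would rather keep the groupby object from the question, one possible workaround (not from the original answer) is to iterate over each SeriesGroupBy yourself and pass the per-month values to matplotlib's own Axes.boxplot. A rough sketch, assuming synthetic data with six columns and a 2x3 grid:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (an assumption, not the real values).
idx = pd.date_range("1981-01-01", periods=36, freq="MS")
rng = np.random.default_rng(0)
cols = [f"res{i:02d}" for i in range(1, 7)]
df = pd.DataFrame(rng.normal(size=(36, 6)), index=idx, columns=cols)

fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 8), sharex=True, sharey=True)
for col, ax in zip(df.columns, axes.flatten()):
    by_month = df[col].groupby(df.index.month)
    # Iterating a SeriesGroupBy yields (key, Series) pairs, one per month.
    months = [int(m) for m, _ in by_month]
    samples = [s.to_numpy() for _, s in by_month]
    # One box per month, placed at the month number on the x axis.
    ax.boxplot(samples, positions=months)
    ax.set_title(col)
```

This sidesteps the missing boxplot method on SeriesGroupBy, at the cost of losing seaborn's styling.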

PS: I'm not sure I understood your goal correctly.

Answer 1 (score: 1)

If seaborn is not an option, creating an extra "month" column and grouping by it will work:

df['month'] = df.index.month
df.boxplot(by='month', figsize=(12,8));
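
A self-contained variant of the same idea, as a sketch only: it assumes synthetic data in place of the question's DataFrame, uses assign() so the original frame is left without the extra column, and passes the layout parameter of DataFrame.boxplot to arrange the six subplots in a 2x3 grid.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's data (an assumption, not the real values).
idx = pd.date_range("1981-01-01", periods=36, freq="MS")
rng = np.random.default_rng(0)
cols = [f"res{i:02d}" for i in range(1, 7)]
df = pd.DataFrame(rng.normal(size=(36, 6)), index=idx, columns=cols)

# assign() returns a copy, so df itself does not gain a 'month' column.
with_month = df.assign(month=df.index.month)

# One subplot per numeric column, each holding one box per month.
axes = with_month.boxplot(by="month", layout=(2, 3), figsize=(12, 8))
```

The call returns the created axes, so titles or labels can still be adjusted per subplot afterwards.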
