Creating a boxplot from a DataFrame with row and column MultiIndex

Time: 2016-11-13 04:46:17

Tags: pandas matplotlib

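I have the following Pandas DataFrame, and I'm trying to create a boxplot of the "dur" values for both client and server, organized by qdepth (qdepth on the x-axis, duration on the y-axis, with two variables, client and server). It seems like I need to get client and server as columns. I haven't been able to figure this out trying combinations of unstack and reset_index.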

[image: example]

1 Answer:

Answer 0: (score: 4)

Here is some dummy data I recreated, since you didn't post your data other than as an image:

qdepth,mode,runid,dur
1,client,0x1b7bd6ef955979b6e4c109b47690c862,7.0
1,client,0x45654ba030787e511a7f0f0be2db21d1,30.0
1,server,0xb760550f302d824630f930e3487b4444,19.0
1,server,0x7a044242aec034c44e01f1f339610916,95.0
2,client,0x51c88822b28dfa006bf38603d74f9911,15.0
2,client,0xd5a9028fddf9a400fd8513edbdc58de0,49.0
2,server,0x3943710e587e3932adda1cad8eaf2aeb,30.0
2,server,0xd67650fd984a48f2070de426e0a942b0,93.0

Load the data (copy the rows above to the clipboard first): df = pd.read_clipboard(sep=',', index_col=[0,1,2])
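If the clipboard is not convenient (for example in a script), the same dummy rows can be loaded from a string instead. This is only a sketch of an equivalent load; the variable name csv_text is my own and not part of the original answer.

```python
import io

import pandas as pd

# The same dummy rows as above, embedded as a string so no clipboard is needed.
csv_text = """qdepth,mode,runid,dur
1,client,0x1b7bd6ef955979b6e4c109b47690c862,7.0
1,client,0x45654ba030787e511a7f0f0be2db21d1,30.0
1,server,0xb760550f302d824630f930e3487b4444,19.0
1,server,0x7a044242aec034c44e01f1f339610916,95.0
2,client,0x51c88822b28dfa006bf38603d74f9911,15.0
2,client,0xd5a9028fddf9a400fd8513edbdc58de0,49.0
2,server,0x3943710e587e3932adda1cad8eaf2aeb,30.0
2,server,0xd67650fd984a48f2070de426e0a942b0,93.0
"""

# The first three columns become the row MultiIndex (qdepth, mode, runid),
# leaving "dur" as the only data column.
df = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1, 2])

print(df.index.names)
print(df.columns.tolist())
```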

Option 1:

df.unstack(level=1).boxplot()
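To see why this works, it can help to look at what unstack(level=1) produces before plotting. The sketch below rebuilds the dummy data with shortened, made-up runid values (r1 to r8, an assumption for brevity) and inspects the result. Note that this option pools both qdepth values, so it compares client against server overall rather than per qdepth.

```python
import pandas as pd

# Dummy data as above; runid values are shortened placeholders (assumption).
rows = [
    (1, "client", "r1", 7.0), (1, "client", "r2", 30.0),
    (1, "server", "r3", 19.0), (1, "server", "r4", 95.0),
    (2, "client", "r5", 15.0), (2, "client", "r6", 49.0),
    (2, "server", "r7", 30.0), (2, "server", "r8", 93.0),
]
df = pd.DataFrame(rows, columns=["qdepth", "mode", "runid", "dur"])
df = df.set_index(["qdepth", "mode", "runid"])

# Move index level 1 ("mode") into the columns: one column per mode.
wide = df.unstack(level=1)

# Columns are now a MultiIndex of (value column, mode) pairs.
print(wide.columns.tolist())

# Each runid belongs to a single mode, so every row has one NaN;
# boxplot() skips NaN values when drawing each box.
print(wide)
```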


Option 2:

df.unstack(level=[0,1]).boxplot()
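Unstacking both qdepth and mode gives one column per (qdepth, mode) pair, which is what puts four boxes on the x-axis. As a rough sketch, assuming shortened placeholder runid values and a label format of my own choosing ("1-client" and so on), the column labels can be flattened so the tick labels are easier to read:

```python
import pandas as pd

# Dummy data as above; runid values are shortened placeholders (assumption).
rows = [
    (1, "client", "r1", 7.0), (1, "client", "r2", 30.0),
    (1, "server", "r3", 19.0), (1, "server", "r4", 95.0),
    (2, "client", "r5", 15.0), (2, "client", "r6", 49.0),
    (2, "server", "r7", 30.0), (2, "server", "r8", 93.0),
]
df = pd.DataFrame(rows, columns=["qdepth", "mode", "runid", "dur"])
df = df.set_index(["qdepth", "mode", "runid"])

# Select the "dur" Series first so the columns are just (qdepth, mode) pairs.
wide = df["dur"].unstack(level=[0, 1])

# Flatten the two column levels into a single readable label per box.
wide.columns = [f"{qdepth}-{mode}" for qdepth, mode in wide.columns]

print(wide.columns.tolist())
# wide.boxplot() would now draw one box per flattened label.
```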


Option 3:

Using seaborn:

import seaborn as sns
sns.boxplot(x="qdepth", hue="mode", y="dur", data=df.reset_index())
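seaborn expects long-form ("tidy") data, which is why reset_index() is enough here: it turns the three index levels back into ordinary columns. A quick way to sanity-check what the plot should show is to compute the per-group medians from the same tidy frame. This is just an illustrative check with shortened placeholder runid values, not part of the plotting call itself.

```python
import pandas as pd

# Dummy data as above; runid values are shortened placeholders (assumption).
rows = [
    (1, "client", "r1", 7.0), (1, "client", "r2", 30.0),
    (1, "server", "r3", 19.0), (1, "server", "r4", 95.0),
    (2, "client", "r5", 15.0), (2, "client", "r6", 49.0),
    (2, "server", "r7", 30.0), (2, "server", "r8", 93.0),
]
df = pd.DataFrame(rows, columns=["qdepth", "mode", "runid", "dur"])
df = df.set_index(["qdepth", "mode", "runid"])

# Long-form frame: qdepth, mode, runid and dur are all plain columns again.
tidy = df.reset_index()
print(tidy.columns.tolist())

# One median per (qdepth, mode) group, matching the center line of each box.
medians = tidy.groupby(["qdepth", "mode"])["dur"].median()
print(medians)
```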


Update

To answer your comment, here is a very approximate way (it could serve as a starting point) to recreate the seaborn option using only pandas and matplotlib:

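import matplotlib as mpl
import matplotlib.patches  # makes mpl.patches available
import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(12, 6))
#bp = df.unstack(level=[0,1])['dur'].boxplot(ax=ax, return_type='dict')

# With `by`, boxplot returns a Series keyed by column name, hence the ['dur'] lookup
bp = df.reset_index().boxplot(column='dur', by=['qdepth', 'mode'], ax=ax, return_type='dict')['dur']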

# Now fill the boxes with desired colors
boxColors = ['darkkhaki', 'royalblue']
numBoxes = len(bp['boxes'])
for i in range(numBoxes):
    box = bp['boxes'][i]
    boxX = []
    boxY = []
    for j in range(5):
        boxX.append(box.get_xdata()[j])
        boxY.append(box.get_ydata()[j])
    boxCoords = list(zip(boxX, boxY))
    # Alternate between Dark Khaki and Royal Blue
    k = i % 2
    boxPolygon = mpl.patches.Polygon(boxCoords, facecolor=boxColors[k])
    ax.add_patch(boxPolygon)

plt.show()
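As an alternative starting point, the grouped layout can also be sketched directly with matplotlib's Axes.boxplot, using patch_artist=True so the boxes can be filled without adding polygons by hand. This is a different technique from the one above, not a replacement for it. The box width, the offsets, the shortened runid values and the Agg backend are all my own assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Dummy data as above; runid values are shortened placeholders (assumption).
rows = [
    (1, "client", "r1", 7.0), (1, "client", "r2", 30.0),
    (1, "server", "r3", 19.0), (1, "server", "r4", 95.0),
    (2, "client", "r5", 15.0), (2, "client", "r6", 49.0),
    (2, "server", "r7", 30.0), (2, "server", "r8", 93.0),
]
df = pd.DataFrame(rows, columns=["qdepth", "mode", "runid", "dur"])

colors = {"client": "darkkhaki", "server": "royalblue"}
qdepths = sorted(df["qdepth"].unique())
modes = sorted(df["mode"].unique())
width = 0.35  # arbitrary box width

fig, ax = plt.subplots(figsize=(12, 6))
all_boxes = []
legend_handles = []
for m_idx, mode in enumerate(modes):
    # One array of durations per qdepth for this mode.
    data = [
        df.loc[(df["qdepth"] == q) & (df["mode"] == mode), "dur"].to_numpy()
        for q in qdepths
    ]
    # Shift each mode left or right of the integer slot for its qdepth.
    offset = (m_idx - (len(modes) - 1) / 2) * width
    positions = [i + offset for i in range(len(qdepths))]
    bp = ax.boxplot(data, positions=positions, widths=width * 0.9,
                    patch_artist=True)
    for box in bp["boxes"]:
        box.set_facecolor(colors[mode])
        all_boxes.append(box)
    legend_handles.append(bp["boxes"][0])

# Put one tick per qdepth at the center of its group of boxes.
ax.set_xticks(range(len(qdepths)))
ax.set_xticklabels([str(q) for q in qdepths])
ax.set_xlabel("qdepth")
ax.set_ylabel("dur")
ax.legend(legend_handles, modes, title="mode")
# Call plt.show() or fig.savefig(...) here to view or save the figure.
```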
