How to plot a histogram with different colors by group and facet grid in Python using matplotlib

Date: 2017-09-15 15:17:29

Tags: python r python-3.x pandas matplotlib

I have the following data:

import pandas as pd

data = pd.DataFrame({"group":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb"],
                     "segment": ["da", "et", "da", "et", "da", "et", "da", "et"],
                     "country": ["br", "br", "th", "th", "br", "br", "th", "th"],
                     "N":       [31, 23, 17, 9, 4, 100, 10, 20],
                     "totalN":  [84, 84, 389, 389, 84, 84, 389, 389]}
                    )

I would like to produce in Python the same plot that the following R code generates:

ggplot(data, aes(x=segment, y=N, fill=group)) +
geom_bar(stat="identity") +
ggtitle("group") +
facet_grid(country~.)+
geom_text(aes(label=percent(round(N / totalN, 2))), position=position_stack(vjust=0.5), size=3) +
coord_flip()

I tried:

data_groupped = data.groupby(['group', 'segment'])
data_groupped.plot(x='segment', y='N', kind='hist')

This generates each histogram separately.

The expected output is a single faceted figure, like the one the R code above produces.

1 answer:

Answer 0 (score: 3)

With pandas plotting, you can do the following.

Option 1] Use pivot_table to reshape the data for each group

import matplotlib.pyplot as plt

# One subplot per country, sharing the x axis
groups = data.groupby('country')
fig, axes = plt.subplots(groups.ngroups, sharex=True)
for (g, grp), ax in zip(groups, axes.flatten()):
    # Rows: segment, columns: group, values: summed N
    grp_df = grp.pivot_table(index='segment', columns='group', values='N', aggfunc='sum')
    grp_df.plot.barh(stacked=True, ax=ax, sharex=True)

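The R code also titles each facet by country and labels every bar with N / totalN, which the snippet above does not reproduce. Below is a minimal sketch of one way to add both. It assumes matplotlib 3.4 or later (for Axes.bar_label), and that totalN is constant within each country and every segment/group pair is present. plot_faceted is a hypothetical helper name, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({"group":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb"],
                     "segment": ["da", "et", "da", "et", "da", "et", "da", "et"],
                     "country": ["br", "br", "th", "th", "br", "br", "th", "th"],
                     "N":       [31, 23, 17, 9, 4, 100, 10, 20],
                     "totalN":  [84, 84, 389, 389, 84, 84, 389, 389]})


def plot_faceted(data):
    """Draw one stacked horizontal bar panel per country with percentage labels."""
    groups = data.groupby("country")
    # squeeze=False keeps `axes` a 2-D array even when there is a single country
    fig, axes = plt.subplots(groups.ngroups, sharex=True, squeeze=False)
    axes = axes.flatten()
    for (country, grp), ax in zip(groups, axes):
        counts = grp.pivot_table(index="segment", columns="group",
                                 values="N", aggfunc="sum")
        # Assumption: totalN is the same for every row of a country
        share = counts / grp["totalN"].iloc[0]
        counts.plot.barh(stacked=True, ax=ax)
        ax.set_title(country)  # plays the role of the facet label
        # pandas draws one bar container per column, in column order
        for container, name in zip(ax.containers, counts.columns):
            labels = [f"{v:.0%}" for v in share[name]]
            ax.bar_label(container, labels=labels, label_type="center", fontsize=8)
    fig.tight_layout()
    return fig, axes
```

Calling plot_faceted(data) returns the figure and its axes, so the result can be saved with fig.savefig or adjusted further. The labels are centered in each bar segment, similar to position_stack(vjust=0.5) in the R code.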

Option 2] Reshape the data into df first, then plot

df = (data.groupby('country')
        .apply(lambda x: x.groupby(['segment', 'group'])['N'].sum().unstack())
        .unstack(level=0)
        .reorder_levels((1,0), axis=1)
        .sort_index(axis=1)
)
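# One subplot per country (the first column level).
# Selecting df[c] avoids groupby(..., axis=1), which is deprecated in recent pandas.
countries = df.columns.get_level_values(0).unique()
fig, axes = plt.subplots(len(countries), sharex=True)
for c, ax in zip(countries, axes.flatten()):
    sp = df[c].plot.barh(stacked=True, ax=ax, sharex=True)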


Here df has the segments as its index and (country, group) pairs as its columns.
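To inspect the reshaped frame as text, the same layout can be built with a single pivot_table call. This is an alternative to the nested groupby shown above, not the answer's own method, and is given only as a sketch under that assumption.

```python
import pandas as pd

data = pd.DataFrame({"group":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb"],
                     "segment": ["da", "et", "da", "et", "da", "et", "da", "et"],
                     "country": ["br", "br", "th", "th", "br", "br", "th", "th"],
                     "N":       [31, 23, 17, 9, 4, 100, 10, 20],
                     "totalN":  [84, 84, 389, 389, 84, 84, 389, 389]})

# Rows are segments; columns are (country, group) pairs, sorted by country.
df = data.pivot_table(index="segment", columns=["country", "group"],
                      values="N", aggfunc="sum")

print(df)                         # the full reshaped frame
print(df["br"])                   # the sub-frame plotted in the "br" panel
print(df.loc["da", ("br", "aa")])  # 31
```

Selecting df["br"] drops the country level and leaves one column per group, which is the shape plot.barh(stacked=True) expects for a single panel.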

Option 3] If you do not need separate subplots

df = (data.groupby('country')
        .apply(lambda x: x.groupby(['segment', 'group'])['N'].sum().unstack()))
df.plot.barh(stacked=True)


Here df is indexed by (country, segment) and has one column per group.