Question

I hope the title is accurate enough; I wasn't quite sure how to phrase it.

Anyway, my problem is that I have a Pandas df that looks like this:

                              Customer       Source  CustomerSource
0                                Apple            A             141
1                                Apple            B              36
2                            Microsoft            A             143
3                               Oracle            C             225
4                                  Sun            C             151

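This df comes from a larger dataset. The CustomerSource value is the cumulative count of all occurrences of each Customer and Source pair; for example, here there are 141 occurrences of Customer Apple with Source A, 225 of Customer Oracle with Source C, and so on.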

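What I want is a stacked bar plot with all Customers on the x-axis and the CustomerSource values stacked on top of each other on the y-axis. Similar to the example below. Any hints on how I would go about this?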

Answer 1

You can reshape with pivot or unstack, and then use DataFrame.plot.bar:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.set_index(['Customer','Source'])['CustomerSource'].unstack().plot.bar(stacked=True)
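
As a minimal sketch of the two reshapes above, the following rebuilds the DataFrame from the question and checks that both routes produce the same wide table before plotting. The names `wide` and `wide_unstacked` and the choice of the non-interactive `Agg` backend are assumptions made for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# Route 1: pivot (keyword arguments; recent pandas versions no longer accept positional ones)
wide = df.pivot(index="Customer", columns="Source", values="CustomerSource")

# Route 2: set a two-level index, then move the Source level into columns
wide_unstacked = df.set_index(["Customer", "Source"])["CustomerSource"].unstack()

# One bar per Customer, one stacked segment per Source
ax = wide.plot.bar(stacked=True)
ax.set_ylabel("CustomerSource")
```

To view the chart, call `ax.figure.savefig(...)` with a path of your choice, or drop the `Agg` line and use `matplotlib.pyplot.show()` in an interactive session.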

If Customer, Source pairs are duplicated, use pivot_table or groupby with the aggregate sum:

print (df)
    Customer Source  CustomerSource
0      Apple      A             141 <-same Apple, A
1      Apple      A             200 <-same Apple, A
2      Apple      B              36
3  Microsoft      A             143
4     Oracle      C             225
5        Sun      C             151

df1 = df.pivot_table(index='Customer',columns='Source',values='CustomerSource', aggfunc='sum')
print (df1)
Source         A     B      C
Customer                     
Apple      341.0  36.0    NaN <-141 + 200 = 341
Microsoft  143.0   NaN    NaN
Oracle       NaN   NaN  225.0
Sun          NaN   NaN  151.0


df.pivot_table(index='Customer',columns='Source',values='CustomerSource', aggfunc='sum').plot.bar(stacked=True)

df.groupby(['Customer','Source'])['CustomerSource'].sum().unstack().plot.bar(stacked=True)
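
The duplicate case can be sketched as follows, using the sample data with the repeated (Apple, A) pair shown above. It illustrates that plain `pivot` rejects duplicated pairs, while `pivot_table` and `groupby` both sum them. The variable names (`pivot_failed`, `by_pivot_table`, `by_groupby`) are hypothetical and chosen only for this example.

```python
import pandas as pd

# Sample data from the answer, with (Apple, A) appearing twice
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "A", "B", "A", "C", "C"],
    "CustomerSource": [141, 200, 36, 143, 225, 151],
})

# pivot cannot reshape when an index/column pair occurs more than once
try:
    df.pivot(index="Customer", columns="Source", values="CustomerSource")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# Both alternatives aggregate the duplicates with sum before reshaping
by_pivot_table = df.pivot_table(index="Customer", columns="Source",
                                values="CustomerSource", aggfunc="sum")
by_groupby = df.groupby(["Customer", "Source"])["CustomerSource"].sum().unstack()

print(by_pivot_table)
```

Either result can then be passed to `.plot.bar(stacked=True)` exactly as in the lines above.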

It is also possible to swap the columns:

df.pivot(index='Customer', columns='Source', values='CustomerSource').plot.bar(stacked=True)

df.pivot(index='Source', columns='Customer', values='CustomerSource').plot.bar(stacked=True)
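
To make the effect of swapping concrete, here is a small sketch, again on the question's data, showing that the swapped pivot is the transpose of the first one: Sources end up on the x-axis and Customers become the stacked segments. The names `by_customer` and `by_source` are illustrative assumptions.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Customer": ["Apple", "Apple", "Microsoft", "Oracle", "Sun"],
    "Source": ["A", "B", "A", "C", "C"],
    "CustomerSource": [141, 36, 143, 225, 151],
})

# Customers as rows (x-axis), Sources as columns (stacked segments)
by_customer = df.pivot(index="Customer", columns="Source", values="CustomerSource")

# Swapped: Sources as rows (x-axis), Customers as columns (stacked segments)
by_source = df.pivot(index="Source", columns="Customer", values="CustomerSource")

print(by_source)
```

Since the two tables hold the same numbers with rows and columns exchanged, `by_customer.T.plot.bar(stacked=True)` should draw the same chart as plotting `by_source`.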
