Performing a function on a multi-index column dataframe based on column values

Asked: 2019-06-24 19:05:14

Tags: python pandas dataframe multi-index

I have a dataframe that looks like this:

                   |    PACKAGES SHIPPED     |    PACKAGES TRANSFERRED   |
Product & Quantity | Apple-5 pk | Apple-5 pk | Apple-5 pk  |  Apple-5 pk |
Store Branch I.D.  |  34234324  |  34235555  |  34234324   |  34235555   |
----------------------------------------------------------------------------
   Period Week     
   5/14 - 5/20     |     5      |     10     |     20      |     7       |
   5/21 - 5/27     |     40     |      X     |      1      |     Y       |

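This dataframe has a multi-level column header: under "PACKAGES SHIPPED" (and likewise under "PACKAGES TRANSFERRED") there is one column per product and per store branch, and there will be many store branches.

For each period week, I want to sum "PACKAGES SHIPPED" and "PACKAGES TRANSFERRED" for a specific "Product & Quantity" value and a specific "Store Branch I.D.". What is the most efficient way to do this?

The ideal resulting dataframe would be: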

                   |Sum Shipped & Transferred|Sum Shipped & Transferred  |                     
Product & Quantity |       Apple-5 pk        |         Apple-10 pk       |
Store Branch I.D.  |  34234324  |  34235555  |  34234324   |  34235555   |
----------------------------------------------------------------------------
   Period Week     
   5/14 - 5/20     |     25     |     17     |     40      |     234     |
   5/21 - 5/27     |     41     |     X+Y    |     34      |      25     |

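1 answer:

Answer 0: (score: 0)

It may help to think of this as a dataframe rather than as a picture. Here is one simple way to look at the problem, with the data in long format (one row per observation). Of course, if your data really is stored with multi-level columns as pictured, this will not help directly.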

In [32]: import pandas as pd

In [33]: df = pd.DataFrame({'Period Week': ['5/14 - 5/20', '5/21 - 5/27', '5/14 - 5/20', '5/21 - 5/27'],
    ...:                    'Transaction': ['Shipped', 'Shipped', 'Transfered', 'Transfered'],
    ...:                    'Package SKU': ['Apples-5k', 'Apples-10k', 'Apples-5k', 'Apples-10k'],
    ...:                    'Quantity': [5, 10, 20, 7]})

In [34]: df
Out[34]:
   Period Week Transaction Package SKU  Quantity
0  5/14 - 5/20     Shipped   Apples-5k         5
1  5/21 - 5/27     Shipped  Apples-10k        10
2  5/14 - 5/20  Transfered   Apples-5k        20
3  5/21 - 5/27  Transfered  Apples-10k         7
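
If the data really is stored with multi-level columns as in the question, one possible way to reach a long layout like the one above is melt. The following is only a sketch under assumptions: the top column level is given the made-up name "Transaction", the names wide, long_df and totals are illustrative, and the X and Y cells from the question are replaced by the made-up numbers 3 and 4.

```python
import pandas as pd

# Rebuild the question's layout: three column levels, one row per period week.
columns = pd.MultiIndex.from_product(
    [["PACKAGES SHIPPED", "PACKAGES TRANSFERRED"], ["Apple-5 pk"], ["34234324", "34235555"]],
    names=["Transaction", "Product & Quantity", "Store Branch I.D."],  # "Transaction" is an assumed name
)
wide = pd.DataFrame(
    [[5, 10, 20, 7], [40, 3, 1, 4]],  # 3 and 4 are made-up stand-ins for X and Y
    index=pd.Index(["5/14 - 5/20", "5/21 - 5/27"], name="Period Week"),
    columns=columns,
)

# Long format: one row per (week, transaction, product, store) combination.
long_df = wide.melt(ignore_index=False, value_name="Quantity").reset_index()

# Shipped + transferred per week, product and store.
totals = long_df.groupby(
    ["Period Week", "Product & Quantity", "Store Branch I.D."]
)["Quantity"].sum()
print(totals)
```

Once the data is in this shape, selecting a specific product or store branch is an ordinary row filter or a lookup on the grouped result.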

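Then set the index to multiple columns (set_index returns a new dataframe, so assign the result if you want to keep it):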

df.set_index(['Period Week','Transaction','Package SKU'])
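
As a minimal sketch of how such an indexed frame might be used (the names indexed, apples_5k and per_week are illustrative and not part of the answer), one option is to assign the result of set_index, take a cross-section for a single SKU with xs, and sum per week:

```python
import pandas as pd

df = pd.DataFrame({
    "Period Week": ["5/14 - 5/20", "5/21 - 5/27", "5/14 - 5/20", "5/21 - 5/27"],
    "Transaction": ["Shipped", "Shipped", "Transfered", "Transfered"],
    "Package SKU": ["Apples-5k", "Apples-10k", "Apples-5k", "Apples-10k"],
    "Quantity": [5, 10, 20, 7],
})

# set_index returns a new dataframe; df itself keeps its four columns.
indexed = df.set_index(["Period Week", "Transaction", "Package SKU"])

# Cross-section for one SKU: the remaining index levels are week and transaction.
apples_5k = indexed.xs("Apples-5k", level="Package SKU")

# Sum shipped and transferred quantities for that SKU, per week.
per_week = apples_5k.groupby(level="Period Week")["Quantity"].sum()
print(per_week)
```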

Finally, group by and calculate:

In [35]: df.groupby(['Period Week','Package SKU'])['Quantity'].sum()
Out[35]:
Period Week  Package SKU
5/14 - 5/20  Apples-5k      25
5/21 - 5/27  Apples-10k     17
Name: Quantity, dtype: int64
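
For data that stays in the multi-level column layout from the question, a possible alternative is to select each top-level block and add them, since pandas aligns the remaining column levels automatically. This is a sketch under assumptions, not a definitive solution: the level name "Transaction" and the variable names are made up for illustration, and the X and Y cells are replaced by the made-up numbers 3 and 4.

```python
import pandas as pd

columns = pd.MultiIndex.from_product(
    [["PACKAGES SHIPPED", "PACKAGES TRANSFERRED"], ["Apple-5 pk"], ["34234324", "34235555"]],
    names=["Transaction", "Product & Quantity", "Store Branch I.D."],  # "Transaction" is an assumed name
)
wide = pd.DataFrame(
    [[5, 10, 20, 7], [40, 3, 1, 4]],  # 3 and 4 are made-up stand-ins for X and Y
    index=pd.Index(["5/14 - 5/20", "5/21 - 5/27"], name="Period Week"),
    columns=columns,
)

# Selecting a top-level key drops that level; the two blocks then share
# (product, store) columns, so addition aligns them column by column.
total = wide["PACKAGES SHIPPED"] + wide["PACKAGES TRANSFERRED"]

# Optionally put a descriptive label back on top, as in the desired output.
labeled = pd.concat({"Sum Shipped & Transferred": total}, axis=1)
print(labeled)

# A more general variant for any number of top-level blocks:
# transpose, group rows by the lower levels, sum, and transpose back.
total_alt = wide.T.groupby(level=["Product & Quantity", "Store Branch I.D."]).sum().T
```

The first form is the easiest to read when there are exactly two blocks; the transposed groupby avoids naming each block explicitly.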