How to use pd.Grouper and groupby in pandas

Time: 2020-04-07 16:13:36

Tags: python pandas

Here is my DataFrame:

    S2PName-Category    S2BillDate  totSale
0   Food               2019-05-18   2150.0
1   Beverages          2019-05-19   403.0
2   Food               2019-05-19   7254.0
3   Others             2019-05-19   200.0
4   Juice              2019-05-19   125.0
5   Snacks             2019-05-19   70.0
6   Food               2019-06-21   11932.0

I want to group by the S2PName-Category column and by S2BillDate at a given frequency (monthly, weekly, or daily), aggregating totSale.

For example, if I group S2BillDate at monthly frequency, the resulting DataFrame should contain "Food" for both May and June, with its total sales summed within each month.

So far I have managed to write the following code:

basic_df = basic_df.groupby(['S2PName-Category','S2BillDate'], sort=False)['S2PGTotal'].agg([('totSale','sum')]).reset_index()

In the expected output DataFrame, S2BillDate is set to the last day of the month and totSale is the aggregate for that month. How can I achieve this?

2 answers:

Answer 0: (score: 0)

You can do the following:

In [706]: df
Out[706]: 
    Category    BillDate  totSale
0       Food  2019-05-18   2150.0
1  Beverages  2019-05-19    403.0
2       Food  2019-05-19   7254.0
3     Others  2019-05-19    200.0
4      Juice  2019-05-19    125.0
5     Snacks  2019-05-19     70.0
6       Food  2019-06-21  11932.0

In [710]: df.groupby([df['BillDate'].dt.strftime('%B'), 'Category'])['totSale'].sum()
Out[710]: 
BillDate  Category 
June      Food         11932.0
May       Beverages      403.0
          Food          9404.0
          Juice          125.0
          Others         200.0
          Snacks          70.0
Name: totSale, dtype: float64

I believe this is what you want.
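One caveat worth illustrating: the .dt accessor only works when BillDate has a datetime dtype, and the '%B' month name drops the year, so May 2019 and May 2020 would land in the same group. The sketch below rebuilds the sample frame from the session above (the explicit pd.to_datetime conversion and the '%Y-%m' format are assumptions added here, not part of the original answer) and keeps the year in the group label:

```python
import pandas as pd

# Sample data copied from the IPython session above.
df = pd.DataFrame({
    "Category": ["Food", "Beverages", "Food", "Others", "Juice", "Snacks", "Food"],
    "BillDate": ["2019-05-18", "2019-05-19", "2019-05-19", "2019-05-19",
                 "2019-05-19", "2019-05-19", "2019-06-21"],
    "totSale": [2150.0, 403.0, 7254.0, 200.0, 125.0, 70.0, 11932.0],
})

# Assumption: the column may arrive as strings, so convert before using .dt.
df["BillDate"] = pd.to_datetime(df["BillDate"])

# '%Y-%m' keeps the year, so the same month in different years stays separate.
by_month = df.groupby([df["BillDate"].dt.strftime("%Y-%m"), "Category"])["totSale"].sum()
print(by_month)
```

The labels are plain strings here, so this suits display more than further date arithmetic.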

Answer 1: (score: 0)

basic_df_2 = basic_df.groupby(['S2PName-Category',basic_df['S2BillDate'].dt.to_period('M')], sort=False)['S2PGTotal'].agg([('totSale','sum')]).reset_index()

dt.to_period takes care of the frequency argument: pass 'M', 'W', or 'D' for monthly, weekly, or daily periods.
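Since the title asks about pd.Grouper, here is a minimal sketch of that route, which may be closer to the expected output because it labels each group with the last day of the month. It assumes the sample rows from the question, with totSale already present as a column, and it passes pd.offsets.MonthEnd() as the frequency rather than a string alias, because the month-end alias is spelled differently across pandas versions:

```python
import pandas as pd

# Sample data copied from the question; totSale is assumed to exist already.
basic_df = pd.DataFrame({
    "S2PName-Category": ["Food", "Beverages", "Food", "Others", "Juice", "Snacks", "Food"],
    "S2BillDate": ["2019-05-18", "2019-05-19", "2019-05-19", "2019-05-19",
                   "2019-05-19", "2019-05-19", "2019-06-21"],
    "totSale": [2150.0, 403.0, 7254.0, 200.0, 125.0, 70.0, 11932.0],
})

# pd.Grouper with a frequency needs a datetime column.
basic_df["S2BillDate"] = pd.to_datetime(basic_df["S2BillDate"])

# Group by category and by month-end bins of S2BillDate, then sum the sales.
monthly = (
    basic_df
    .groupby(["S2PName-Category",
              pd.Grouper(key="S2BillDate", freq=pd.offsets.MonthEnd())])["totSale"]
    .sum()
    .reset_index()
)
print(monthly)
```

Swapping the offset for a weekly or daily one (for example pd.offsets.Week() or pd.offsets.Day()) should give the other frequencies mentioned in the question.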