Pandas: adding a calculated column to groupby results

Date: 2018-03-15 02:09:38

Tags: python pandas dataframe pandas-groupby

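The following Python script produces two reports:

  1. Total revenue per customer.
  2. Per-customer spending broken down by category.

I would like to calculate the sales tax component of each report. (Sales tax is 9.25% on all items.)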

    import pandas as pd
    from io import StringIO
    
    mystr = """Pedro|groceries|apple|1.42
    Nitin|tobacco|cigarettes|15.00
    Susie|groceries|cereal|5.50
    Susie|groceries|milk|4.75
    Susie|tobacco|cigarettes|15.00
    Susie|fuel|gasoline|44.90
    Pedro|fuel|propane|9.60"""
    
    df = pd.read_csv(StringIO(mystr), header=None, sep='|',
                     names=['Name', 'Category', 'Product', 'Sales'])
    
    # Report 1
    rep1 = df.groupby('Name')['Sales'].sum()
    
    # Name
    # Nitin    15.00
    # Pedro    11.02
    # Susie    70.15
    # Name: Sales, dtype: float64
    
    # Report 2
    rep2 = df.groupby(['Name', 'Category'])['Sales'].sum()
    
    # Name   Category 
    # Nitin  tobacco      15.00
    # Pedro  fuel          9.60
    #        groceries     1.42
    # Susie  fuel         44.90
    #        groceries    10.25
    #        tobacco      15.00
    # Name: Sales, dtype: float64
    

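1 answer:

Answer 0 (score: 1)

This can be done with vectorized pandas operations. Passing `as_index=False` to `groupby` keeps the grouping keys as regular columns, so each report is a DataFrame to which a `Tax` column can be added directly: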

    import pandas as pd
    from io import StringIO

    mystr = """Pedro|groceries|apple|1.42
    Nitin|tobacco|cigarettes|15.00
    Susie|groceries|cereal|5.50
    Susie|groceries|milk|4.75
    Susie|tobacco|cigarettes|15.00
    Susie|fuel|gasoline|44.90
    Pedro|fuel|propane|9.60"""

    df = pd.read_csv(StringIO(mystr), header=None, sep='|',
                     names=['Name', 'Category', 'Product', 'Sales'])

    # Report 1
    rep1 = df.groupby('Name', as_index=False)['Sales'].sum()
    rep1['Tax'] = rep1['Sales'] * 0.0925

    #     Name  Sales       Tax
    # 0  Nitin  15.00  1.387500
    # 1  Pedro  11.02  1.019350
    # 2  Susie  70.15  6.488875

    # Report 2
    rep2 = df.groupby(['Name', 'Category'], as_index=False)['Sales'].sum()
    rep2['Tax'] = rep2['Sales'] * 0.0925

    #     Name   Category  Sales       Tax
    # 0  Nitin    tobacco  15.00  1.387500
    # 1  Pedro       fuel   9.60  0.888000
    # 2  Pedro  groceries   1.42  0.131350
    # 3  Susie       fuel  44.90  4.153250
    # 4  Susie  groceries  10.25  0.948125
    # 5  Susie    tobacco  15.00  1.387500
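
Since both reports differ only in their grouping keys, the same steps could be wrapped in a small helper. The sketch below is one possible variant, not part of the original answer: `sales_report` and `TAX_RATE` are hypothetical names introduced here, and rounding the tax to two decimals is an assumption about how the figures should be presented. It uses `DataFrame.assign` so the sum and the tax column are built in a single chain.

```python
import pandas as pd
from io import StringIO

TAX_RATE = 0.0925  # 9.25% sales tax, as stated in the question

mystr = """Pedro|groceries|apple|1.42
Nitin|tobacco|cigarettes|15.00
Susie|groceries|cereal|5.50
Susie|groceries|milk|4.75
Susie|tobacco|cigarettes|15.00
Susie|fuel|gasoline|44.90
Pedro|fuel|propane|9.60"""

df = pd.read_csv(StringIO(mystr), header=None, sep='|',
                 names=['Name', 'Category', 'Product', 'Sales'])


def sales_report(frame, keys, tax_rate=TAX_RATE):
    """Hypothetical helper: sum Sales per group and append a Tax column."""
    return (
        frame.groupby(keys, as_index=False)['Sales'].sum()
             # Rounding to cents is an assumption made for display purposes.
             .assign(Tax=lambda d: (d['Sales'] * tax_rate).round(2))
    )


rep1 = sales_report(df, 'Name')                # one row per customer
rep2 = sales_report(df, ['Name', 'Category'])  # one row per customer and category

print(rep1)
print(rep2)
```

If the original Series-returning `groupby` calls from the question need to stay as they are, calling `.reset_index()` on the result should give an equivalent DataFrame to start from.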