Group by a column value, filtered on another column, with sum and count

Time: 2020-03-05 02:40:14

Tags: python pandas

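I want to make the policy a variable so that I can pass in whichever policy I need. For that policy, I want to group by show, count how many times each show appears, and sum the views and the revenue. How can I do this?

My table looks like this: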

policy    show        views    revenue
10 min    batman      100      10
10 min    batman      200      20
5 min     joker       100      10
5 min     joker       300      15
15 min    superman    500      30

My expected output is

policy = "10 min"

show      count    total_views    total_revenue
batman    2        300            30

If I set policy = "5 min", my output should be

show      count    total_views    total_revenue
joker     2        400            25

The same should work for any other policy I assign to the policy variable.

1 answer:

Answer 0: (score: 1)

This may help you:

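import pandas as pd

def set_policy(df, policy):
    # Keep only the rows for the requested policy.
    filtered = df[df['policy'] == policy]
    # Reports only the first show name, so this assumes one show per policy.
    t = {'show': filtered['show'].unique()[0], 'count': filtered.shape[0],
         'total_views': filtered['views'].sum(), 'total_revenue': filtered['revenue'].sum()}
    return pd.DataFrame([t])

df = set_policy(df, '10min')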

Output:

     show  count  total_views  total_revenue
0  batman      2          300             30

Update

Sample DataFrame

  policy      show  views  revenue
0  10min    batman    100       10
1  10min    batman    200       20
2   5min     joker    100       10
3   5min     joker    300       15
4  15min  superman    500       30
5  10min  superman    100       20

Code:

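from collections import defaultdict

import pandas as pd

def set_policy(df, policy):
    t = defaultdict(list)
    # Keep only the rows for the requested policy.
    filtered = df[df['policy'] == policy]
    # Split the filtered rows by show; i is the show name, k is that show's rows.
    gp = filtered.groupby('show')
    for i, k in gp:
        t['show'].append(k['show'].unique()[0])
        t['count'].append(k.shape[0])
        t['total_views'].append(k['views'].sum())
        t['total_revenue'].append(k['revenue'].sum())
    return pd.DataFrame(t)

df = set_policy(df, '10min')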

Output

       show  count  total_views  total_revenue
0    batman      2          300             30
1  superman      1          100             20
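
For comparison, the same filter-then-group steps can probably be written without the explicit loop by using pandas named aggregation. This is only a sketch, not part of the original answer: the helper name summarize_policy is hypothetical, the data is the sample DataFrame from the update above, and it assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

# Sample data copied from the updated answer above.
df = pd.DataFrame({
    "policy": ["10min", "10min", "5min", "5min", "15min", "10min"],
    "show": ["batman", "batman", "joker", "joker", "superman", "superman"],
    "views": [100, 200, 100, 300, 500, 100],
    "revenue": [10, 20, 10, 15, 30, 20],
})


def summarize_policy(df, policy):
    """Hypothetical helper: per-show row count and totals for one policy."""
    # Keep only the rows for the requested policy.
    filtered = df[df["policy"] == policy]
    # One output row per show; each keyword names an output column
    # and maps it to (input column, aggregation).
    return (
        filtered.groupby("show")
        .agg(
            count=("views", "size"),
            total_views=("views", "sum"),
            total_revenue=("revenue", "sum"),
        )
        .reset_index()
    )


result = summarize_policy(df, "10min")
print(result)
```

Assigning to a new name such as result, rather than back to df, keeps the original DataFrame available for querying other policies.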