How do I count occurrences across columns and multiply those counts by another column's value?

Time: 2016-05-26 12:16:24

Tags: python-2.7 numpy pandas

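I want to take the df below and group it by the unique combinations of 'USER', 'TASK', and 'STATIC_VALUE'. I can do that with groupby(), but I'm having trouble adding the 'TASK_COUNT' and 'TOTAL' columns. The 'TOTAL' column should be 'STATIC_VALUE' * 'TASK_COUNT'. I have tried several variations of groupby(), transform(), and size(), but I can't get it to work. Any suggestions? Thanks!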

DataFrame:

    USER    TASK    STATIC_VALUE
1   USER1   TASK2   30
2   USER2   TASK7   12  
3   USER5   TASK4   9
4   USER12  TASK2   30
5   USER2   TASK3   10
6   USER1   TASK2   30
7   USER5   TASK7   12
8   USER1   TASK3   10
9   USER2   TASK3   10

This gets me close:

>>> df.groupby(['USER','TASK','STATIC_VALUE']).size()

USER    TASK    STATIC_VALUE    
USER1   TASK2   30              2
        TASK3   10              1
USER2   TASK7   12              1
        TASK3   10              2
USER5   TASK4   9               1
        TASK7   12              1
USER12  TASK2   30              1

Expected result:

USER    TASK    STATIC_VALUE    TASK_COUNT  TOTAL
USER1   TASK2   30              2           60
        TASK3   10              1           10
USER2   TASK7   12              1           12
        TASK3   10              2           20
USER5   TASK4   9               1           9
        TASK7   12              1           12
USER12  TASK2   30              1           30

1 answer:

Answer 0 (score: 2)

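Use GroupBy.size, then reset_index(name='TASK_COUNT') to turn the group sizes into a regular column that can be multiplied by 'STATIC_VALUE':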

df1 = df.groupby(['USER','TASK', 'STATIC_VALUE']).size().reset_index(name='TASK_COUNT')
df1['TOTAL'] = df1['TASK_COUNT'] * df1['STATIC_VALUE']
print(df1)
     USER   TASK  STATIC_VALUE  TASK_COUNT  TOTAL
0   USER1  TASK2            30           2     60
1   USER1  TASK3            10           1     10
2  USER12  TASK2            30           1     30
3   USER2  TASK3            10           2     20
4   USER2  TASK7            12           1     12
5   USER5  TASK4             9           1      9
6   USER5  TASK7            12           1     12
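
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question by hand, so the construction code and the names `keys` and `result` are assumptions for illustration, not part of the original answer. The last step, `set_index`, is optional and only moves the grouping columns back into the index so the printout resembles the layout in the expected result.

```python
import pandas as pd

# Sample data copied by hand from the question.
df = pd.DataFrame({
    'USER': ['USER1', 'USER2', 'USER5', 'USER12', 'USER2',
             'USER1', 'USER5', 'USER1', 'USER2'],
    'TASK': ['TASK2', 'TASK7', 'TASK4', 'TASK2', 'TASK3',
             'TASK2', 'TASK7', 'TASK3', 'TASK3'],
    'STATIC_VALUE': [30, 12, 9, 30, 10, 30, 12, 10, 10],
})

keys = ['USER', 'TASK', 'STATIC_VALUE']

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a DataFrame with the counts in a named column.
df1 = df.groupby(keys).size().reset_index(name='TASK_COUNT')

# Element-wise product of the count and the per-task value.
df1['TOTAL'] = df1['TASK_COUNT'] * df1['STATIC_VALUE']

# Optional: move the grouping columns back into the index for display.
result = df1.set_index(keys)
print(result)
```

Note that groupby sorts the group keys by default, and the USER values are strings, so USER12 sorts before USER2. Passing sort=False to groupby should keep the groups in order of first appearance instead.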