How do I compute an absolute sum with groupby in pandas?
For example, given this DataFrame:
Player Score
0 A 100
1 B -150
2 A -110
3 B 180
4 B 125
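I want the total score for player A (100 + 110 = 210) and the total score for player B (150 + 180 + 125 = 455), ignoring the sign of each score.
I can compute the plain (signed) sum with the following code: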
import pandas as pd
import numpy as np
frame = pd.DataFrame({'Player' : ['A', 'B', 'A', 'B', 'B'],
'Score' : [100, -150, -110, 180, 125]})
print('frame: {0}'.format(frame))
total_scores = frame[['Player','Score']].groupby(['Player']).agg(['sum'])
print('total_scores: {0}'.format(total_scores))
But how do I compute the absolute sum with groupby? This attempt:
frame[['Player','Score']].abs().groupby(['Player']).agg(['sum'])
unsurprisingly raises an error, because abs() is also applied to the string column Player:
Traceback (most recent call last):
File "O:\tests\absolute_count.py", line 10, in <module>
total_scores = frame[['Player','Score']].abs().groupby(['Player']).agg(['sum'])
File "C:\Users\dernoncourt\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py", line 5518, in abs
return np.abs(self)
TypeError: bad operand type for abs(): 'str'
I don't want to modify the DataFrame.
Answer 0 (score: 5)
You can apply a function that takes the absolute values of each group and then sums them:
>>> frame.groupby('Player').Score.apply(lambda c: c.abs().sum())
Player
A 210
B 455
Name: Score, dtype: int64
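You can also create a new column holding the absolute values and sum that (assign returns a new DataFrame, so frame itself is left unchanged):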
>>> frame.assign(AbsScore=frame.Score.abs()).groupby('Player').AbsScore.sum()
Player
A 210
B 455
Name: AbsScore, dtype: int64
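A variant of the same idea, offered as a minimal sketch rather than part of the original answer: take the absolute value of the Score column only, then group that Series by the Player column. This avoids calling abs() on the string column and never touches frame. The name abs_totals is just an illustrative choice.

```python
import pandas as pd

frame = pd.DataFrame({'Player': ['A', 'B', 'A', 'B', 'B'],
                      'Score': [100, -150, -110, 180, 125]})

# Take the absolute value of the numeric column only, then group the
# resulting Series by the Player column (the two align on the index).
abs_totals = frame['Score'].abs().groupby(frame['Player']).sum()
print(abs_totals)

# The original frame still holds the signed scores.
print(frame['Score'].tolist())
```

Because abs() and sum() are both vectorized here, this form avoids calling a Python lambda once per group, which may matter on large frames.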
Answer 1 (score: 1)
You can use the groupby apply method on the selected Score column with a lambda (df here is the question's frame):
In [326]: df.groupby('Player').Score.apply(lambda x: np.sum(np.abs(x)))
Out[326]:
Player
A 210
B 455
Name: Score, dtype: int64
To get Player back as a regular column, call reset_index on the result:
In [371]: df.groupby('Player').Score.apply(lambda x: np.sum(np.abs(x))).reset_index()
Out[371]:
Player Score
0 A 210
1 B 455
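If the summed column should not keep the name Score (since it no longer holds signed scores), one option is to pass name= to reset_index on the resulting Series. This is a small sketch under that assumption; the column name AbsScore and the variable result are illustrative choices, not something the answer prescribes.

```python
import pandas as pd

frame = pd.DataFrame({'Player': ['A', 'B', 'A', 'B', 'B'],
                      'Score': [100, -150, -110, 180, 125]})

# Sum the absolute scores per player, then turn the Player index back
# into a column and give the value column an explicit name.
result = (frame['Score'].abs()
          .groupby(frame['Player'])
          .sum()
          .reset_index(name='AbsScore'))
print(result)
```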