Suppose I have a df:
Player   Challenge  Description
James    ABC        Desc1
Bob      ABC        Desc1
Bob      XYZ        Desc X
Bob      ABX101     Desc4
Alex     XYZ        Desc X
Mark     ABC123     Desc 123
Jessica  ABC123     Desc 123
Lynn     XYZ        Desc X
Bob      ABX101     Desc4
Alex     ABX101     Desc 4
Mark     ABC        Desc 1
Lynn     ABC        Desc 1
Mark     POQ        Desc 3
Mark     XYZ        Desc X
Mark     ABC        Desc 1
I can group by player and challenge using groupby:
df.groupby(by=['Player', 'Challenge'])
But how do I get the challenge count for each player (perhaps in the next column), and then the average number of challenges per player?
Answer 0 (score: 2)
Use:
count_challenge=df.groupby('Player').Challenge.count()
print(count_challenge)
Player
Alex 2
Bob 4
James 1
Jessica 1
Lynn 2
Mark 5
Name: Challenge, dtype: int64
If you don't want to count duplicates:
count_challenge=df.drop_duplicates(['Challenge','Player']).groupby('Player').Challenge.count()
print(count_challenge)
Player
Alex 2
Bob 3
James 1
Jessica 1
Lynn 2
Mark 4
Name: Challenge, dtype: int64
Then you can compute the mean:
count_challenge.mean()
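To tie the steps above together, here is a minimal self-contained sketch. It rebuilds the question's data (the Description column is left out for brevity), computes both the raw and the de-duplicated counts, and also writes the count back as a new column, which may be what "in the next column" meant. The names unique_challenge and ChallengeCount are my own, not from the question, and nunique() is used as a shorter alternative to drop_duplicates followed by count().

```python
import pandas as pd

# Data copied from the question (Description column omitted).
df = pd.DataFrame({
    "Player": ["James", "Bob", "Bob", "Bob", "Alex", "Mark", "Jessica", "Lynn",
               "Bob", "Alex", "Mark", "Lynn", "Mark", "Mark", "Mark"],
    "Challenge": ["ABC", "ABC", "XYZ", "ABX101", "XYZ", "ABC123", "ABC123", "XYZ",
                  "ABX101", "ABX101", "ABC", "ABC", "POQ", "XYZ", "ABC"],
})

# Rows per player, repeats included.
count_challenge = df.groupby("Player")["Challenge"].count()

# Distinct challenges per player (same result as drop_duplicates + count).
unique_challenge = df.groupby("Player")["Challenge"].nunique()

# Per-player count broadcast back onto every row as a new column.
# "ChallengeCount" is a hypothetical column name.
df["ChallengeCount"] = df.groupby("Player")["Challenge"].transform("count")

# Average number of challenges per player, with and without repeats.
mean_all = count_challenge.mean()
mean_unique = unique_challenge.mean()

print(df)
print(mean_all, mean_unique)
```

transform("count") keeps the original row order and length, so the result can be assigned straight back to df, whereas count() returns one row per player.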
If you want the count of each type of challenge for each player:
count_differents_challenge=df.groupby('Player').Challenge.value_counts()
print(count_differents_challenge)
Player   Challenge
Alex     ABX101       1
         XYZ          1
Bob      ABX101       2
         ABC          1
         XYZ          1
James    ABC          1
Jessica  ABC123       1
Lynn     ABC          1
         XYZ          1
Mark     ABC          2
         ABC123       1
         POQ          1
         XYZ          1
Name: Challenge, dtype: int64
Answer 1 (score: 1)
You can try using pivot:
df.pivot(index='foo', columns='bar')
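Note that 'foo' and 'bar' in the line above are placeholder column names. With the question's data, a plain df.pivot(index='Player', columns='Challenge') would most likely raise a ValueError, because some Player/Challenge pairs occur more than once (for example Bob with ABX101). As a hedged alternative, not necessarily what this answer had in mind, pd.crosstab counts the occurrences and so handles the repeats. The variable name table is my own.

```python
import pandas as pd

# Data copied from the question (Description column omitted).
df = pd.DataFrame({
    "Player": ["James", "Bob", "Bob", "Bob", "Alex", "Mark", "Jessica", "Lynn",
               "Bob", "Alex", "Mark", "Lynn", "Mark", "Mark", "Mark"],
    "Challenge": ["ABC", "ABC", "XYZ", "ABX101", "XYZ", "ABC123", "ABC123", "XYZ",
                  "ABX101", "ABX101", "ABC", "ABC", "POQ", "XYZ", "ABC"],
})

# One row per player, one column per challenge, cells hold occurrence counts
# (0 where the player never took that challenge).
table = pd.crosstab(df["Player"], df["Challenge"])

# Row sums give the total number of challenges per player.
totals = table.sum(axis=1)

print(table)
print(totals.mean())
```

The wide table is convenient for reading counts at a glance, and its row sums should match the groupby counts from the first answer.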