Using this:
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
                     'Kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)
df.groupby(['Team', 'Rank']).sum()
returns:
             Points
Team   Rank
Devils 2        863
       3        673
Kings  1       1544
       3        741
       4        812
Riders 1        876
       2       2173
Royals 1        804
       4        701
How do I extract the values (Points) where Rank equals 1, i.e. 1544 + 876 + 804? The same for ranks 2 and 3.
Answer 0 (score: 3)
I think you need DataFrame.xs (here df is the grouped result, i.e. df = df.groupby(['Team', 'Rank']).sum()):
print(df.xs(1, level=1))
        Points
Team
Kings     1544
Riders     876
Royals     804
print(df.xs(2, level=1))
        Points
Team
Devils     863
Riders    2173
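Since the question asks for the total (1544 + 876 + 804), the cross-section can be summed directly. A minimal sketch, assuming the question's data; the names raw, grouped and total_rank1 are only illustrative and do not come from the answer:

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
raw = pd.DataFrame(ipl_data)                    # illustrative name
grouped = raw.groupby(['Team', 'Rank']).sum()   # MultiIndex (Team, Rank)

# Cross-section of all teams at Rank 1, selecting the level by name.
rank1 = grouped.xs(1, level='Rank')

# Add up the Points of that cross-section: 1544 + 876 + 804.
total_rank1 = rank1['Points'].sum()
print(total_rank1)  # 3224
```

Passing level='Rank' instead of level=1 selects the same index level but reads more clearly when the index has names.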
To select by multiple criteria, use slicers:
idx = pd.IndexSlice
print(df.loc[idx[:, [1, 2]], :])
             Points
Team   Rank
Devils 2        863
Kings  1       1544
Riders 1        876
       2       2173
Royals 1        804
print(df.loc[idx['Riders', [1, 2]], :])
             Points
Team   Rank
Riders 1        876
       2       2173
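If the goal is a total per rank for just the selected ranks, the sliced frame can be grouped on its Rank level afterwards. This is only a sketch under the assumption that the question's data is used; raw, grouped, selected and per_rank are illustrative names:

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
raw = pd.DataFrame(ipl_data)
grouped = raw.groupby(['Team', 'Rank']).sum()   # index is sorted by groupby

idx = pd.IndexSlice
# Keep every team, but only ranks 1 and 2.
selected = grouped.loc[idx[:, [1, 2]], :]

# Collapse the Team level: one total per remaining rank.
per_rank = selected.groupby(level='Rank').sum()
print(per_rank)
```

Slicers rely on a sorted MultiIndex; the result of groupby is already sorted, so no extra sort_index() call is needed here.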
If you want the sum over all groups per Rank, you can change the groupby from ['Team', "Rank"] to Rank:
s = df.groupby("Rank")['Points'].sum()
print(s)
Rank
1    3224
2    3036
3    1414
4    1513
Name: Points, dtype: int64
If you also need df1, sum over level=1 of its index:
df1 = df.groupby(['Team', "Rank"]).sum()
print(df1)
             Points
Team   Rank
Devils 2        863
       3        673
Kings  1       1544
       3        741
       4        812
Riders 1        876
       2       2173
Royals 1        804
       4        701
# df1.sum(level=1) was removed in pandas 2.0; groupby(level=1).sum() is the equivalent
s1 = df1.groupby(level=1).sum()
print(s1)
      Points
Rank
1       3224
2       3036
3       1414
4       1513
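As a quick sanity check, the two routes to per-rank totals (grouping the raw rows by Rank, or summing the Team/Rank result over its Rank level) should agree. A small sketch assuming the question's data; it uses groupby(level='Rank'), which works on current pandas versions, and the variable names are illustrative:

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
raw = pd.DataFrame(ipl_data)

# Route 1: group the raw rows directly by Rank.
direct = raw.groupby('Rank')['Points'].sum()

# Route 2: aggregate by (Team, Rank) first, then sum over the Rank level.
by_team_rank = raw.groupby(['Team', 'Rank']).sum()
via_level = by_team_rank.groupby(level='Rank')['Points'].sum()

# Both Series should hold the same totals for ranks 1 to 4.
print((direct == via_level).all())
print(direct.tolist())
```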
Answer 1 (score: 1)
One option:
>>> df_group = df.groupby(['Team', "Rank"]).sum().reset_index()
>>> df_group
     Team  Rank  Points
0  Devils     2     863
1  Devils     3     673
2   Kings     1    1544
3   Kings     3     741
4   Kings     4     812
5  Riders     1     876
6  Riders     2    2173
7  Royals     1     804
8  Royals     4     701
Now you can simply filter on 'Rank':
>>> df_group.loc[df_group['Rank'] == 1, 'Points']
2    1544
5     876
7     804
Another option is to group by Rank again and aggregate into lists:
>>> df.groupby(['Team', 'Rank']).sum().reset_index().groupby('Rank')['Points'].agg(list)
Rank
1    [1544, 876, 804]
2         [863, 2173]
3          [673, 741]
4          [812, 701]
Or perhaps you just want to sort by rank; it is hard to tell, since you have not provided the desired output:
>>> df.groupby(['Team', 'Rank']).sum().reset_index().sort_values('Rank')
     Team  Rank  Points
2   Kings     1    1544
5  Riders     1     876
7  Royals     1     804
0  Devils     2     863
6  Riders     2    2173
1  Devils     3     673
3   Kings     3     741
4   Kings     4     812
8  Royals     4     701
Answer 2 (score: 1)
df[df['Rank'] == 1] # Filter by rank before summing
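The one-liner above only filters the rows; to arrive at the per-team numbers from the question, the filtered rows still need to be summed. One possible way to finish it, sketched on the question's data with illustrative names (raw, rank1_rows, per_team):

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
raw = pd.DataFrame(ipl_data)

# Step 1: keep only the rows with Rank 1 (boolean mask on the raw data).
rank1_rows = raw[raw['Rank'] == 1]

# Step 2: sum Points per team within those rows.
per_team = rank1_rows.groupby('Team')['Points'].sum()
print(per_team)

# Step 3: grand total for Rank 1.
print(per_team.sum())  # 3224
```

Filtering first means the groupby only touches the rows of interest, which can matter on larger frames.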
Answer 3 (score: 1)
I like using the axis argument in .loc:
df.groupby(['Team', "Rank"]).sum().loc(axis=0)[:, 1]
Output:
             Points
Team   Rank
Kings  1       1544
Riders 1        876
Royals 1        804
Or:
df.groupby(['Team', "Rank"]).sum().loc(axis=0)[:, 2]
             Points
Team   Rank
Devils 2        863
Riders 2       2173
Or, like @Jezrael's selection but without pd.IndexSlice:
df.groupby(['Team', "Rank"]).sum().loc(axis=0)[:, [1, 2]]
             Points
Team   Rank
Devils 2        863
Kings  1       1544
Riders 1        876
       2       2173
Royals 1        804
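For comparison, .loc(axis=0)[:, 1] keeps both index levels, which is what DataFrame.xs does when called with drop_level=False. A short sketch on the question's data that puts the two side by side; the names via_loc and via_xs are illustrative:

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
grouped = pd.DataFrame(ipl_data).groupby(['Team', 'Rank']).sum()

# Row-axis selection: every Team, Rank == 1; both index levels are kept.
via_loc = grouped.loc(axis=0)[:, 1]

# Cross-section on the Rank level; drop_level=False keeps the Rank level too.
via_xs = grouped.xs(1, level='Rank', drop_level=False)

print(via_loc)
print(via_xs)
```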
Answer 4 (score: 1)
You can reorder by rank after summing:
import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'Kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
result = df.groupby(['Team', 'Rank']).sum().swaplevel().sort_index()
# Or just:
result = df.groupby(['Rank', 'Team']).sum()
print(result)
Output:
             Points
Rank Team
1    Kings     1544
     Riders     876
     Royals     804
2    Devils     863
     Riders    2173
3    Devils     673
     Kings      741
4    Kings      812
     Royals     701
Answer 5 (score: 1)
You can try swapping the columns in groupby to ["Rank", "Team"]:
grouped = df.groupby(["Rank", "Team"]).sum()
print(grouped)
Result:
             Points
Rank Team
1    Kings     1544
     Riders     876
     Royals     804
2    Devils     863
     Riders    2173
3    Devils     673
     Kings      741
4    Kings      812
     Royals     701
Then, to get the sum for any rank, you can use loc. For example, for the first rank it would be:
grouped.loc[1].Points.sum()
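The same loc lookup can be repeated for every rank present in the index. A minimal sketch, assuming the question's data; the dictionary name totals is illustrative:

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Points': [876, 789, 863, 673, 741, 812, 756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby(["Rank", "Team"]).sum()

# Distinct ranks found in the first index level.
ranks = grouped.index.get_level_values('Rank').unique()

# For each rank, select its sub-frame with loc and sum the Points column.
totals = {int(rank): int(grouped.loc[rank].Points.sum()) for rank in ranks}
print(totals)  # {1: 3224, 2: 3036, 3: 1414, 4: 1513}
```

For many ranks, a single grouped.groupby(level='Rank').sum() gives the same totals in one call; the loop is shown only because it mirrors the loc approach of this answer.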