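I have a groupby object that I apply an expanding mean to. However, I want the calculation to run over another series/group at the same time. Here is my code:
import numpy as np
import pandas as pd
d = { 'home' : ['A', 'B', 'B', 'A', 'B', 'A', 'A'],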
'away' : ['B', 'A','A', 'B', 'A', 'B', 'B'],
'aw' : [1,0,0,0,1,0,np.nan],
'hw' : [0,1,0,1,0,1, np.nan]}
df2 = pd.DataFrame(d, columns=['home', 'away', 'hw', 'aw'])
df2['tie'] = np.where(df2.hw == df2.aw, 1, 0)
df2.index = range(1,len(df2) + 1)
avgcol = ['hw','tie','aw']
homenames = ['home_win_at_home', 'home_tie_at_home', 'home_loss_at_home']
awaynames = ['away_win_at_away', 'away_tie_at_away', 'away_loss_at_away']
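def win_at_venue(df, venuecol, avgcol, name):
    # Prior (shifted) expanding mean of each result column, per team in venuecol.
    # pd.expanding_mean is the legacy API; newer pandas spells it x.expanding().mean().
    df[name] = df.groupby(venuecol)[avgcol].apply(lambda x: pd.expanding_mean(x).shift())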
win_at_venue(df2, 'home', avgcol, homenames)
win_at_venue(df2, 'away', avgcol[::-1], awaynames)
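How can I use pd.expanding_mean on the groupby object so that it averages over both the 'away' and 'home' columns, letting me see each team's average wins/ties/losses across all venues? Right now it only gives the prior average wins for a team playing either at home or away, not home and away combined.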
I have been trying different levels, df.stack(), and reindexing, but with no luck.
Any help is appreciated.
Here is the correct result for home and away wins across all venues:
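Answer 0 (score: 1)
You may want to introduce a 'team' column to follow each team's record regardless of venue. The steps below should get you closer. Starting from: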
d = {'home': ['A', 'B', 'B', 'A', 'B', 'A', 'A'],
'away': ['B', 'A', 'A', 'B', 'A', 'B', 'B'],
'aw': [1, 0, 0, 0, 1, 0, np.nan],
'hw': [0, 1, 0, 1, 0, 1, np.nan]}
df = pd.DataFrame(d, columns=['home', 'away', 'hw', 'aw'])
to get:
   home  away   hw   aw
0     A     B    0    1
1     B     A    1    0
2     B     A    0    0
3     A     B    1    0
4     B     A    0    1
5     A     B    1    0
6     A     B  NaN  NaN
df.index = range(1, len(df) + 1)
df.index.name = 'game'
      home  away   hw   aw
game
1        A     B    0    1
2        B     A    1    0
3        B     A    0    0
4        A     B    1    0
5        B     A    0    1
6        A     B    1    0
7        A     B  NaN  NaN
Next, stack so that you can follow each team:
df = (df.set_index(['hw', 'aw'], append=True).stack().reset_index()
        .rename(columns={'level_3': 'role', 0: 'team'})
        .loc[:, ['game', 'team', 'role', 'hw', 'aw']])
    game  team  role   hw   aw
 0     1     A  home    0    1
 1     1     B  away    0    1
 2     2     B  home    1    0
 3     2     A  away    1    0
 4     3     B  home    0    0
 5     3     A  away    0    0
 6     4     A  home    1    0
 7     4     B  away    1    0
 8     5     B  home    0    1
 9     5     A  away    0    1
10     6     A  home    1    0
11     6     B  away    1    0
12     7     A  home  NaN  NaN
13     7     B  away  NaN  NaN
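The same long, one-row-per-team layout can also be reached with melt instead of set_index/stack. The sketch below is an alternative, not the answer's original method; the names games and long are illustrative assumptions, and the sort is there only to reproduce the home-before-away row order shown above.

```python
import numpy as np
import pandas as pd

# Same sample data as above, indexed by game number (1..7).
games = pd.DataFrame(
    {
        "home": ["A", "B", "B", "A", "B", "A", "A"],
        "away": ["B", "A", "A", "B", "A", "B", "B"],
        "hw": [0, 1, 0, 1, 0, 1, np.nan],
        "aw": [1, 0, 0, 0, 1, 0, np.nan],
    },
    index=pd.RangeIndex(1, 8, name="game"),
)

# Melt the two venue columns into (role, team) pairs, one row per team per game.
long = (
    games.reset_index()
    .melt(
        id_vars=["game", "hw", "aw"],
        value_vars=["home", "away"],
        var_name="role",
        value_name="team",
    )
    # Descending on role puts 'home' before 'away' within each game.
    .sort_values(["game", "role"], ascending=[True, False])
    .reset_index(drop=True)[["game", "team", "role", "hw", "aw"]]
)

print(long)
```

Naming the melted columns directly avoids having to rename the auto-generated 'level_3' column afterwards.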
Then define 'wins', compute the overall record, and apply expanding_mean:
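def wins(row):
    # A team's result is hw when it played at home, aw when it played away.
    if row['role'] == 'home':
        return row['hw']
    else:
        return row['aw']

df['wins'] = df.apply(wins, axis=1)
# pd.expanding_mean is the legacy API; newer pandas spells it x.expanding().mean().
df['expanding_mean'] = df.groupby('team')['wins'].apply(lambda x: pd.expanding_mean(x).shift())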
    game  team  role   hw   aw  wins  expanding_mean
 0     1     A  home    0    1     0             NaN
 1     1     B  away    0    1     1             NaN
 2     2     B  home    1    0     1        1.000000
 3     2     A  away    1    0     0        0.000000
 4     3     B  home    0    0     0        1.000000
 5     3     A  away    0    0     0        0.000000
 6     4     A  home    1    0     1        0.000000
 7     4     B  away    1    0     0        0.666667
 8     5     B  home    0    1     0        0.500000
 9     5     A  away    0    1     1        0.250000
10     6     A  home    1    0     1        0.400000
11     6     B  away    1    0     0        0.400000
12     7     A  home  NaN  NaN   NaN        0.500000
13     7     B  away  NaN  NaN   NaN        0.333333
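pd.expanding_mean was removed from later pandas releases, so on a current install the same step would probably be written with .expanding().mean(). The sketch below assumes that newer API and rebuilds only the game, team, and wins columns from the table above; the frame name long and the use of transform are choices made here, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Per-team results in game order, copied from the 'wins' column above.
long = pd.DataFrame(
    {
        "game": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7],
        "team": list("ABBABAABBAABAB"),
        "wins": [0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, np.nan, np.nan],
    }
)

# Expanding mean within each team, shifted by one row so that each game
# only sees the team's record from earlier games.
long["expanding_mean"] = long.groupby("team")["wins"].transform(
    lambda s: s.expanding().mean().shift()
)

print(long)
```

transform keeps the result aligned with the original row index, which makes the column assignment straightforward.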
Since you have the game and team references, you can merge and filter to get your preferred layout.
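One possible way to do that merge is to pivot the per-team values back to one row per game and join them onto the original frame. This is only a sketch under assumptions: it uses the newer .expanding().mean() API, and the names games, long, wide, prior_win_rate, home_prior_win_rate, and away_prior_win_rate are hypothetical labels chosen for illustration.

```python
import numpy as np
import pandas as pd

games = pd.DataFrame(
    {
        "game": range(1, 8),
        "home": ["A", "B", "B", "A", "B", "A", "A"],
        "away": ["B", "A", "A", "B", "A", "B", "B"],
        "hw": [0, 1, 0, 1, 0, 1, np.nan],
        "aw": [1, 0, 0, 0, 1, 0, np.nan],
    }
)

# Long layout: one row per team per game, with that team's own result.
home = (games[["game", "home", "hw"]]
        .rename(columns={"home": "team", "hw": "wins"}).assign(role="home"))
away = (games[["game", "away", "aw"]]
        .rename(columns={"away": "team", "aw": "wins"}).assign(role="away"))
long = (pd.concat([home, away])
        .sort_values("game", kind="stable").reset_index(drop=True))

# Each team's win rate over all earlier games, regardless of venue.
long["prior_win_rate"] = long.groupby("team")["wins"].transform(
    lambda s: s.expanding().mean().shift()
)

# Back to one row per game: one column for the home side, one for the away side.
wide = long.pivot(index="game", columns="role", values="prior_win_rate").rename(
    columns={"home": "home_prior_win_rate", "away": "away_prior_win_rate"}
)
result = games.merge(wide, left_on="game", right_index=True)

print(result)
```

Each row of result then carries both sides' all-venue record going into that game.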