I want to combine two dataframes. The first one looks like this:
       Date     HomeTeam    AwayTeam
0  06/01/14  Real Madrid  Celta Vigo
1  06/01/14   Celta Vigo    Valencia
The second one looks like this:
    EVENT_ID     HomeTeam    AwayTeam    SELECTION   ODDS
0  112324699  Real Madrid  Celta Vigo   Celta Vigo  47.50
1  112324699  Real Madrid  Celta Vigo  Real Madrid   1.13
2  112324699  Real Madrid  Celta Vigo     The Draw  16.00
3  112369682   Celta Vigo    Valencia   Celta Vigo   3.30
4  112369682   Celta Vigo    Valencia     The Draw   3.55
5  112369682   Celta Vigo    Valencia     Valencia   2.43
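So basically, in the second dataframe each match has 3 rows: one for each team and one for the draw (SELECTION), each with its corresponding odds (ODDS).
What I want to do now is add the odds information from the second dataframe to the first one, so that I end up with the following: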
       Date     HomeTeam    AwayTeam  OddsHome  OddsDraw  OddsAway
0  06/01/14  Real Madrid  Celta Vigo      1.13     16.00     47.50
1  06/01/14   Celta Vigo    Valencia      3.30      3.55      2.43
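I tried writing and applying a lookup function, but it failed. Maybe you can help me?
Answer 0 (score: 3)
I would reshape df2 into new_df2, which ends up looking like this: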
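import numpy as np
# Relabel each selection as 'Home', 'Away' or 'Draw' relative to its own match
df2['SELECTION'] = np.where(df2['SELECTION'] == df2['HomeTeam'], 'Home', np.where(df2['SELECTION'] == df2['AwayTeam'], 'Away', 'Draw'))
# Pivot the three rows per match into columns, then flatten the two-level column names
new_df2 = df2.set_index(['EVENT_ID', 'HomeTeam', 'AwayTeam', 'SELECTION']).unstack().reset_index()
new_df2.columns = new_df2.columns.map(''.join)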
    EVENT_ID     HomeTeam    AwayTeam  ODDSAway  ODDSDraw  ODDSHome
0  112324699  Real Madrid  Celta Vigo     47.50     16.00      1.13
1  112369682   Celta Vigo    Valencia      2.43      3.55      3.30
Now use merge:
df1.merge(new_df2, on = ['HomeTeam', 'AwayTeam']).drop('EVENT_ID', axis = 1)
and you get:
       Date     HomeTeam    AwayTeam  ODDSAway  ODDSDraw  ODDSHome
0  06/01/14  Real Madrid  Celta Vigo     47.50     16.00      1.13
1  06/01/14   Celta Vigo    Valencia      2.43      3.55      3.30
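For reference, here is a self-contained sketch of the same reshape-then-merge idea, with the two frames rebuilt from the data in the question. It is only one way to write it. The names `side`, `wide` and `result` are made up for this example, the `Odds` prefix and the final column order are chosen to match the layout the question asks for, and the left merge assumes each HomeTeam/AwayTeam pair occurs only once in the odds data.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the data shown in the question.
df1 = pd.DataFrame({
    "Date": ["06/01/14", "06/01/14"],
    "HomeTeam": ["Real Madrid", "Celta Vigo"],
    "AwayTeam": ["Celta Vigo", "Valencia"],
})
df2 = pd.DataFrame({
    "EVENT_ID": [112324699] * 3 + [112369682] * 3,
    "HomeTeam": ["Real Madrid"] * 3 + ["Celta Vigo"] * 3,
    "AwayTeam": ["Celta Vigo"] * 3 + ["Valencia"] * 3,
    "SELECTION": ["Celta Vigo", "Real Madrid", "The Draw",
                  "Celta Vigo", "The Draw", "Valencia"],
    "ODDS": [47.50, 1.13, 16.00, 3.30, 3.55, 2.43],
})

# Label every row Home/Away/Draw by comparing SELECTION with the team columns.
# `side` is a hypothetical helper column; df2 itself is left unchanged.
side = np.where(df2["SELECTION"] == df2["HomeTeam"], "Home",
                np.where(df2["SELECTION"] == df2["AwayTeam"], "Away", "Draw"))

# One row per match, one odds column per side (OddsAway, OddsDraw, OddsHome).
wide = (
    df2.assign(side=side)
       .set_index(["EVENT_ID", "HomeTeam", "AwayTeam", "side"])["ODDS"]
       .unstack("side")
       .add_prefix("Odds")
       .reset_index()
)
wide.columns.name = None

# Attach the odds to df1 and order the columns as requested in the question.
result = df1.merge(wide, on=["HomeTeam", "AwayTeam"], how="left")[
    ["Date", "HomeTeam", "AwayTeam", "OddsHome", "OddsDraw", "OddsAway"]
]
print(result)
```

Selecting the `ODDS` column before `unstack` keeps the resulting columns single-level, so no `''.join` step is needed in this variant.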
Answer 1 (score: 1)
Another solution:
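# Bring Date into df2 (one row per selection)
df2 = df2.merge(df1, on=['HomeTeam', 'AwayTeam'], how='left')
# Within each match, rename the home team to 'Home' and the away team to 'Away' (assumes df2 is sorted by EVENT_ID)
df2['SELECTION'] = df2.groupby('EVENT_ID').apply(lambda x: x.SELECTION.replace({x.HomeTeam.values[0]: 'Home', x.AwayTeam.values[0]: 'Away'})).values
# Pivot the selections into columns
df2.set_index(['HomeTeam', 'AwayTeam', 'Date', 'SELECTION']).ODDS.unstack().reset_index()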
Out[1151]:
SELECTION HomeTeam AwayTeam Date Away Home TheDraw
0 CeltaVigo Valencia 06/01/14 2.43 3.30 3.55
1 RealMadrid CeltaVigo 06/01/14 47.50 1.13 16.00
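One caveat with this second approach: the result of `groupby(...).apply(...)` is assigned back through `.values`, so it lines up with df2 only if df2 is already sorted by EVENT_ID. Below is a rough sketch of the same merge-then-unstack steps that labels the rows by direct comparison instead, so row order does not matter. The names `merged`, `is_home`, `is_away` and `wide` are invented for this example, and the frames are rebuilt from the question's data.

```python
import pandas as pd

# Sample frames rebuilt from the data shown in the question.
df1 = pd.DataFrame({
    "Date": ["06/01/14", "06/01/14"],
    "HomeTeam": ["Real Madrid", "Celta Vigo"],
    "AwayTeam": ["Celta Vigo", "Valencia"],
})
df2 = pd.DataFrame({
    "EVENT_ID": [112324699] * 3 + [112369682] * 3,
    "HomeTeam": ["Real Madrid"] * 3 + ["Celta Vigo"] * 3,
    "AwayTeam": ["Celta Vigo"] * 3 + ["Valencia"] * 3,
    "SELECTION": ["Celta Vigo", "Real Madrid", "The Draw",
                  "Celta Vigo", "The Draw", "Valencia"],
    "ODDS": [47.50, 1.13, 16.00, 3.30, 3.55, 2.43],
})

# Step 1: bring Date into the odds frame.
merged = df2.merge(df1, on=["HomeTeam", "AwayTeam"], how="left")

# Step 2: compute both masks from the original SELECTION values, then relabel.
# Rows matching neither team keep their original label ("The Draw").
is_home = merged["SELECTION"] == merged["HomeTeam"]
is_away = merged["SELECTION"] == merged["AwayTeam"]
merged["SELECTION"] = merged["SELECTION"].mask(is_home, "Home").mask(is_away, "Away")

# Step 3: pivot the selections into columns, one row per match.
wide = (
    merged.set_index(["HomeTeam", "AwayTeam", "Date", "SELECTION"])["ODDS"]
          .unstack()
          .reset_index()
)
print(wide)
```

As in the answer above, the draw column keeps the name "The Draw"; rename it afterwards if a different label is preferred.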