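For each NBA team, I have one DF with the dates of its first and last games. I have another DF with each team's ELO before and after every game. I want to add 2 columns to DF1 holding the team's ELO on the specified first and last dates. For the date in the first column I want ELO1, and for the date in the second column I want ELO2. If there is some way to get the difference of the 2 ELOs directly into 1 column, that would be even better, since that is what I will ultimately be calculating.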
DF1:
           first        last
team
ATL   2017-10-18  2018-04-10
BOS   2017-10-17  2018-04-11
BRK   2017-10-18  2018-04-11
CHI   2017-10-19  2018-04-11
[...]
DF2:
             date team   ELO_before    ELO_after
65782  2017-10-18  ATL  1648.000000  1650.308911
65783  2017-10-17  BOS  1761.000000  1753.884111
65784  2017-10-18  BRK  1427.000000  1439.104231
65785  2017-10-19  CHI  1458.000000  1464.397752
65786  2018-04-10  ATL  1406.000000  1411.729285
[...]
Thanks in advance!
Edit: the resulting dataframe I want looks like this:
DF3:
           first        last   ELO_before             ELO_after
team
ATL   2017-10-18  2018-04-10  1648.000000           1411.729285
BOS   2017-10-17  2018-04-11  1761.000000  [Elo2 for last game]
BRK   2017-10-18  2018-04-11  1427.000000  [Elo2 for last game]
CHI   2017-10-19  2018-04-11  1458.000000  [Elo2 for last game]
Answer 0 (score: 1)
You can use pandas.DataFrame.merge:
import pandas as pd
# frames from the question
df1 = pd.DataFrame(data={
    'team': ['ATL', 'BOS', 'BRK', 'CHI'],
    'first': ['2017-10-18', '2017-10-17', '2017-10-18', '2017-10-19'],
    'last': ['2018-04-10', '2018-04-11', '2018-04-11', '2018-04-11']
}).set_index('team')
df2 = pd.DataFrame(data={
    'date': ['2017-10-18', '2017-10-17', '2017-10-18', '2017-10-19', '2018-04-10'],
    'team': ['ATL', 'BOS', 'BRK', 'CHI', 'ATL'],
    'ELO_before': [1648.0, 1761.0, 1427.0, 1458.0, 1406.0],
    'ELO_after': [1650.308911, 1753.884111, 1439.104231, 1464.397752, 1411.729285]
})
# merge on first and last
df1.reset_index(inplace=True)
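# first game: match (team, first) against (team, date) and keep ELO_before
df3 = df1.merge(
    df2.drop('ELO_after', axis=1),
    how='left',
    left_on=['team', 'first'],
    right_on=['team', 'date'],
).drop(['date'], axis=1)
# last game: match (team, last) against (team, date) and keep ELO_after
df3 = df3.merge(
    df2.drop('ELO_before', axis=1),
    how='left',
    left_on=['team', 'last'],
    right_on=['team', 'date'],
).drop(['date'], axis=1)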
# calculate the differences
df3['ELO_difference'] = df3['ELO_after'] - df3['ELO_before']
df3.set_index('team', inplace=True)
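If only the difference is needed, a lookup keyed on (team, date) can skip the intermediate columns. The following is a minimal sketch rather than a definitive solution. It assumes each team has at most one row per date in df2 (reindex raises on duplicate keys), and the names elo, first_keys, last_keys and elo_difference are made up for illustration. Teams whose first or last game is missing from df2 end up with NaN.

```python
import pandas as pd

# Small sample in the same shape as the frames from the question.
df1 = pd.DataFrame({
    'team': ['ATL', 'BOS'],
    'first': ['2017-10-18', '2017-10-17'],
    'last': ['2018-04-10', '2018-04-11'],
}).set_index('team')

df2 = pd.DataFrame({
    'date': ['2017-10-18', '2017-10-17', '2018-04-10'],
    'team': ['ATL', 'BOS', 'ATL'],
    'ELO_before': [1648.0, 1761.0, 1406.0],
    'ELO_after': [1650.308911, 1753.884111, 1411.729285],
})

# Index the ratings by (team, date) so each game can be looked up directly.
# Assumption: (team, date) pairs are unique in df2.
elo = df2.set_index(['team', 'date'])

# Build the (team, first) and (team, last) lookup keys from df1.
first_keys = pd.MultiIndex.from_arrays([df1.index, df1['first']])
last_keys = pd.MultiIndex.from_arrays([df1.index, df1['last']])

# reindex returns NaN for keys that are not present in df2.
elo_first = elo['ELO_before'].reindex(first_keys).to_numpy()
elo_last = elo['ELO_after'].reindex(last_keys).to_numpy()

# One value per team: ELO after the last game minus ELO before the first game.
elo_difference = pd.Series(elo_last - elo_first, index=df1.index, name='ELO_difference')
print(elo_difference)
```

In this sample BOS has no row for 2018-04-11, so its difference is NaN. The result can be attached to the original frame with df1['ELO_difference'] = elo_difference, since both share the team index.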