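I am trying to map values between two DataFrames by date. However, I get the following error:
"InvalidIndexError: Reindexing only valid with uniquely valued Index objects"
I am using the following df1 and want to create a new column, "Fix Week":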
kickoffDate kickoffTime hometeam_team1
2016-08-13 11:30:00 Hull City
2016-08-13 14:00:00 Middlesbrough
2016-08-13 14:00:00 Middlesbrough
2016-08-13 14:00:00 Middlesbrough
2016-08-13 14:00:00 Middlesbrough
The df2 I want to map from looks like this:
Round Date Home Team Away Team
1 2016-08-13 Hull Leicester
1 2016-08-13 Burnley Swansea
1 2016-08-13 Crystal Palace West Brom
1 2016-08-13 Everton Spurs
To get the new column, I use the following code:
df1['fix'] = df1.kickoffDate.map(df2.set_index('Date').Round).astype(float)
But, as mentioned above, it gives me the error.
Could anyone advise me?
Thanks
Zep
Answer 0 (score: 1)
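The problem is that the Date values in df2 are duplicated, so after set_index('Date') the index is not unique and map cannot look values up in it.
You therefore need to drop the duplicate rows first so that each Date appears only once. This is safe here because every row for a given date has the same Round: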
df2 = df2.drop_duplicates('Date')
print(df2)
Round Date Home Team Away Team
0 1 2016-08-13 Hull Leicester
df1['fix'] = df1.kickoffDate.map(df2.set_index('Date').Round).astype(float)
print(df1)
kickoffDate kickoffTime hometeam_team1 fix
0 2016-08-13 11:30:00 Hull City 1.0
1 2016-08-13 14:00:00 Middlesbrough 1.0
2 2016-08-13 14:00:00 Middlesbrough 1.0
3 2016-08-13 14:00:00 Middlesbrough 1.0
4 2016-08-13 14:00:00 Middlesbrough 1.0
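
If you would rather not overwrite df2 (drop_duplicates('Date') discards the other fixtures for that date), one option is to build a small Date-to-Round lookup and map against that instead. This is only a sketch: the frames below are rebuilt by hand from the samples in the question, the dates are assumed to be plain strings in both frames, and the names lookup and round_by_date are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (dates assumed to be strings).
df1 = pd.DataFrame({
    "kickoffDate": ["2016-08-13"] * 5,
    "kickoffTime": ["11:30:00"] + ["14:00:00"] * 4,
    "hometeam_team1": ["Hull City"] + ["Middlesbrough"] * 4,
})
df2 = pd.DataFrame({
    "Round": [1, 1, 1, 1],
    "Date": ["2016-08-13"] * 4,
    "Home Team": ["Hull", "Burnley", "Crystal Palace", "Everton"],
    "Away Team": ["Leicester", "Swansea", "West Brom", "Spurs"],
})

# Keep only the columns needed for the lookup and drop repeated pairs.
# df2 itself is left untouched.
lookup = df2[["Date", "Round"]].drop_duplicates()

# Assumption: each date belongs to exactly one round. Fail loudly if not,
# because map needs a unique index to look values up.
assert lookup["Date"].is_unique, "a Date maps to more than one Round"

round_by_date = lookup.set_index("Date")["Round"]
df1["fix"] = df1["kickoffDate"].map(round_by_date).astype(float)
print(df1)
```

If kickoffDate and Date have different dtypes (for example, one is datetime64 and the other is a string), map will find no matches and return NaN, so convert both with pd.to_datetime first.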