df1 =
Date Team1 Team2
6/1 Boston New York
6/13 New York Boston
6/27 Boston New York
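I'm trying to calculate the number of days since Boston last appeared in either column, but I have only worked out how to do it for a single column, using df1['Days since Boston played'] = df1.groupby('Team1')['Date'].diff().fillna(0)
I would like the output to be: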
Date Team1 Team2 Days since Boston played
6/1 Boston New York 0
6/13 New York Boston 12
6/27 Boston New York 14
Edit: I have expanded the dataframe to see how this could be applied to all teams, not just one. I would like the output to be:
Date Team1 Team2 Days since **Team1** played
6/1 Boston New York 0
6/13 New York Chicago 12
6/27 Boston New York 14
6/28 Chicago Boston 15
Answer 0 (score: 2)
Use isin to check whether Boston appears in Team1 or Team2, then take the time difference between those rows
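import pandas as pd
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d')  # no year in the data, so it defaults to 1900
has_boston = df.isin(['Boston']).any(axis=1)  # rows where Boston is in either column
df.loc[has_boston, 'Days since Boston played'] = df.loc[has_boston, 'Date'].diff().dt.days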
Date Team1 Team2 Days since Boston played
0 1900-06-01 Boston New York NaN
1 1900-06-13 New York Boston 12.0
2 1900-06-27 Boston New York 14.0
If you want the Date column back in its original format, you can use strftime
df['Date'] = df['Date'].dt.strftime('%m/%d')
Date Team1 Team2 Days since Boston played
0 06/01 Boston New York NaN
1 06/13 New York Boston 12.0
2 06/27 Boston New York 14.0
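For the edited question (every team, not just Boston), one possible approach is to reshape the games into one row per team appearance and take the per-team difference. The following is only a sketch under some assumptions: the year 2018 is added so the dates parse unambiguously, and the names long_df, slot, team and days_since are made up for illustration.

```python
import pandas as pd

# Sample data from the edited question; the year 2018 is an assumption.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-06-01", "2018-06-13", "2018-06-27", "2018-06-28"]),
    "Team1": ["Boston", "New York", "Boston", "Chicago"],
    "Team2": ["New York", "Chicago", "New York", "Boston"],
})

# Long format: one row per (game, team), whichever column the team was in.
# "index" keeps track of which original game each row came from.
long_df = (
    df.reset_index()
      .melt(id_vars=["index", "Date"], value_vars=["Team1", "Team2"],
            var_name="slot", value_name="team")
      .sort_values(["team", "Date"])
)

# Days since each team's previous game; a team's first game gets 0.
long_df["days_since"] = (
    long_df.groupby("team")["Date"].diff().dt.days.fillna(0).astype(int)
)

# Keep only the Team1 appearances and align them back to the games by index.
team1_days = long_df[long_df["slot"] == "Team1"].set_index("index")["days_since"]
df["Days since Team1 played"] = team1_days

print(df)
```

On this data Boston's 6/27 row comes out as 26 (its previous game was 6/1) rather than the 14 shown in the question's edited table, which looks like it was carried over from the original data. The same long_df can be filtered on slot == "Team2" to get the equivalent column for Team2.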
Answer 1 (score: 0)
You can group by Team1 and then take the difference of the dates:
# Note: please include a reproducible example in your post next time
import pandas as pd
data = {
'Date': ['2018-06-01', '2018-06-13', '2018-06-27'],
'Team1':['Boston', 'New York', 'Boston'],
'Team2':['New York', 'Boston', 'New York']
}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
# Fill with a zero Timedelta (not the integer 0) so the column stays a timedelta
df['Time between games'] = df.groupby('Team1')['Date'].diff().fillna(pd.Timedelta(0))
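This gives you the time between consecutive games for each home team (Team1). Games where the team appears in Team2 are not counted.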
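To make the difference between the two answers concrete, here is a small comparison on the three-game sample. It is an illustrative sketch only; the variable names home_gap, has_boston and either_gap are not from either answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2018-06-01", "2018-06-13", "2018-06-27"]),
    "Team1": ["Boston", "New York", "Boston"],
    "Team2": ["New York", "Boston", "New York"],
})

# Gap between consecutive games in which the same team was Team1.
home_gap = df.groupby("Team1")["Date"].diff().dt.days

# Gap between consecutive games in which Boston was in either column.
has_boston = df[["Team1", "Team2"]].eq("Boston").any(axis=1)
either_gap = df.loc[has_boston, "Date"].diff().dt.days

print(home_gap.tolist())    # [nan, nan, 26.0]
print(either_gap.tolist())  # [nan, 12.0, 14.0]
```

Grouping on Team1 alone skips the 6/13 game where Boston was Team2, so it reports 26 days for the 6/27 row instead of the 14 the question asks for.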