Python: calculate a new date based on a date range

Time: 2019-11-13 21:58:52

Tags: python-3.x pandas dataframe python-datetime

I have a Python pandas DataFrame containing the birth dates of hockey players, as shown below:

Player         Birth Year    Birth Date
Player A         1990        1990-05-12
Player B         1991        1991-10-30
Player C         1992        1992-09-10
Player D         1990        1990-11-15

I want to create a new column labeled "Draft Year", calculated according to the following rules:

If MM-DD is before 09-15, Draft Year = Birth Year + 18
Else if MM-DD is after 09-15, Draft Year = Birth Year + 19

This would produce the following output for the example:

Player         Birth Year    Birth Date     Draft Year
Player A         1990        1990-05-12      2008
Player B         1991        1991-10-30      2010
Player C         1992        1992-09-10      2010
Player D         1990        1990-11-15      2009

I tried separating MM-DD from the date using:
Data['Birth Date'] = Data['Birth Date'].str.split('-').str[1:]

However, this gives me a list of [mm, dd], which is hard to work with. Any suggestions on how to do this would be greatly appreciated!

3 Answers:

Answer 0 (score: 2)

Use numpy.where

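import numpy as np
import pandas as pd

data['Birth Date'] = pd.to_datetime(data['Birth Date'])  # convert to datetime
# cond: born in September on or after the 15th; cond2: born in October or later
cond = (data['Birth Date'].dt.month == 9) & (data['Birth Date'].dt.day >= 15)
cond2 = data['Birth Date'].dt.month >= 10
data['Draft Year'] = np.where(cond | cond2, data['Birth Year'] + 19, data['Birth Year'] + 18)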

print(data)

Output

    Player  Birth Year Birth Date  Draft Year
0  PlayerA        1990 1990-05-12        2008
1  PlayerB        1991 1991-10-30        2010
2  PlayerC        1992 1992-09-10        2010
3  PlayerD        1990 1990-11-15        2009
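
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same numpy.where idea. The sample frame is rebuilt from the question, and the intermediate names `month`, `day` and `late` are only illustrative. The question does not say what should happen for a birthday on exactly 09-15, so the sketch assumes it counts as "late" (+19), in line with the `>= 15` comparison above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"],
})
data["Birth Date"] = pd.to_datetime(data["Birth Date"])

month = data["Birth Date"].dt.month
day = data["Birth Date"].dt.day

# True for birthdays on or after 09-15.
# Assumption: 09-15 itself is treated as "late".
late = (month > 9) | ((month == 9) & (day >= 15))

data["Draft Year"] = np.where(late, data["Birth Year"] + 19, data["Birth Year"] + 18)
print(data)
```

Writing the mask as one expression keeps the September and post-September cases next to each other, which makes the boundary easier to check.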

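Answer 1 (score: 1)

Quick and dirty

Build a column equal to 100 * month plus the day, so each MM-DD becomes an integer that can be compared with 915 (that is, 09-15). This assumes 'Birth Date' is already a datetime column.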

cutoff = df['Birth Date'].pipe(lambda d: d.dt.month * 100 + d.dt.day)
df['Draft Year'] = df['Birth Year'] + 18 + (cutoff > 915)

df

     Player  Birth Year Birth Date  Draft Year
0  Player A        1990 1990-05-12        2008
1  Player B        1991 1991-10-30        2010
2  Player C        1992 1992-09-10        2010
3  Player D        1990 1990-11-15        2009
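
To see why the 100 * month + day trick works, here is a small plain-Python sketch of the same arithmetic on single dates. The helper name `draft_year` and its `cutoff` parameter are made up for illustration. Because the comparison is strict (`> 915`), a birthday on exactly 09-15 gets +18 here.

```python
from datetime import date


def draft_year(birth: date, cutoff: int = 915) -> int:
    """Hypothetical helper: birth year + 18, or + 19 when MM-DD is after the cutoff."""
    # Encode MM-DD as an integer, e.g. 05-12 -> 512 and 10-30 -> 1030.
    # Day is always below 100, so integer order matches calendar order.
    key = birth.month * 100 + birth.day
    return birth.year + 18 + int(key > cutoff)


for d in (date(1990, 5, 12), date(1991, 10, 30), date(1992, 9, 10), date(1990, 11, 15)):
    print(d, draft_year(d))
```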

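Answer 2 (score: 1)

Date strings in yyyy-mm-dd form sort in chronological order when compared as plain strings. This solution takes advantage of that fact: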

df['Draft Year'] = df['Birth Year'] + np.where(df['Birth Date'].dt.strftime('%m-%d') < '09-15', 18, 19)
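
If 'Birth Date' is still a column of plain yyyy-mm-dd strings, as the `.str.split` attempt in the question suggests, the same comparison can be sketched without converting to datetime by slicing off the first five characters. This is only a sketch under that assumption. The name `mm_dd` is illustrative, and a birthday on exactly 09-15 gets +19 here because the test is `< '09-15'`.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; 'Birth Date' is kept as strings here.
df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"],
})

# Drop the leading 'yyyy-' so only 'mm-dd' remains, e.g. '05-12'.
mm_dd = df["Birth Date"].str[5:]

# Zero-padded 'mm-dd' strings compare in calendar order.
df["Draft Year"] = df["Birth Year"] + np.where(mm_dd < "09-15", 18, 19)
print(df)
```

Unlike `.str.split('-').str[1:]`, the slice yields a single string per row rather than a list, so it can be compared directly.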