I have a Python pandas DataFrame containing the birth dates of hockey players, like this:
Player Birth Year Birth Date
Player A 1990 1990-05-12
Player B 1991 1991-10-30
Player C 1992 1992-09-10
Player D 1990 1990-11-15
I want to create a new column labeled "Draft Year", calculated according to the following rules:
If MM-DD is before 09-15, Draft Year = Birth Year + 18
Else, if MM-DD is after 09-15, Draft Year = Birth Year + 19
This would produce the following output for the example:
Player Birth Year Birth Date Draft Year
Player A 1990 1990-05-12 2008
Player B 1991 1991-10-30 2010
Player C 1992 1992-09-10 2010
Player D 1990 1990-11-15 2009
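I tried separating MM-DD from the date with
Data['Birth Date'] = Data['Birth Date'].str.split('-').str[1:]
However, this gives me a list of [mm, dd], which is hard to work with. Any suggestions on how to do this would be greatly appreciated!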
Answer 0 (score: 2)
Use numpy.where:
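import numpy as np
import pandas as pd

data['Birth Date'] = pd.to_datetime(data['Birth Date'])  # convert strings to datetime
# September 15 or later within September
cond = (data['Birth Date'].dt.month == 9) & (data['Birth Date'].dt.day >= 15)
# any date in October, November, or December
cond2 = data['Birth Date'].dt.month >= 10
data['Draft Year'] = np.where(cond | cond2, data['Birth Year'] + 19, data['Birth Year'] + 18)
print(data)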
Output
Player Birth Year Birth Date Draft Year
0 PlayerA 1990 1990-05-12 2008
1 PlayerB 1991 1991-10-30 2010
2 PlayerC 1992 1992-09-10 2010
3 PlayerD 1990 1990-11-15 2009
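For anyone who wants to run this end to end, here is a minimal self-contained sketch of the numpy.where approach. The DataFrame is rebuilt from the sample rows in the question; the variable names `month`, `day`, and `late` are just illustrative. Note that this treats a birthday of exactly 09-15 as "late" (+19), which the question does not actually specify.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"],
})
data["Birth Date"] = pd.to_datetime(data["Birth Date"])

month = data["Birth Date"].dt.month
day = data["Birth Date"].dt.day

# True for birthdays on or after September 15 (assumption: 09-15 itself counts as late).
late = (month > 9) | ((month == 9) & (day >= 15))

# Pick Birth Year + 19 where late, otherwise Birth Year + 18.
data["Draft Year"] = np.where(late, data["Birth Year"] + 19, data["Birth Year"] + 18)
print(data)
```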
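Answer 1 (score: 1)
Build a value equal to 100 * month plus the day, then compare it with 915 (this assumes 'Birth Date' is already a datetime column)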
cutoff = df['Birth Date'].pipe(lambda d: d.dt.month * 100 + d.dt.day)
df['Draft Year'] = df['Birth Year'] + 18 + (cutoff > 915)
df
Player Birth Year Birth Date Draft Year
0 Player A 1990 1990-05-12 2008
1 Player B 1991 1991-10-30 2010
2 Player C 1992 1992-09-10 2010
3 Player D 1990 1990-11-15 2009
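To make the intermediate step visible, here is a small sketch of the same idea with the sample data, assuming 'Birth Date' has been parsed to datetime. The explicit `.astype(int)` is my addition for readability; the boolean would be added as 0/1 either way. With `> 915`, a birthday of exactly 09-15 gets +18; use `>= 915` if it should get +19 instead, since the question leaves that case open.

```python
import pandas as pd

# Sample rows from the question, with the dates parsed up front.
df = pd.DataFrame({
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": pd.to_datetime(
        ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"]
    ),
})

# month * 100 + day turns each birthday into an integer MMDD key,
# e.g. May 12 -> 512 and October 30 -> 1030.
cutoff = df["Birth Date"].dt.month * 100 + df["Birth Date"].dt.day

# Add 1 extra year for birthdays strictly after September 15.
df["Draft Year"] = df["Birth Year"] + 18 + (cutoff > 915).astype(int)
print(df)
```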
Answer 2 (score: 1)
Date strings in yyyy-mm-dd form sort correctly as plain strings. This solution takes advantage of that fact:
df['Draft Year'] = df['Birth Year'] + np.where(df['Birth Date'].dt.strftime('%m-%d') < '09-15', 18, 19)
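If 'Birth Date' is still a column of plain strings, as in the question's own attempt, the same comparison can be sketched without converting to datetime at all. This assumes every value is a zero-padded 'YYYY-MM-DD' string, so slicing from position 5 yields 'MM-DD'; the name `month_day` is only illustrative. As with the line above, a birthday of exactly 09-15 falls into the +19 branch.

```python
import numpy as np
import pandas as pd

# Sample data from the question; 'Birth Date' is left as plain strings.
df = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D"],
    "Birth Year": [1990, 1991, 1992, 1990],
    "Birth Date": ["1990-05-12", "1991-10-30", "1992-09-10", "1990-11-15"],
})

# Characters from index 5 onward of 'YYYY-MM-DD' are 'MM-DD'.
month_day = df["Birth Date"].str[5:]

# Zero-padded 'MM-DD' strings compare in calendar order.
df["Draft Year"] = df["Birth Year"] + np.where(month_day < "09-15", 18, 19)
print(df)
```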