The DataFrame looks like this:
ID Start_dt
1 10/14/2018
1 10/24/2018
2 7/12/2018
I want to find the maximum date across the current row and the previous row, i.e.
df.Start_dt.rolling(window=1).max().shift(1).fillna(datetime.timedelta(0),unit='days')
The error I get says that ops for rolling are not implemented.
The expected output looks like:
ID Start_dt New_col
1 10/14/2018 NAN
1 10/24/2018 10/24/2018
2 7/12/2018 10/24/2018
Answer 0 (score: 1)
IIUC, you can use Series.rolling.max:
dts = pd.to_datetime(df['Start_dt'], errors='coerce')
# rolling() needs numeric data, so take the max over int64 values and convert back
df['New_col'] = (
    pd.to_datetime(dts.astype('int64').rolling(2).max()).dt.strftime('%m/%d/%Y'))
ID Start_dt New_col
0 1 10/14/2018 NaT
1 1 10/24/2018 10/24/2018
2 2 7/12/2018 10/24/2018
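The round trip above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: the sample frame is rebuilt from the question, the intermediate names `as_ns` and `rolled` are made up for illustration, and the explicit cast to `datetime64[ns]` is an assumption added so that the integers are always nanoseconds since the epoch.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "Start_dt": ["10/14/2018", "10/24/2018", "7/12/2018"],
})

# Parse the strings and force nanosecond resolution (assumption: the
# integer view below is then nanoseconds since the epoch)
dts = pd.to_datetime(df["Start_dt"], format="%m/%d/%Y").astype("datetime64[ns]")

# Rolling windows work on numeric data, so view the dates as int64
as_ns = dts.astype("int64")

# Max over the current and previous row; the first row has no full
# window, so it comes back as NaN
rolled = as_ns.rolling(2).max()

# Convert back to datetimes; NaN becomes NaT
df["New_col"] = pd.to_datetime(rolled, unit="ns")
print(df)
```

This version keeps `New_col` as a datetime column. Appending `.dt.strftime('%m/%d/%Y')` to the last conversion, as in the answer, would turn it back into strings.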
Answer 1 (score: 0)
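import numpy as np
import pandas as pd

df['Start_dt'] = pd.to_datetime(df['Start_dt'])
# True where the current date is later than the previous row's date
m = df['Start_dt'] > df['Start_dt'].shift()
# Keep the current date where it is later, otherwise take the previous one
df['new_col'] = np.where(m, df['Start_dt'], df['Start_dt'].shift())
print(df)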
# ID Start_dt new_col
#0 1 2018-10-14 NaT
#1 1 2018-10-24 2018-10-24
#2 2 2018-07-12 2018-10-24
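The same comparison can also be written without NumPy by using `Series.where`. This is a sketch of a variant, not the answerer's code; the sample frame is rebuilt from the question and the name `prev` is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "Start_dt": ["10/14/2018", "10/24/2018", "7/12/2018"],
})
df["Start_dt"] = pd.to_datetime(df["Start_dt"], format="%m/%d/%Y")

# Date from the previous row; NaT for the first row
prev = df["Start_dt"].shift()

# Keep the current date where it is later than the previous one,
# otherwise fall back to the previous date. Comparisons against NaT
# are False, so the first row ends up as NaT.
df["new_col"] = df["Start_dt"].where(df["Start_dt"] > prev, prev)
print(df)
```

Because everything stays inside pandas, `new_col` keeps a datetime dtype without an intermediate NumPy array.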