I have a dataset:
DaySchedule DayAppointment
2016-04-29 18:38:08 2016-04-29
2016-04-29 16:08:27 2016-04-29
2016-04-26 15:04:17 2016-04-29
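I want to compute the duration between the scheduling date and the appointment date. If both fall on the same day, the duration should be 0; otherwise I subtract the scheduling day from the appointment day.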
def duration_time(x, y):
    x = x.dt.date
    y = y.dt.date
    if x == y:
        return 0
    else:
        return x - y

Patient["duration"] = Patient.apply(lambda Patient: duration_time(Patient["DayAppointment"], Patient["DaySchedule"]), axis=1)
After running this code I get this error: AttributeError: ("'Timestamp' object has no attribute 'dt'", 'occurred at index 0')
Does anyone know why I am getting this error?
Answer 0 (score: 0)
Use numpy where + dt.date + sub instead:
import numpy as np
import pandas as pd

Patient.DaySchedule = pd.to_datetime(Patient.DaySchedule)
Patient.DayAppointment = pd.to_datetime(Patient.DayAppointment)
Patient['Duration'] = np.where(Patient.DaySchedule.dt.date == Patient.DayAppointment.dt.date, 0, Patient.DaySchedule.sub(Patient.DayAppointment))
DaySchedule DayAppointment Duration
2016-04-29 18:38:08 2016-04-29 0 days 00:00:00
2016-04-29 16:08:27 2016-04-29 0 days 00:00:00
2016-04-26 15:04:17 2016-04-29 -3 days +15:04:17
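As for why the original code fails: `.dt` is an accessor that exists only on a datetime Series. Inside `apply(..., axis=1)` each cell arrives as a single `pd.Timestamp`, which has a `.date()` method but no `.dt`. If you prefer to keep the row-wise function, a minimal sketch could look like the following. The sample frame is rebuilt from the question's data, and returning a `Timedelta` in both branches (rather than the integer 0) is my own simplification, not something the question asked for.

```python
import pandas as pd

# Sample data copied from the question.
Patient = pd.DataFrame({
    "DaySchedule": pd.to_datetime([
        "2016-04-29 18:38:08",
        "2016-04-29 16:08:27",
        "2016-04-26 15:04:17",
    ]),
    "DayAppointment": pd.to_datetime(["2016-04-29", "2016-04-29", "2016-04-29"]),
})

def duration_time(appointment, schedule):
    # Each argument is a scalar pd.Timestamp here, so call .date() directly;
    # the .dt accessor is only available on a Series.
    # Same-day rows naturally produce a zero Timedelta.
    return pd.Timedelta(appointment.date() - schedule.date())

Patient["duration"] = Patient.apply(
    lambda row: duration_time(row["DayAppointment"], row["DaySchedule"]),
    axis=1,
)
print(Patient)
```

This keeps the question's direction (appointment minus schedule), so the third row comes out positive. The vectorized version above will generally be faster on large frames, since `apply` with `axis=1` calls the function once per row.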
You can also get the difference in whole days:
Patient['Duration']=Patient.DaySchedule.sub(Patient.DayAppointment).astype('timedelta64[D]')
DaySchedule DayAppointment Duration
2016-04-29 18:38:08 2016-04-29 0.0
2016-04-29 16:08:27 2016-04-29 0.0
2016-04-26 15:04:17 2016-04-29 -3.0
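As far as I know, pandas 2.0 and later no longer accept a conversion to `timedelta64[D]`, so the `astype` call may raise there. A hedged alternative is to strip the time of day with `dt.normalize()` before subtracting and then read the day count with `.dt.days`. The sketch below rebuilds the question's data and subtracts in the direction the question describes (appointment minus schedule), which is the opposite sign of the output above.

```python
import pandas as pd

# Sample data copied from the question.
Patient = pd.DataFrame({
    "DaySchedule": pd.to_datetime([
        "2016-04-29 18:38:08",
        "2016-04-29 16:08:27",
        "2016-04-26 15:04:17",
    ]),
    "DayAppointment": pd.to_datetime(["2016-04-29", "2016-04-29", "2016-04-29"]),
})

# normalize() resets the time of day to midnight, so rows on the same
# calendar day differ by exactly zero and no np.where branch is needed.
delta = Patient["DayAppointment"].dt.normalize() - Patient["DaySchedule"].dt.normalize()

# .dt.days extracts the whole-day component of each timedelta as an integer.
Patient["Duration"] = delta.dt.days
print(Patient)
```

Because both sides are truncated to midnight first, the result is an integer number of calendar days rather than a float.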
Using sub, this takes:
1.59 ms ± 63.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Using plain subtraction takes almost a millisecond longer:
Patient['Duration']=np.where(Patient.DaySchedule.dt.date==Patient.DayAppointment.dt.date, 0, Patient.DaySchedule-Patient.DayAppointment)
2.51 ms ± 172 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)